Mixed-effects models in Julia


This package defines linear mixed models (LinearMixedModel) and generalized linear mixed models (GeneralizedLinearMixedModel). Models are built, fit (fit/fit!), and queried through the abstract statistical-model API.

A mixed-effects model is a statistical model for a response variable as a function of one or more covariates. For a categorical covariate the coefficients associated with the levels of the covariate are sometimes called effects, as in "the effect of using Treatment 1 versus the placebo". If the potential levels of the covariate are fixed and reproducible, e.g. the levels for Sex could be "F" and "M", they are modeled with fixed-effects parameters. If the levels constitute a sample from a population, e.g. the Subject or the Item at a particular observation, they are modeled as random effects.

A mixed-effects model contains both fixed-effects and random-effects terms.

For fixed effects, it is the coefficients themselves, or combinations of coefficients, that are of interest. For random effects, it is the variability of the effects over the population that is of interest.

In this package random effects are modeled as independent samples from a multivariate Gaussian distribution of the form 𝓑 ~ 𝓝(0, 𝚺). For the response vector, 𝐲, only the mean of the conditional distribution, 𝓨|𝓑 = 𝐛, depends on 𝐛, and it does so through a linear predictor expression, 𝛈 = 𝐗𝛃 + 𝐙𝐛, where 𝛃 is the fixed-effects coefficient vector and 𝐗 and 𝐙 are model matrices of the appropriate sizes.

In a LinearMixedModel the conditional mean, 𝛍 = 𝔼[𝓨|𝓑 = 𝐛], is the linear predictor, 𝛈, and the conditional distribution is multivariate Gaussian, (𝓨|𝓑 = 𝐛) ~ 𝓝(𝛍, σ²𝐈).

In a GeneralizedLinearMixedModel, the conditional mean, 𝔼[𝓨|𝓑 = 𝐛], is related to the linear predictor via a link function. Typical distribution forms are Bernoulli for binary data or Poisson for count data.
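To make the roles of the fixed-effects model matrix, the random-effects model matrix, and the link function concrete, the following sketch simulates one response vector from each kind of model using only the Julia standard library. It does not use this package's API. The group sizes, parameter values, variable names, and the choice of a logit link are all assumptions made for illustration.

```julia
using LinearAlgebra, Random

rng = MersenneTwister(42)

# Hypothetical balanced layout: 6 groups with 5 observations each
ngroups, nper = 6, 5
n = ngroups * nper
group = repeat(1:ngroups, inner=nper)

# Model matrices: X for an intercept-only fixed effect,
# Z as the n x ngroups indicator matrix for random intercepts
X = ones(n, 1)
Z = Float64.(group .== (1:ngroups)')

# Assumed parameter values (illustrative only)
beta = [1500.0]      # fixed-effects coefficient vector
sigma = 50.0         # residual standard deviation
sigma_b = 40.0       # standard deviation of the random intercepts

# Random effects drawn from N(0, Sigma) with Sigma = sigma_b^2 * I
b = sigma_b .* randn(rng, ngroups)

# Linear predictor eta = X*beta + Z*b
eta = X * beta + Z * b

# Linear mixed model: the conditional mean equals eta and the
# conditional distribution is N(eta, sigma^2 * I)
y = eta + sigma .* randn(rng, n)

# Generalized case (Bernoulli response): the conditional mean is
# obtained from a linear predictor through the inverse logit link
invlogit(x) = 1 / (1 + exp(-x))
eta_bin = Z * (0.5 .* randn(rng, ngroups))
mu_bin = invlogit.(eta_bin)
y_bin = rand(rng, n) .< mu_bin
```

Each row of Z contains a single 1 marking the group of that observation, so Z * b simply adds the group's random intercept to every observation in that group.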

Version 2.0.0

Version 2.0.0 contains some user-visible changes and many changes in the underlying code.

The user-visible changes include:

  • Update formula specification to StatsModels v"0.6.2", allowing for function calls within the fixed-effects terms and for interaction terms on the left-hand side of a random-effects term.

  • Use of properties in a model in addition to extractor functions. For example, to obtain the covariance parameter, θ, from a model, the recommended approach now is to access the θ property, as in m.θ, instead of the extractor getθ(m).

  • bootstrap is now named parametricbootstrap to avoid conflict with a similar name in the Bootstrap package. The bootstrap sample is returned as a Table.

  • A fit method for the abstract type MixedModel has been added. It is called as follows:

julia> using Tables, MixedModels

julia> Dyestuff = columntable((batch = string.(repeat('A':'F', inner=5)),
       yield = [1545, 1440, 1440, 1520, 1580, 1540, 1555, 1490, 1560, 1495, 1595, 1550, 1605,
        1510, 1560, 1445, 1440, 1595, 1465, 1545, 1595, 1630, 1515, 1635, 1625, 1520, 1455,
        1450, 1480, 1445]));

julia> m1 = fit(MixedModel, @formula(yield ~ 1 + (1|batch)), Dyestuff)
    Linear mixed model fit by maximum likelihood
     yield ~ 1 + (1 | batch)
       logLik   -2 logLik     AIC        BIC    
     -163.66353  327.32706  333.32706  337.53065

        Variance components:
                  Column    Variance  Std.Dev.
     batch    (Intercept)  1388.3334 37.260347
     Residual              2451.2500 49.510100
     Number of obs: 30; levels of grouping factors: 6

     Fixed-effects parameters:
    ──────────────────────────────────────────────────
                 Estimate  Std.Error  z value  P(>|z|)
    ──────────────────────────────────────────────────
    (Intercept)    1527.5    17.6946   86.326   <1e-99
    ──────────────────────────────────────────────────
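The property-style access mentioned in the list of user-visible changes can be sketched on the same Dyestuff fit. The θ property and the getθ extractor are named in this README. The β and σ property names used below are assumptions based on the same convention and may differ between versions.

```julia
using Tables, MixedModels

# Same Dyestuff data as in the fit shown above
Dyestuff = columntable((batch = string.(repeat('A':'F', inner=5)),
    yield = [1545, 1440, 1440, 1520, 1580, 1540, 1555, 1490, 1560, 1495,
             1595, 1550, 1605, 1510, 1560, 1445, 1440, 1595, 1465, 1545,
             1595, 1630, 1515, 1635, 1625, 1520, 1455, 1450, 1480, 1445]))

m1 = fit(MixedModel, @formula(yield ~ 1 + (1|batch)), Dyestuff)

# Recommended: property access for the covariance parameter
θ = m1.θ

# Assumed property names for the fixed-effects estimates and the
# residual standard deviation (not stated in this README)
β = m1.β
σ = m1.σ

# The extractor form is described above as the older alternative
θ_old = getθ(m1)
```

For this model θ has a single element, which should be close to the ratio of the two standard deviations in the printed output, 37.260347 / 49.510100 ≈ 0.7526.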

The development of this package was supported by the Center for Interdisciplinary Research, Bielefeld (ZiF)/Cooperation Group "Statistical models for psychological and linguistic data".