1. Example 1: Smoking prevention and cessation study

1600 students are collected from 135 classrooms in 28 schools. We want to understand the effect of SC (exposure to a school-based curriculum or not), TV (exposure to a television-based prevention program or not) and previous THK scale on the current THK scale. We have \(1600\) samples, but some share the same school and some share the same classroom.

The multilevel model: \[y_{ics} = \beta_0 + \beta_1\text{PTHK}_{ics} + \beta_2\text{SC}_{ics} + \beta_3\text{TV}_{ics} + u_s + v_{cs} + \epsilon_{ics}\]

Smoking <- read.table("Smoking.dat", header = T)
Smoking

The LMM can be solved using R package lme4

library(lme4)

## Loading required package: Matrix

fit <- lmer(y ~ PTHK + SC + TV + (1|school) + (1|class), data = Smoking) # Fitting the variances is by REML
summary(fit)

## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ PTHK + SC + TV + (1 | school) + (1 | class)
##    Data: Smoking
## 
## REML criterion at convergence: 5374.3
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.5202 -0.6975 -0.0177  0.6875  3.1630 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  class    (Intercept) 0.06853  0.2618  
##  school   (Intercept) 0.03925  0.1981  
##  Residual             1.60108  1.2653  
## Number of obs: 1600, groups:  class, 135; school, 28
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  1.78493    0.11295  15.803
## PTHK         0.30524    0.02590  11.786
## SC           0.47147    0.11330   4.161
## TV           0.01956    0.11330   0.173
## 
## Correlation of Fixed Effects:
##      (Intr) PTHK   SC    
## PTHK -0.493              
## SC   -0.503  0.025       
## TV   -0.521  0.015 -0.002

The random effects are small compared to the noise, but as you will see later it can still substantially increase the variance of \(\hat\beta\).
Testing for the variance of the random effect terms is complicated. A standard likelihood ratio test would not work.
Testing for the fixed effects is the same as in classical LM or GLM models.

We can check the “predicted” value for each random effect and response

rand.eff <- ranef(fit) ## predict random effects for each school / class ## method: BLUP
pred <- predict(fit) ## fit each observation
rand.eff$school

Estimated intra-class correlation \[\text{corr}(y_{ics}, y_{i'cs}) = \frac{\hat\sigma_u^2 + \hat\sigma_v^2}{\hat\sigma_u^2 + \hat\sigma_v^2 + \hat\sigma_e^2} = \frac{0.039 + 0.069}{0.039 + 0.069 + 1.60} = 0.063\]

Estimated intra-school correlation \[\text{corr}(y_{ics}, y_{i'c's}) = \frac{\hat\sigma_u^2 }{\hat\sigma_u^2 + \hat\sigma_v^2 + \hat\sigma_e^2} = \frac{0.039}{0.039 + 0.069 + 1.60} = 0.023\] Both correlations are very small.

If we ignore the random effects

summary(lm(y ~ PTHK + SC + TV, data = Smoking))

## 
## Call:
## lm(formula = y ~ PTHK + SC + TV, data = Smoking)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.5635 -0.9130 -0.0626  0.8981  4.2416 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.73734    0.07866  22.088  < 2e-16 ***
## PTHK         0.32525    0.02589  12.561  < 2e-16 ***
## SC           0.47987    0.06529   7.350 3.15e-13 ***
## TV           0.04534    0.06518   0.696    0.487    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.303 on 1596 degrees of freedom
## Multiple R-squared:  0.1136, Adjusted R-squared:  0.112 
## F-statistic: 68.21 on 3 and 1596 DF,  p-value: < 2.2e-16

The standard errors of SC and TV become much smaller, though the pairwise correlations among classes and schools introduced by the random effect terms are small. The explanation is that, though each pairwise correlation is small, as all pairs of classes / schools have such correlations if we have many classes / schools, the aggregation of the variance inflation would be still be large enough to affect the variance of \(\hat\beta\).

2. Example 2 (Faraway Chapter 10.1-10.5)

The data comes from an experiment to test the paper brightness depending on a shift operator describbed in Sheldon (1960). 5 Samples are tested on 4 different operators.

data(pulp, package = "faraway")
pulp

Visualize the data:

library(ggplot2)

## Warning: replacing previous import 'lifecycle::last_warnings' by
## 'rlang::last_warnings' when loading 'tibble'

## Warning: replacing previous import 'lifecycle::last_warnings' by
## 'rlang::last_warnings' when loading 'pillar'

ggplot(pulp, aes(x = operator, y = bright)) + geom_point(position = position_jitter(width = 0.1, height = 0.0))

Linear regression

\[y_{ij} = \mu + \alpha_i + \epsilon_{ij}\] with \(\sum \alpha_i = 0\).

op <- options(contrasts = c("contr.sum", "contr.poly")) ## make sure that \sum \alpha_i = 0
model.matrix(~operator, data = pulp)

##    (Intercept) operator1 operator2 operator3
## 1            1         1         0         0
## 2            1         1         0         0
## 3            1         1         0         0
## 4            1         1         0         0
## 5            1         1         0         0
## 6            1         0         1         0
## 7            1         0         1         0
## 8            1         0         1         0
## 9            1         0         1         0
## 10           1         0         1         0
## 11           1         0         0         1
## 12           1         0         0         1
## 13           1         0         0         1
## 14           1         0         0         1
## 15           1         0         0         1
## 16           1        -1        -1        -1
## 17           1        -1        -1        -1
## 18           1        -1        -1        -1
## 19           1        -1        -1        -1
## 20           1        -1        -1        -1
## attr(,"assign")
## [1] 0 1 1 1
## attr(,"contrasts")
## attr(,"contrasts")$operator
## [1] "contr.sum"

lmod <- lm(bright~operator, data = pulp)
summary(lmod)

## 
## Call:
## lm(formula = bright ~ operator, data = pulp)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -0.440 -0.195 -0.070  0.175  0.560 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 60.40000    0.07289 828.681   <2e-16 ***
## operator1   -0.16000    0.12624  -1.267    0.223    
## operator2   -0.34000    0.12624  -2.693    0.016 *  
## operator3    0.22000    0.12624   1.743    0.101    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.326 on 16 degrees of freedom
## Multiple R-squared:  0.4408, Adjusted R-squared:  0.3359 
## F-statistic: 4.204 on 3 and 16 DF,  p-value: 0.02261

\(\alpha_4 = 0.16 + 0.34 - 0.22 = 0.28\). We calculate sample variance of \(\hat\alpha_i\)

((-0.16)^2 + (-0.34)^2 + 0.22^2 + 0.28^2)/(4-1)

## [1] 0.08933333

Using LMM assuming that \(\alpha_i \sim N(0, \sigma^2)\).

library(lme4)
mmod <- lmer(bright ~ 1 + (1|operator), pulp)
summary(mmod)

## Linear mixed model fit by REML ['lmerMod']
## Formula: bright ~ 1 + (1 | operator)
##    Data: pulp
## 
## REML criterion at convergence: 18.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.4666 -0.7595 -0.1244  0.6281  1.6012 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  operator (Intercept) 0.06808  0.2609  
##  Residual             0.10625  0.3260  
## Number of obs: 20, groups:  operator, 4
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  60.4000     0.1494   404.2

Estimate of \(\sigma^2\) is slightly smaller than the sample variance of \(\hat\alpha_i\), why?
Standard error of the intercept increase in LMM

Prediction of each \(\alpha_i\) from LMM

ranef(mmod)$operator

alpha <- c(-0.16, -0.34, 0.22, 0.28)

The predicted \(\alpha_i\) from LMM can be viewed as a type of shrinkage estimator (towards \(0\)).

Test for group difference

We test for the null \(H_0: \sigma^2 = 0\). As mentioned earlier, the LRT provides a conservative p-value

nullmod <- lm(bright ~ 1, pulp)
smod <- lmer(bright ~ 1 + (1|operator), pulp, REML = F) ## solve the full MLE
lrtstat <- as.numeric(2*(logLik(smod) - logLik(nullmod)))
pvalue <- pchisq(lrtstat, 1, lower = F)
data.frame(lrtstat, pvalue)

Should we believe that there is no different across operators? Is the p-value too conservative? Or sample size too small?

We can use parametric bootstrap to approximate the null distribution of the LRT statistics.

lrtstat <- sapply(1:1000, function(i) {
  set.seed(i)
  y <- unlist(simulate(nullmod))
  bnull <- lm(y ~ 1)
  balt <- suppressMessages(lmer(y ~ 1 + (1|operator), pulp, REML = F))
  return(as.numeric(2*(logLik(balt) - logLik(bnull))))
})

mean(lrtstat > 2.5684)

## [1] 0.018

sqrt(0.018 * 0.982/1000)

## [1] 0.004204284

So we can be fairly sure that the p-value is under 0.05. We can use the RLRsim package to test for random effect terms.

library(RLRsim)
exactLRT(mmod, nullmod)

## No restrictions on fixed effects. REML-based inference preferable.

## Using likelihood evaluated at REML estimators.

## Please refit model with method="ML" for exact results.

## 
##  simulated finite sample distribution of LRT. (p-value based on 10000
##  simulated values)
## 
## data:  
## LRT = 2.4435, p-value = 0.0254

Model diagnosis

qqnorm(residuals(mmod), main = "")

plot(fitted(mmod), residuals(mmod), xlab = "Fitted", ylab = "Residuals")
abline(h = 0)

Random effect models are particularly sensitive to outliers, as they depend on variance components that can be substantially inflated by unusual points.

3. Example 3: nested groups (Faraway Chapter 10.8)

A large jar of dried egg power was divided up into a number of samples, and we test for consistency across lab tests. Four samples were sent to 6 labs, with two labeled as G and two labeled as H, and are tested by two different technicians in each lab.

data(eggs, package = "faraway")
eggs

The Technician “One” in different labs are different individuals.
Samples are also different in different labs.
6 different labs, each lab two technician.

We fit a model: \[y_{ijkl} = \mu + L_i + T_{ij} + S_{ijk} + \epsilon_{ijkl}\]

cmod <- lmer(Fat ~ 1 + (1|Lab) + (1|Lab:Technician) + (1|Lab:Technician:Sample), data = eggs)
summary(cmod)

## Linear mixed model fit by REML ['lmerMod']
## Formula: 
## Fat ~ 1 + (1 | Lab) + (1 | Lab:Technician) + (1 | Lab:Technician:Sample)
##    Data: eggs
## 
## REML criterion at convergence: -64.2
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.04098 -0.46576  0.00927  0.59713  1.54276 
## 
## Random effects:
##  Groups                Name        Variance Std.Dev.
##  Lab:Technician:Sample (Intercept) 0.003065 0.05536 
##  Lab:Technician        (Intercept) 0.006980 0.08355 
##  Lab                   (Intercept) 0.005920 0.07694 
##  Residual                          0.007196 0.08483 
## Number of obs: 48, groups:  
## Lab:Technician:Sample, 24; Lab:Technician, 12; Lab, 6
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  0.38750    0.04296   9.019

We test for the null \(H_0: \sigma_S^2 = 0\).

library(RLRsim)
## null LMM model
cmodr <- lmer(Fat ~ 1 + (1|Lab) + (1|Lab:Technician), data = eggs)
## the reduced model containing only the random effect to be tested
cmods <- lmer(Fat ~ 1 + (1|Lab:Technician:Sample), data = eggs)
exactRLRT(cmods, cmod, cmodr)

## 
##  simulated finite sample distribution of RLRT.
##  
##  (p-value based on 10000 simulated values)
## 
## data:  
## RLRT = 1.6034, p-value = 0.1011

4. Example 4: Not nested groups (Faraway Chapter 10.9)

Frou materials (A-D) are fed into a wear-testing machine. The machine can process four samples at a time and the four samples are put at 4 different positions. There might be some differences due to the positions.

data(abrasion, package = "faraway")
abrasion

Visulize the groups: run and position and two unnested groupings.

library(ggplot2)
ggplot(abrasion, aes(x = material, y =wear, shape = run, color = position)) + geom_point(position = position_jitter(width = 0.1, height = 0)) + scale_color_grey()

We fit the model \[y_{ijk} = \beta_i + \gamma_j + \theta_k + \epsilon_{ijk}\] for material \(i\), run \(j\) and position \(k\). We treat run and position as random effect.

mmod <- lmer(wear ~ material + (1|run) + (1|position), data = abrasion)
summary(mmod)

## Linear mixed model fit by REML ['lmerMod']
## Formula: wear ~ material + (1 | run) + (1 | position)
##    Data: abrasion
## 
## REML criterion at convergence: 103
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.08973 -0.30231  0.02697  0.42254  1.21052 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  run      (Intercept)  66.90    8.179  
##  position (Intercept) 107.06   10.347  
##  Residual              61.25    7.826  
## Number of obs: 16, groups:  run, 4; position, 4
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  239.500      6.879  34.817
## material1     26.250      3.389   7.746
## material2    -19.500      3.389  -5.754
## material3      2.250      3.389   0.664
## 
## Correlation of Fixed Effects:
##           (Intr) matrl1 matrl2
## material1  0.000              
## material2  0.000 -0.333       
## material3  0.000 -0.333 -0.333

We want to test for \(H_0: \sigma_{pos}^2 = 0\) and \(H_0: \sigma_{run}^2 = 0\)

## Null model
mmodr <- lmer(wear ~ material + (1|run), data = abrasion)
## the reduced model containing only the random effect to be tested
mmodp <- lmer(wear ~ material + (1|position), data = abrasion)
exactRLRT(mmodp, mmod, mmodr)

## 
##  simulated finite sample distribution of RLRT.
##  
##  (p-value based on 10000 simulated values)
## 
## data:  
## RLRT = 4.5931, p-value = 0.0136

exactRLRT(mmodr, mmod, mmodp)

## 
##  simulated finite sample distribution of RLRT.
##  
##  (p-value based on 10000 simulated values)
## 
## data:  
## RLRT = 3.0459, p-value = 0.033

Both are significant.

Data example 8: LMM examples