$$ \newcommand{\esp}[1]{\mathbb{E}\left(#1\right)} \newcommand{\var}[1]{\mbox{Var}\left(#1\right)} \newcommand{\deriv}[1]{\dot{#1}(t)} \newcommand{\prob}[1]{ \mathbb{P}\!(#1)} \newcommand{\eqdef}{\mathop{=}\limits^{\mathrm{def}}} \newcommand{\by}{\boldsymbol{y}} \newcommand{\bc}{\boldsymbol{c}} \newcommand{\bpsi}{\boldsymbol{\psi}} \def\pmacro{\texttt{p}} \def\like{{\cal L}} \def\llike{{\cal LL}} \def\logit{{\rm logit}} \def\probit{{\rm probit}} \def\one{{\rm 1\!I}} \def\iid{\mathop{\sim}_{\rm i.i.d.}} \def\simh0{\mathop{\sim}_{H_0}} \def\df{\texttt{df}} \def\res{e} \def\xomega{x} \newcommand{\argmin}[1]{{\rm arg}\min_{#1}} \newcommand{\argmax}[1]{{\rm arg}\max_{#1}} \newcommand{\Rset}{\mbox{$\mathbb{R}$}} \def\param{\theta} \def\setparam{\Theta} \def\xnew{x_{\rm new}} \def\fnew{f_{\rm new}} \def\ynew{y_{\rm new}} \def\nnew{n_{\rm new}} \def\enew{e_{\rm new}} \def\Xnew{X_{\rm new}} \def\hfnew{\widehat{\fnew}} \def\degree{m} \def\nbeta{d} \newcommand{\limite}[1]{\mathop{\longrightarrow}\limits_{#1}} \def\ka{k{\scriptstyle a}} \def\ska{k{\scriptscriptstyle a}} \def\kel{k{\scriptstyle e}} \def\skel{k{\scriptscriptstyle e}} \def\cl{C{\small l}} \def\Tlag{T\hspace{-0.1em}{\scriptstyle lag}} \def\sTlag{T\hspace{-0.07em}{\scriptscriptstyle lag}} \def\Tk{T\hspace{-0.1em}{\scriptstyle k0}} \def\sTk{T\hspace{-0.07em}{\scriptscriptstyle k0}} \def\thalf{t{\scriptstyle 1/2}} \newcommand{\Dphi}[1]{\partial_\pphi #1} \def\asigma{a} \def\pphi{\psi} \newcommand{\stheta}{{\theta^\star}} \newcommand{\htheta}{{\widehat{\theta}}} $$



1 Linear mixed effects models

1.1 The Pastes data

  1. Check the documentation, the structure and a summary of the Pastes data from the lme4 package.

  2. Build a model for this data

  3. What is the main cause of the variability of the paste strength?


1.2 The orange data

  1. Check the documentation, the structure and a summary of the Orange data.

  2. Plot this data

  3. Fit a linear model to this data

  4. Fit different linear mixed effect models to this data and compare them

  5. Compute confidence intervals on the parameters, compute the individual parameters and plot the random effects for the ``best?????? model

  6. Compare the ML and the REML estimates for this model


1.3 The Oats data

  1. Check the documentation, the structure and a summary of the Oats data from the nlme package.

  2. Plot the data in such a way as to visualize the effect of the fertilizer-concentration on the yield as well as possible differences between blocks and varieties.

  3. Fit the Oats data with linear mixed effects model, assuming a linear effect of the fertilizer concentration on the yield.


2 Nonlinear mixed effects models

S??ralini et al. published in 2007 the paper ???New Analysis of a Rat Feeding Study with a Genetically Modified Maize Reveals Signs of Hepatorenal Toxicity???. The authors of the paper pretend that, after the consumption of MON863, rats showed slight but dose-related significant variations in growth.

The objective of this exercise is to highlight the flaws in the methodology used to achieve this result, and show how to properly analyse the data.

We will restrict our study to the male rats of the study fed with 11% of maize (control and GMO)

  1. Load the ratWeight.csv data, select the male rats fed with 11% of maize and plot the growth curves of the control and GMO groups.

  2. Fit a Gompertz growth model \(f_1(t) = A \exp(-\exp(-b(t-c)))\) to the complete data (males fed with 11% of maize) using a least square approach, with the same parameters for the control and GMO groups.

  3. Fit a Gompertz growth model to the complete data (11% male) using a least square approach, with different parameters for the control and GMO groups.

Hint: write the model as \[ y_{ij} = A_{0} e^{-e^{-b_0 (t_{ij}-c_0)}} \one_{{\rm regime}_i={\rm Control}} + A_{1} e^{- e^{-b_1 (t_{ij}-c_1)}} \one_{{\rm regime}_i={\rm GMO}} + e_{ij}\]

  1. Check out the results of the paper displayed Table 1, for the 11% males.

  2. Plot the residuals and explain why the results of the paper are wrong.

  3. We popose to use instead a mixed effects model for testing the effect of the regime on the growth of the 11% male rats. The codes below show how to fit a Gompertz model to the data

    • assuming the same population parameters for the two regime groups,
    • using lognormal distributions for the 3 parameters (setting transform.par=c(1,1,1))
    • assuming a diagonal covariance matrix \(\Omega\) (default)

Create first the saemixData object

library(saemix)

data <- read.csv("ratWeight.csv")
data.male11 <- subset(data, gender=="Male" & dosage=="11%")

saemix.data<-saemixData(name.data=data.male11, 
                        name.group=c("id"),
                        name.predictors=c("week"),
                        name.response=c("weight"))

Implement then the structural model and create the saemixModel object. Initial values for the population parameters should be provided.

gompertz.model <- function(psi,id,x) { 
  t <- x[,1]
  A<-psi[id,1]
  b<-psi[id,2]
  c<-psi[id,3]
  ypred<- A*exp(-exp(-b*(t-c)))
  return(ypred)
}

saemix.gompertz.model0<-saemixModel(model=gompertz.model,
                          psi0=c(A=500,b=0.2,c=0.2),
                          transform.par=c(1,1,0))

Run saemix for estimating the population parameters, computing the individual estimates, computing the FIM and the log-likelihood (linearization)

saemix.options<-list(seed=632545, displayProgress=FALSE, algorithms=c(1,1,0))
saemix.gompertz.fit0 <- saemix(saemix.gompertz.model0,saemix.data,saemix.options)
summary(saemix.gompertz.fit0)
## ----------------------------------------------------
## -----------------  Fixed effects  ------------------
## ----------------------------------------------------
##   Parameter Estimate     SE  CV(%)
## 1         A  529.069 7.9284   1.50
## 2         b    0.214 0.0065   3.06
## 3         c    0.041 0.0820 199.02
## 4         a   12.297 0.4101   3.33
## ----------------------------------------------------
## -----------  Variance of random effects  -----------
## ----------------------------------------------------
##   Parameter Estimate     SE CV(%)
## A  omega2.A   0.0081 0.0019 24.04
## b  omega2.b   0.0221 0.0080 36.01
## c  omega2.c   0.2068 0.0588 28.45
## ----------------------------------------------------
## ------  Correlation matrix of random effects  ------
## ----------------------------------------------------
##          omega2.A omega2.b omega2.c
## omega2.A 1.00     0.00     0.00    
## omega2.b 0.00     1.00     0.00    
## omega2.c 0.00     0.00     1.00    
## ----------------------------------------------------
## ---------------  Statistical criteria  -------------
## ----------------------------------------------------
## Likelihood computed by linearisation
##       -2LL= 4703.692 
##       AIC = 4717.692 
##       BIC = 4729.514 
## ----------------------------------------------------

Display some diagnostic plots:

#Individual predictions
saemix.gompertz.fit0 <- predict(saemix.gompertz.fit0)

# Individual plot for subject 1 to 9, 
saemix.plot.fits(saemix.gompertz.fit0,ilist=c(1:9),smooth=TRUE)

# Diagnostic plot: observations versus population predictions
saemix.plot.obsvspred(saemix.gompertz.fit0,level=1)

# Scatter plot of residuals
saemix.plot.scatterresiduals(saemix.gompertz.fit0, level=1)

Correlation matrix of the estimates:

fim <- -saemix.gompertz.fit0@results@fim #Fisher information matrix
cov.est <- solve(fim) # covariance matrix of the estimates
d <- sqrt(diag(cov.est)) # s.e. of the estimates
cov.est/(d%*%t(d)) # correlation matrix of the estimates
##            [,1]       [,2]       [,3]         [,4]        [,5]
## [1,]  1.0000000 -0.1846297 -0.0207690  0.000000000  0.00000000
## [2,] -0.1846297  1.0000000  0.1447597  0.000000000  0.00000000
## [3,] -0.0207690  0.1447597  1.0000000  0.000000000  0.00000000
## [4,]  0.0000000  0.0000000  0.0000000  1.000000000 -0.03431771
## [5,]  0.0000000  0.0000000  0.0000000 -0.034317707  1.00000000
## [6,]  0.0000000  0.0000000  0.0000000  0.000503899 -0.02306556
## [7,]  0.0000000  0.0000000  0.0000000 -0.010303060 -0.09232785
##              [,6]        [,7]
## [1,]  0.000000000  0.00000000
## [2,]  0.000000000  0.00000000
## [3,]  0.000000000  0.00000000
## [4,]  0.000503899 -0.01030306
## [5,] -0.023065562 -0.09232785
## [6,]  1.000000000 -0.05432513
## [7,] -0.054325132  1.00000000

Fit the same model to the same data, assuming different population parameters for the control and GMO groups. Can we conclude that the regime has an effect on the growth of the 11% male rats?

  1. Use an asymptotic regression model \(f(t) = w_{\infty} + (w_0 -w_{\infty})e^{-k\,t}\) to test the effect of the regime on the growth of the 11% male rats.

  2. Should we accept the hypothesis that the random effects are uncorrelated?

  3. In conclusion, what is the ``best?????? model to fit this data?