$$ \newcommand{\esp}[1]{\mathbb{E}\left(#1\right)} \newcommand{\var}[1]{\mbox{Var}\left(#1\right)} \newcommand{\deriv}[1]{\dot{#1}(t)} \newcommand{\prob}[1]{ \mathbb{P}\!(#1)} \newcommand{\eqdef}{\mathop{=}\limits^{\mathrm{def}}} \newcommand{\by}{\boldsymbol{y}} \newcommand{\bc}{\boldsymbol{c}} \newcommand{\bpsi}{\boldsymbol{\psi}} \def\pmacro{\texttt{p}} \def\like{{\cal L}} \def\llike{{\cal LL}} \def\logit{{\rm logit}} \def\probit{{\rm probit}} \def\one{{\rm 1\!I}} \def\iid{\mathop{\sim}_{\rm i.i.d.}} \def\simh0{\mathop{\sim}_{H_0}} \def\df{\texttt{df}} \def\res{e} \def\xomega{x} \newcommand{\argmin}[1]{{\rm arg}\min_{#1}} \newcommand{\argmax}[1]{{\rm arg}\max_{#1}} \newcommand{\Rset}{\mbox{$\mathbb{R}$}} \def\param{\theta} \def\setparam{\Theta} \def\xnew{x_{\rm new}} \def\fnew{f_{\rm new}} \def\ynew{y_{\rm new}} \def\nnew{n_{\rm new}} \def\enew{e_{\rm new}} \def\Xnew{X_{\rm new}} \def\hfnew{\widehat{\fnew}} \def\degree{m} \def\nbeta{d} \newcommand{\limite}[1]{\mathop{\longrightarrow}\limits_{#1}} \def\ka{k{\scriptstyle a}} \def\ska{k{\scriptscriptstyle a}} \def\kel{k{\scriptstyle e}} \def\skel{k{\scriptscriptstyle e}} \def\cl{C{\small l}} \def\Tlag{T\hspace{-0.1em}{\scriptstyle lag}} \def\sTlag{T\hspace{-0.07em}{\scriptscriptstyle lag}} \def\Tk{T\hspace{-0.1em}{\scriptstyle k0}} \def\sTk{T\hspace{-0.07em}{\scriptscriptstyle k0}} \def\thalf{t{\scriptstyle 1/2}} \newcommand{\Dphi}[1]{\partial_\pphi #1} \def\asigma{a} \def\pphi{\psi} \newcommand{\stheta}{{\theta^\star}} \newcommand{\htheta}{{\widehat{\theta}}} $$


1 Introduction

1.1 The theophylline data

The theophyllineData file reports data from a study of the kinetics of the anti-asthmatic drug theophylline. Twelve subjects were given oral doses of theophylline; serum concentrations were then measured at 11 time points over the next 25 hours.

theo.data=read.csv("theophyllineData.csv")
head(theo.data)
##   id time concentration weight
## 1  1 0.25          2.84   79.6
## 2  1 0.57          6.57   79.6
## 3  1 1.12         10.50   79.6
## 4  1 2.02          9.66   79.6
## 5  1 3.82          8.58   79.6
## 6  1 5.10          8.36   79.6

Here, time is the time since drug administration when the sample was drawn (h), concentration is the measured theophylline concentration (mg/L) and weight is the weight of the subject (kg).

The 12 subjects received 320 mg of theophylline at time 0.

Let us plot the data, i.e. the concentration versus time:

library(ggplot2)
theme_set(theme_bw())
pl <- ggplot(data=theo.data, aes(x=time,y=concentration)) + geom_point(color="#993399", size=2) +
  xlab("time (h)") + ylab("concentration (mg/l)")
pl + geom_line(color="#993399", aes(group=id))

1.2 The inter-individual variability in the data

We can also plot the 12 individual concentration profiles in 12 separate panels,

pl + geom_line() + facet_wrap(~id)

The pattern is similar for the 12 individuals: the concentration first increases during the absorption phase and then decreases during the elimination phase. Nevertheless, we clearly see some differences between these profiles which are not only due to the residual errors. In particular, we see that the patients absorb and eliminate the drug more or less rapidly.

On the one hand, each individual profile can be properly described by a nonlinear pharmacokinetic (PK) model. See this web animation for more details.

On the other hand, a population approach and the use of mixed effects models will allow us to take into account this inter-individual variability. See this web animation for more details about the population approach.


2 Fitting nonlinear models to the data

2.1 Fitting a nonlinear model to a single subject

Let us consider the first subject of this study (id=1)

theo1 <- theo.data[theo.data$id==1 ,c("time","concentration")]
pl1 <- ggplot(data=theo1, aes(x=time,y=concentration)) + geom_point( color="#993399", size=3) + 
  xlab("time (h)") + ylab("concentration (mg/l)") + ylim(c(0,11))
pl1 + geom_line(color="#993399")

We may want to fit a PK model to this data \[y_j = f(t_j ,\pphi) + e_j \quad , \quad 1\leq j \leq n\] where \((y_j, 1\leq j \leq n)\) are the \(n\) PK measurements for this subject, \(f\) is the PK model, \(\pphi\) is the vector of PK parameters for this subject and \((e_j, 1\leq j \leq n)\) are residual errors.

A one-compartment model with first-order absorption and linear elimination for this data writes \[f(t_j ,\pphi) = \frac{ D \, \ka}{V(\ka-\kel)}\, \left( e^{- \skel \, t_j} - e^{- \ska \, t_j} \right)\] where \(\pphi=(\ka,V,\kel)\) are the PK parameters of the model and \(D\) is the amount of drug given to the patient (here, \(D=320\) mg).

Let us compute the least squares estimate of \(\pphi\) defined as \[\hat\pphi = \argmin{\pphi} \sum_{j=1}^{n} (y_{j} - f(t_{j} ,\pphi))^2\] We first need to implement the PK model:

pk.model1 <- function(psi, t){
  D  <- 320     # dose (mg)
  ka <- psi[1]  # absorption rate constant
  V  <- psi[2]  # volume of distribution
  ke <- psi[3]  # elimination rate constant
  f  <- D*ka/V/(ka-ke)*(exp(-ke*t)-exp(-ka*t))
  return(f)
}

We can then use the nls function for fitting this (nonlinear) model to the data

pkm1 <- nls(concentration ~ pk.model1(psi, time), start=list(psi=c(ka=1, V=40, ke=0.1)), data=theo1)
coef(pkm1)
##        psi1        psi2        psi3 
##  1.77741125 29.39415966  0.05395458

and plot the predicted concentration \(f(t,\hat\pphi)\)

new.df <- data.frame(time=seq(0,40,by=0.2))
new.df$pred1 <- predict(pkm1, newdata=new.df)
pl1 +  geom_line(data=new.df, aes(x=time,y=pred1), colour="#339900", size=1)

2.2 Fitting a single nonlinear model to several subjects

Instead of fitting this PK model to a single patient, we may want to fit the same model to all the patients:

\[y_{ij} = f(t_{ij} ,\pphi) + e_{ij} \quad , \quad 1\leq i \leq N \ , \ 1\leq j \leq n_i\] where \((y_{ij}, 1\leq j \leq n_i)\) are the \(n_i\) PK measurements for subject \(i\). Here, \(\pphi\) is the vector of PK parameters shared by the \(N\) subjects.

In this model, the least squares estimate of \(\pphi\) is defined as \[\hat\pphi = \argmin{\pphi} \sum_{i=1}^N \sum_{j=1}^{n_i} (y_{ij} - f(t_{ij} ,\pphi))^2\]

Let us use the nls function with the pooled data from the 12 subjects.

pkm.all <- nls(concentration ~ pk.model1(psi, time), start=list(psi=c(ka=1, V=40, ke=0.1)), data=theo.data)
coef(pkm.all)
##        psi1        psi2        psi3 
##  1.57975378 33.42183639  0.07931028
new.df$pred.all <- predict(pkm.all, newdata=new.df)
pl +  geom_line(data=new.df, aes(x=time,y=pred.all), colour="#339900", size=1)

These estimated PK parameters are typical PK parameters, and this PK profile is a typical PK profile for this sample of patients. By definition, they do not take into account the variability between the patients.


2.3 Fitting several nonlinear models to several subjects

We can instead fit the same PK model with different parameters to each subject, exactly as we did above with the first patient: \[y_{ij} = f(t_{ij} ,\pphi_i) + e_{ij} \quad , \quad 1\leq i \leq N \ , \ 1\leq j \leq n_i\] where \(\pphi_i\) is the vector of PK parameters for patient \(i\).

In this model, the least squares estimate of \(\pphi_i\) is defined as \[\hat\pphi_i = \argmin{\pphi} \sum_{j=1}^{n_i} (y_{ij} - f(t_{ij} ,\pphi))^2 \quad , \quad 1\leq i \leq N\]

psi.est <- data.frame()
fpred <- NULL
N <- 12
for (i in (1:N)) {
  pkmi <- nls(concentration ~ pk.model1(psi, time), 
              start=list(psi=c(ka=1, V=40, ke=0.08)), 
              data=subset(theo.data, id==i))
  psi.est <- rbind(psi.est,coef(pkmi))
  fpred <- c(fpred, predict(pkmi, newdata=new.df))
}
names(psi.est) <- c("ka","V","ke")
psi.est
##           ka        V         ke
## 1  1.7774108 29.39416 0.05395459
## 2  1.9426715 32.02478 0.10166096
## 3  2.4535655 34.31929 0.08142497
## 4  1.1714794 31.09742 0.08746677
## 5  1.4714930 26.92498 0.08843553
## 6  1.1637229 41.10447 0.09952642
## 7  0.6797367 32.62140 0.10224630
## 8  1.3755233 35.69195 0.09195673
## 9  8.8656399 38.94820 0.08663190
## 10 0.6954999 25.51965 0.07396632
## 11 3.8490488 37.94531 0.09812324
## 12 0.8328982 24.01748 0.10557581

Each individual predicted concentration \(f(t,\hat\pphi_i)\) seems to predict the observed concentrations quite well for the 12 subjects:

nc <- length(new.df$time)
theo.pred <- data.frame(id=rep((1:12),each=nc),time=rep(new.df$time,12), fpred=fpred)
pl + geom_line(data=theo.pred, aes(x=time,y=fpred), colour="#339900", size=0.75) + facet_wrap(~id)
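
As a quick empirical summary of these individual estimates, we can compute their mean, standard deviation and coefficient of variation (a simple descriptive check added here for illustration):

# empirical mean, standard deviation and coefficient of variation (%)
# of the 12 individual least squares estimates stored in psi.est
rbind("mean"  = sapply(psi.est, mean),
      "sd"    = sapply(psi.est, sd),
      "CV(%)" = 100*sapply(psi.est, sd)/sapply(psi.est, mean))

The spread of these estimates is precisely the inter-individual variability that the mixed effects models of the next section will describe.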


3 Nonlinear mixed effects (NLME) model

3.1 A first basic model

Until now, the individual parameters \((\pphi_i)\) were considered as fixed effects: we didn't make any assumption about their possible values.

In a population approach, the \(N\) subjects are assumed to be randomly sampled from the same population of individuals. Each individual parameter \(\pphi_i\) is then treated as a random variable.

We will start by assuming that the \(\pphi_i\)'s are independent and normally distributed:

\[\pphi_i \iid {\cal N}(\pphi_{\rm pop} , \Omega)\] where \(\pphi_{\rm pop}\) is a \(d\)-vector of population parameters and \(\Omega\) a \(d\times d\) variance-covariance matrix.

Remark: this normality assumption allows us to decompose each individual parameter \(\pphi_i\) into a fixed effect \(\pphi_{\rm pop}\) and a random effect \(\eta_i\):

\[\pphi_i = \pphi_{\rm pop} + \eta_i\] where \(\eta_i \iid {\cal N}(0 , \Omega)\).

We will also assume that the residual errors \((e_{ij})\) are independent and normally distributed: \(e_{ij} \iid {\cal N}(0 , \asigma^2)\).

In summary, we can equivalently represent a (nonlinear) mixed effects model

i) using equations:

\[\begin{aligned} y_{ij} & = f(t_{ij} ,\pphi_i) + e_{ij} \\ \pphi_i & = \pphi_{\rm pop} + \eta_i \end{aligned}\] where \(e_{ij} \iid {\cal N}(0 , \asigma^2)\) and \(\eta_i \iid {\cal N}(0 , \Omega)\),

ii) or using probability distributions:

\[\begin{aligned} y_{ij} &\sim {\cal N}(f(t_{ij} ,\pphi_i) \ , \ \asigma^2) \\ \pphi_i & \sim {\cal N}(\pphi_{\rm pop} , \Omega) \end{aligned}\] The model is the joint probability distribution of \((y,\pphi)\), where \(y=(y_{ij}, 1\leq i \leq N, 1 \leq j \leq n_i)\) is the complete set of observations and \(\pphi = (\pphi_i, 1\leq i \leq N)\) the \(N\) vectors of individual parameters, \[\begin{aligned} \pmacro(y,\pphi;\theta) &= \prod_{i=1}^N \pmacro(y_i,\pphi_i;\theta) \\ &= \prod_{i=1}^N \pmacro(y_i|\pphi_i;\theta)\pmacro(\pphi_i;\theta) \end{aligned}\]
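
As an illustration of this hierarchical structure, here is a minimal sketch simulating a dataset from this basic model, reusing the function pk.model1 defined above; all the parameter values below are arbitrary (hypothetical) choices:

set.seed(123)
N       <- 12
t.obs   <- c(0.25, 0.5, 1, 2, 4, 6, 8, 12, 24)  # hypothetical sampling times
phi.pop <- c(ka=1.5, V=32, ke=0.09)             # hypothetical population parameters
omega   <- c(0.4, 4.5, 0.01)                    # hypothetical s.d. of the random effects
a       <- 0.7                                  # hypothetical residual s.d.
sim <- do.call(rbind, lapply(1:N, function(i) {
  phi.i <- phi.pop + rnorm(3, 0, omega)         # phi_i = phi_pop + eta_i
  data.frame(id = i, time = t.obs,
             concentration = pk.model1(phi.i, t.obs) + rnorm(length(t.obs), 0, a))
}))
head(sim)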


3.2 Tasks, methods and algorithms

A detailed presentation of all the existing methods and algorithms for nonlinear mixed effect models goes far beyond the scope of this course. We will restrict ourselves to the methods and algorithms implemented in the saemix package.

3.2.1 Estimation of the population parameters

The parameters of the model are \(\theta=(\pphi_{\rm pop}, \Omega, \asigma^2)\). Maximum likelihood estimation of \(\theta\) consists of maximizing with respect to \(\theta\) the observed likelihood function defined by \[\begin{aligned} \like(\theta,y) &\eqdef \pmacro(y ; \theta) \\ &= \int \pmacro(y,\pphi ;\theta) \, d \pphi \\ &= \prod_{i=1}^N\int \pmacro(y_i|\pphi_i ;\theta)\pmacro(\pphi_i ;\theta) \, d \pphi_i . \end{aligned}\]

If \(f\) is a nonlinear function of \(\pphi_i\), then \(y_i\) is not a Gaussian vector and the likelihood function \(\like(\theta,y)\) cannot be computed in a closed form.
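
Each integral can nevertheless be approximated by brute-force Monte Carlo, averaging \(\pmacro(y_i|\pphi_i ;\theta)\) over draws of \(\pphi_i\) from the population distribution. Here is a naive sketch for a single subject (the parameter values in the example are hypothetical, and this is not the scheme used by saemix):

# naive Monte Carlo approximation of p(y_i; theta) for one subject:
# average p(y_i | phi^(m); theta) over M draws phi^(m) ~ N(phi.pop, Omega)
mc.likelihood <- function(y, time, phi.pop, Omega, a, M=10000) {
  R <- chol(Omega)                      # Omega = t(R) %*% R
  p <- replicate(M, {
    phi.m <- phi.pop + as.vector(t(R) %*% rnorm(length(phi.pop)))
    prod(dnorm(y, mean=pk.model1(phi.m, time), sd=a))
  })
  mean(p)
}
mc.likelihood(theo1$concentration, theo1$time,
              phi.pop=c(ka=1.6, V=33, ke=0.08), Omega=diag(c(0.4, 20, 2e-4)), a=0.7)

Such approximations are far too noisy and costly to be used inside an optimization loop, which is why dedicated algorithms are needed.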

Several algorithms exist for maximum likelihood estimation in nonlinear mixed effects models. In particular, the stochastic approximation EM algorithm (SAEM) is an iterative algorithm that converges to a maximum of the likelihood function under general conditions.


3.2.2 Estimation of the individual parameters

Once \(\theta\) has been estimated, the conditional distribution \(\pmacro(\pphi_i | y_i ; \hat{\theta})\) can be used for each individual \(i\) for estimating the vector of individual parameters \(\pphi_i\).

The mode of this conditional distribution is defined as \[\begin{aligned} \hat{\pphi}_i &= \argmax{\pphi_i}\pmacro(\pphi_i | y_i ; \hat{\theta}) \\ &= \argmin{\pphi_i} \left\{ -2\log(\pmacro(\pphi_i | y_i ; \hat{\theta})) \right\}\\ &= \argmin{\pphi_i} \left\{-2\log\pmacro( y_i | \pphi_i ; \hat{\theta}) -2 \log\pmacro(\pphi_i ; \hat{\theta}) \right\}\\ &= \argmin{\pphi_i} \left\{ \frac{1}{\hat\asigma^2}\|y_i - f(t_i,\pphi_i)\|^2 + (\pphi_i-\hat\pphi_{\rm pop})^\prime\Omega^{-1}(\pphi_i-\hat\pphi_{\rm pop}) \right\} \end{aligned}\]

This estimate is called the maximum a posteriori (MAP) estimate, or the empirical Bayes estimate (EBE) of \(\pphi_i\).

Remark: since \(f\) is a nonlinear function of \(\pphi_i\), there is no analytical expression for \(\hat\pphi_i\). A Newton-type algorithm can then be used to solve this minimization problem.
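
For illustration, the MAP estimate for subject 1 can be computed with the general-purpose optimizer optim; the plugged-in values for \(\hat\pphi_{\rm pop}\), \(\hat\Omega\) and \(\hat\asigma\) below are hypothetical placeholders:

# -2 log p(phi | y_1; theta.hat), up to an additive constant
map.criterion <- function(phi, y, time, phi.pop, Omega.inv, a2) {
  sum((y - pk.model1(phi, time))^2)/a2 +
    as.numeric(t(phi - phi.pop) %*% Omega.inv %*% (phi - phi.pop))
}
phi.pop <- c(ka=1.8, V=32.6, ke=0.088)   # hypothetical population estimates
Omega   <- diag(c(1.2, 23, 1.6e-4))      # hypothetical variances of the random effects
phi.map <- optim(phi.pop, map.criterion,
                 y=theo1$concentration, time=theo1$time,
                 phi.pop=phi.pop, Omega.inv=solve(Omega), a2=0.74^2)$par
phi.map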

We can then use the conditional mode for computing predictions, following the principle that the most likely values of the individual parameters are the best suited for computing the most likely predictions:

\[\widehat{f(t_{ij} , \pphi_i)} = f(t_{ij} , \hat\pphi_i).\]


3.2.3 Estimation of the likelihood function

Performing likelihood ratio tests and computing information criteria for a given model requires computation of the log-likelihood \(\llike(\hat{\theta};y) \eqdef \log(\pmacro(y;\hat{\theta}))\).

The log-likelihood cannot be computed in closed form for nonlinear mixed effects models. In the case of continuous data, approximation of the model by a Gaussian linear one allows us to approximate the log-likelihood.

Indeed, we can linearize the model for the observations \((y_{ij},\, 1\leq j \leq n_i)\) of individual \(i\) around the vector of predicted individual parameters \(\hat\pphi_i\).

Let \(\Dphi{f(t , \pphi)}\) be the row vector of derivatives of \(f(t , \pphi)\) with respect to \(\pphi\). Then, \[\begin{aligned} y_{ij} &\simeq f(t_{ij} , \hat\pphi_i) + \Dphi{f(t_{ij} , \hat\pphi_i)} \, (\pphi_i - \hat\pphi_i) + e_{ij} \\ &\simeq f(t_{ij} , \hat\pphi_i) + \Dphi{f(t_{ij} , \hat\pphi_i)} \, (\hat\pphi_{\rm pop} - \hat\pphi_i) + \Dphi{f(t_{ij} , \hat\pphi_i)} \, \eta_i + e_{ij} . \end{aligned}\] Following this, we can approximate the marginal distribution of the vector \(y_i\) by a normal one: \[ y_{i} \approx {\cal N}\left(\mu_i\, , \, \Gamma_i \right), \] where \[\begin{aligned} \mu_i & = f(t_{i} , \hat\pphi_i) + \Dphi{f(t_{i} , \hat\pphi_i)}(\hat\pphi_{\rm pop} - \hat\pphi_i) \\ \Gamma_i &= \Dphi{f(t_{i} , \hat\pphi_i)} \, \hat\Omega \, \Dphi{f(t_{i} , \hat\pphi_i)}^{\prime} + \hat\asigma^2\, I_{n_i} \end{aligned}\]

The log-likelihood function is then approximated by \[\begin{aligned} \llike(\hat{\theta};y) & = \sum_{i=1}^N \llike(\hat{\theta};y_i)\\ & \simeq \sum_{i=1}^N \left\{ -\frac{n_i}{2}\log(2 \pi) -\frac{1}{2}\log(|\Gamma_i|) -\frac{1}{2} (y_i - \mu_i)^\prime \Gamma_i^{-1} (y_i - \mu_i) \right\} \end{aligned}\]
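
Here is a minimal sketch of this approximation for a single individual, with the Jacobian \(\Dphi f\) computed by finite differences (all the parameter values in the example are hypothetical placeholders):

# approximate log-likelihood contribution of one individual,
# based on linearization of the model around phi.hat
ll.lin.i <- function(y, time, phi.hat, phi.pop, Omega, a2, h=1e-6) {
  f0 <- pk.model1(phi.hat, time)
  # Jacobian of f with respect to phi, by forward finite differences
  J <- sapply(seq_along(phi.hat), function(k) {
    phi.k <- phi.hat
    dk <- h*abs(phi.k[k])
    phi.k[k] <- phi.k[k] + dk
    (pk.model1(phi.k, time) - f0)/dk
  })
  mu     <- f0 + as.vector(J %*% (phi.pop - phi.hat))    # mean of y_i
  Gamma  <- J %*% Omega %*% t(J) + a2*diag(length(y))    # covariance of y_i
  logdet <- as.numeric(determinant(Gamma)$modulus)       # log |Gamma_i|
  -0.5*(length(y)*log(2*pi) + logdet +
          as.numeric(t(y - mu) %*% solve(Gamma) %*% (y - mu)))
}
ll.lin.i(theo1$concentration, theo1$time,
         phi.hat=c(1.64, 28.0, 0.066), phi.pop=c(1.8, 32.6, 0.088),
         Omega=diag(c(1.2, 23, 1.6e-4)), a2=0.74^2)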


3.2.4 Estimation of the Fisher information matrix

The variance of the maximum likelihood estimate (MLE) \(\hat{\theta}\), and thus confidence intervals, can be derived from the observed Fisher information matrix (FIM), itself derived from the observed likelihood: \[\begin{aligned} I({\hat\theta}) & \eqdef \ - \frac{\partial^2}{\partial \theta \partial \theta^\prime} \llike(\hat{\theta};y) \\ & \simeq \frac{1}{2}\sum_{i=1}^N \frac{\partial^2}{\partial \theta \partial \theta^\prime}\left\{ \log(|\Gamma_i|) + (y_i - \mu_i)^\prime \Gamma_i^{-1} (y_i - \mu_i) \right\} \end{aligned}\]

The variance-covariance matrix of \(\hat\theta\) can then be estimated by the inverse of the observed FIM. Standard errors (s.e.) for each component of \(\hat\theta\) are the standard deviations, i.e., the square root of the diagonal elements of the variance-covariance matrix.
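
In R, this last step amounts to inverting the FIM and taking the square roots of the diagonal elements (the 2x2 matrix below is a toy example, not output from saemix):

fim        <- matrix(c(8, 1, 1, 2), nrow=2)  # toy observed FIM (hypothetical)
vcov.theta <- solve(fim)                     # estimated variance-covariance matrix of theta.hat
sqrt(diag(vcov.theta))                       # standard errors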


3.3 Fitting a NLME model to the theophylline data

Let us see how to use the saemix package for fitting our model to the theophylline data.

We first need to create a SaemixData object, defining which columns of the data file should be used and their roles. In our example, concentration is the response variable \(y\), time is the explanatory variable (or predictor) \(t\) and id is the grouping variable.

library(saemix)
saemix.data <- saemixData(name.data       = theo.data,
                          name.group      = c("id"),
                          name.predictors = c("time"),
                          name.response   = c("concentration"))

The structural model is the one compartment model with first order absorption and linear elimination previously used,

model1cpt <- function(psi, id, x) { 
  D     <- 320         # dose (mg)
  t     <- x[,1]       # observation times
  ka    <- psi[id,1]   # individual absorption rate constants
  V     <- psi[id,2]   # individual volumes of distribution
  ke    <- psi[id,3]   # individual elimination rate constants
  fpred <- D*ka/(V*(ka-ke))*(exp(-ke*t)-exp(-ka*t))
  return(fpred)
}

The model is defined in a saemixModel object. The structural model and some initial values for the vector of population parameters \(\pphi_{\rm pop}\) are required

saemix.model <- saemixModel(model = model1cpt, 
                            psi0  = c(ka=1,V=20,ke=0.5))

Several options for selecting and running the algorithms can be defined, including

  • algorithms, a 3-vector of 1s and 0s specifying which algorithms are to be run.
    1. estimation of the individual parameters (MAP estimates),
    2. estimation of the Fisher Information Matrix and log-likelihood by linearisation,
    3. estimation of the log-likelihood by importance sampling (ignored for this course)
  • seed, an integer used for the random number generator: running the algorithms several times with the same seed ensures that the results will be the same.
saemix.options <- list(algorithms=c(1,1,0), displayProgress=FALSE, seed=632545)
saemix.fit1    <- saemix(saemix.model, saemix.data, saemix.options)

A summary of the results of the estimation algorithm can be displayed

saemix.fit1@results
## Fixed effects
##  Parameter Estimate   SE     CV(%)
##  ka         1.8330  0.34746 18.96 
##  V         32.6162  1.61630  4.96 
##  ke         0.0878  0.00598  6.81 
##  a          0.7447  0.05634  7.57 
## 
## Variance of random effects
##  Parameter Estimate   SE      CV(%)
##  omega2.ka 1.25e+00 5.69e-01 45.4  
##  omega2.V  2.34e+01 1.14e+01 48.7  
##  omega2.ke 1.62e-04 1.55e-04 95.8  
## 
## Statistical criteria
## Likelihood computed by linearisation
##       -2LL= 345.0751 
##        AIC= 359.0751 
##        BIC= 362.4694

The individual parameter estimates are also available

psi <- psi(saemix.fit1)
head(psi)
##         ka        V         ke
## 1 1.638806 28.00117 0.06551392
## 2 2.028364 32.97536 0.09344303
## 3 2.284325 33.45927 0.08618802
## 4 1.198722 31.39178 0.08653477
## 5 1.523666 27.49394 0.08577087
## 6 1.132918 39.87603 0.09648492

These individual parameter estimates can be used for computing and plotting individual predictions

saemix.fit <- predict(saemix.fit1)
saemix.plot.fits(saemix.fit1)

Several diagnostic fit plots can be displayed, including the plot of the observations versus individual predictions

saemix.plot.obsvspred(saemix.fit1,level=1)

and the plot of the residuals versus time, and versus individual predictions,

saemix.plot.scatterresiduals(saemix.fit1, level=1)


3.4 Some extensions of the model

3.4.1 The residual error model

In the model \(y_{ij} = f(t_{ij} ,\pphi_i) + e_{ij}\), the residual errors \((e_{ij})\) are assumed to be Gaussian random variables with mean 0. Different models can be used for the variance of the \((e_{ij})\) in a nonlinear mixed effects model. The following are some of them (more details about residual error models here).

Constant error model:

The residual errors \((e_{ij})\) are independent and identically distributed: \[e_{ij} \iid {\cal N}(0 \ , \ \asigma^2)\] The variance of \(y_{ij}\) is therefore constant over time: \[y_{ij} = f(t_{ij} ,\pphi_i) + \asigma \varepsilon_{ij}\] where \(\varepsilon_{ij} \iid {\cal N}(0, 1)\).

The error model can be defined as an argument of saemixModel (default is the constant error model)

saemix.model <- saemixModel(model=model1cpt, psi0=c(ka=1,V=20,ke=0.5), error.model="constant")
fit.constant <- saemix(saemix.model, saemix.data, saemix.options)
fit.constant@results
## Fixed effects
##  Parameter Estimate   SE     CV(%)
##  ka         1.8330  0.34746 18.96 
##  V         32.6162  1.61630  4.96 
##  ke         0.0878  0.00598  6.81 
##  a          0.7447  0.05634  7.57 
## 
## Variance of random effects
##  Parameter Estimate   SE      CV(%)
##  omega2.ka 1.25e+00 5.69e-01 45.4  
##  omega2.V  2.34e+01 1.14e+01 48.7  
##  omega2.ke 1.62e-04 1.55e-04 95.8  
## 
## Statistical criteria
## Likelihood computed by linearisation
##       -2LL= 345.0751 
##        AIC= 359.0751 
##        BIC= 362.4694

Proportional error model:

Proportional error models assume that the standard deviation of \(e_{ij}\) is proportional to the predicted response: \(e_{ij} = b\, f(t_{ij} ,\pphi_i) \, \varepsilon_{ij}\) where \(\varepsilon_{ij} \iid {\cal N}(0, 1)\). Then,

\[y_{ij} = f(t_{ij} ,\pphi_i) + b f(t_{ij} ,\pphi_i) \, \varepsilon_{ij}\]

saemix.model <- saemixModel(model=model1cpt, psi0=c(ka=1,V=20,ke=0.5), error.model="proportional")
fit.proportional <- saemix(saemix.model, saemix.data, saemix.options)
fit.proportional@results
## Fixed effects
##  Parameter Estimate   SE      CV(%)
##  ka         1.741   0.321194 18.453
##  V         33.001   1.570139  4.758
##  ke         0.087   0.004716  5.423
##  b          0.161   0.000833  0.519
## 
## Variance of random effects
##  Parameter Estimate   SE      CV(%)
##  omega2.ka 1.05e+00 4.88e-01 46.3  
##  omega2.V  2.03e+01 1.12e+01 55.5  
##  omega2.ke 1.82e-04 1.45e-04 79.7  
## 
## Statistical criteria
## Likelihood computed by linearisation
##       -2LL= 339.7901 
##        AIC= 353.7901 
##        BIC= 357.1844

Combined error model:

A combined error model additively combines a constant and a proportional error model: \(e_{ij} = (\asigma + b\, f(t_{ij} ,\pphi_i)) \, \varepsilon_{ij}\) where \(\varepsilon_{ij} \iid {\cal N}(0, 1)\). Then,

\[y_{ij} = f(t_{ij} ,\pphi_i) + (a + b f(t_{ij} ,\pphi_i)) \, \varepsilon_{ij}\]

saemix.model <- saemixModel(model=model1cpt, psi0=c(ka=1,V=20,ke=0.5), error.model="combined")
fit.combined <- saemix(saemix.model, saemix.data, saemix.options)
fit.combined@results
## Fixed effects
##  Parameter Estimate   SE     CV(%)
##  ka         1.7955  0.34457 19.19 
##  V         32.6390  1.60949  4.93 
##  ke         0.0881  0.00574  6.52 
##  a          0.4005  0.05438 13.58 
##  b          0.0634  0.00215  3.40 
## 
## Variance of random effects
##  Parameter Estimate   SE      CV(%)
##  omega2.ka 1.24e+00 5.61e-01  45.4 
##  omega2.V  2.26e+01 1.16e+01  51.2 
##  omega2.ke 1.92e-04 1.93e-04 101.0 
## 
## Statistical criteria
## Likelihood computed by linearisation
##       -2LL= 338.6248 
##        AIC= 354.6248 
##        BIC= 358.504

Exponential error model:

If \(y\) is known to take nonnegative values, a log transformation can be used. We can then write the model with one of two equivalent representations: \[\begin{aligned} \log(y_{ij}) & = \log(f(t_{ij} ,\pphi_i)) + \asigma \varepsilon_{ij} \\ y_{ij} & = f(t_{ij} ,\pphi_i) \ e^{\asigma \varepsilon_{ij}} \end{aligned}\]

saemix.model <- saemixModel(model=model1cpt, psi0=c(ka=1,V=20,ke=0.5), error.model="exponential")
fit.exponential <- saemix(saemix.model, saemix.data, saemix.options)
fit.exponential@results
## Fixed effects
##  Parameter Estimate   SE    CV(%)
##  ka         1.4599  0.2659 18.22 
##  V         32.1617  1.6094  5.00 
##  ke         0.0891  0.0049  5.51 
##  a          0.1773  0.0135  7.59 
## 
## Variance of random effects
##  Parameter Estimate   SE      CV(%)
##  omega2.ka  0.71590 3.32e-01 46.4  
##  omega2.V  19.74232 1.15e+01 58.3  
##  omega2.ke  0.00018 1.08e-04 59.9  
## 
## Statistical criteria
## Likelihood computed by linearisation
##       -2LL= 364.5556 
##        AIC= 378.5556 
##        BIC= 381.9499


3.4.2 Transformation of the individual parameters

Clearly, not all distributions are Gaussian. To begin with, the normal distribution has support \(\Rset\), whereas many parameters take values in restricted intervals. For instance, some variables take only positive values (e.g., volumes and transfer rate constants) and others are restricted to bounded intervals.

Furthermore, the Gaussian distribution is symmetric, which is not a property shared by all distributions. One way to extend the use of Gaussian distributions is to consider that some transformation of the parameters in which we are interested is Gaussian.

That is, we assume the existence of a monotonic function \(h\) such that \(h(\pphi_i)\) is normally distributed. For the sake of simplicity, we will consider here a scalar parameter \(\pphi_i\). We then assume that

\[ h(\pphi_i) \sim {\cal N}(h(\pphi_{\rm pop}) \ , \ \omega^2).\]

Or, equivalently, \[h(\pphi_i) = h(\pphi_{\rm pop}) + \eta_i\] where \(\eta_i \sim {\cal N}(0,\omega^2)\).

Let us now see some examples of transformed normal pdfs.

Log-normal distribution:

A log-normal distribution ensures positive values and is widely used for describing distributions of physiological parameters.

If \(\pphi_i\) is log-normally distributed, the 3 following representations are equivalent:

\[\begin{aligned} \log(\pphi_i) & \sim {\cal N}( \log(\pphi_{\rm pop}) \ , \omega^2) \\ \log(\pphi_i) &= \log(\pphi_{\rm pop}) + \eta_i \\ \pphi_i &= \pphi_{\rm pop} e^{\eta_i} \end{aligned}\]

Logit-normal distribution:

The logit function is defined on \((0,1)\) and takes its values in \(\Rset\): for any \(x\) in \((0,1)\), \[ \logit(x) = \log \left(\frac{x}{1-x}\right) \Longleftrightarrow x = \frac{1}{1+e^{-\logit(x)}} \]

An individual parameter \(\pphi_i\) with a logit-normal distribution takes its values in \((0,1)\). The logit of \(\pphi_i\) is normally distributed, i.e., \[\logit(\pphi_i) = \log \left(\frac{\pphi_i}{1-\pphi_i}\right) \ \sim \ {\cal N}( \logit(\pphi_{\rm pop}) \ , \ \omega^2).\]

Probit-normal distribution:

The probit function is the inverse cumulative distribution function (quantile function) \(\Phi^{-1}\) associated with the standard normal distribution \({\cal N}(0,1)\). For any \(x\) in \((0,1)\), \[ \probit(x) = \Phi^{-1}(x) \ \Longleftrightarrow \prob{{\cal N}(0,1) \leq \probit(x)} = x \]

An individual parameter \(\pphi_i\) with a probit-normal distribution takes its values in \((0,1)\). The probit of \(\pphi_i\) is normally distributed: \[\probit(\pphi_i) = \Phi^{-1}(\pphi_i) \ \sim \ {\cal N}( \probit(\pphi_{\rm pop}) \ , \ \omega^2) . \]
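
These transformations are readily available in R (qlogis/plogis for the logit, and qnorm/pnorm for the probit), so that simulating transformed-normal parameters is straightforward. A small sketch with arbitrary values \(\pphi_{\rm pop}=0.3\) and \(\omega=0.5\):

set.seed(42)
phi.pop <- 0.3
eta <- rnorm(1e4, 0, 0.5)                       # eta_i ~ N(0, omega^2)
phi.lognormal <- exp(log(phi.pop) + eta)        # log-normal: values in (0, +Inf)
phi.logit     <- plogis(qlogis(phi.pop) + eta)  # logit-normal: values in (0, 1)
phi.probit    <- pnorm(qnorm(phi.pop) + eta)    # probit-normal: values in (0, 1)
range(phi.logit)
range(phi.probit)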

The distribution of each individual parameter can be defined using the argument transform.par (0=normal, 1=log-normal, 2=probit, 3=logit). The default is normal distributions, i.e. a vector of 0s.

If we want to use, for example, a normal distribution for \(V\) and log-normal distributions for \(\ka\) and \(\kel\), then transform.par should be the vector c(1,0,1):

saemix.model<-saemixModel(model         = model1cpt,
                          psi0          = c(ka=1,V=20,ke=0.5),
                          transform.par = c(1,0,1))

saemix.fit2 <-saemix(saemix.model, saemix.data, saemix.options)
saemix.fit2@results
## Fixed effects
##  Parameter Estimate   SE     CV(%)
##  ka         1.5973  0.30947 19.37 
##  V         32.7146  1.70039  5.20 
##  ke         0.0866  0.00575  6.64 
##  a          0.7315  0.05549  7.59 
## 
## Variance of random effects
##  Parameter Estimate   SE     CV(%)
##  omega2.ka  0.400    0.1780  44.5 
##  omega2.V  26.737   12.6829  47.4 
##  omega2.ke  0.018    0.0193 106.9 
## 
## Statistical criteria
## Likelihood computed by linearisation
##       -2LL= 337.8782 
##        AIC= 351.8782 
##        BIC= 355.2725

Remark: here, \(\omega^2_{ka}\) and \(\omega^2_{ke}\) are the variances of \(\log(\ka_i)\) and \(\log(\kel_i)\) while \(\omega^2_{V}\) is the variance of \(V_i\).


3.4.3 Models with covariates

Let \(c_i = (c_{i1},c_{i2}, \ldots , c_{iL})\) be a vector of individual covariates, i.e. a vector of individual variables available with the data. We may want to explain part of the variability of the unobserved individual parameters \((\pphi_i)\) using these covariates.

We will only consider linear models of the covariates. More precisely, assuming that \(h(\pphi_i)\) is normally distributed, we will decompose \(h(\pphi_i)\) into fixed and random effects: \[h(\pphi_i) = h(\pphi_{\rm pop}) + \sum_{\ell=1}^L c_{i\ell}\beta_{\ell} + \eta_i\] Remark: \(\pphi_{\rm pop}\) is the typical value of \(\pphi_i\) for a typical individual of the population whose covariates \(c_{i1}, \ldots, c_{iL}\) are all zero.

Let us consider a model where the volume \(V_i\) is normally distributed and is a linear function of the weight \(w_i\):

\[V_i = \beta_0 + \beta \, w_i + \eta_{V,i}\] Assuming that the weight of a typical individual of the population is \(w_{\rm pop}\), the predicted volume for this individual is not \(\beta_0\) but \(\beta_0 + \beta w_{\rm pop}\).

If we use instead the centered weight \(w_i-w_{\rm pop}\), we can now write the model as \[V_i = V_{\rm pop} + \beta \, (w_i-w_{\rm pop}) + \eta_{V,i} \ .\] Indeed, the predicted volume for a typical individual is now \(V_{\rm pop}\).

Assume that we decide to use 70 kg as the typical weight in the theophylline study. The saemixData object now needs to include \(w_i-70\):

theo.data$w70 <- theo.data$weight - 70
saemix.data<-saemixData(name.data=theo.data,
                        name.group=c("id"),
                        name.predictors=c("time"),
                        name.response=c("concentration"),
                        name.covariates=c("w70"))

Here, only the volume \(V\) is a function of the weight. The covariate model is therefore encoded as the (row) vector (0,1,0).

saemix.model <- saemixModel(model           = model1cpt,
                            psi0            = c(ka=1,V=20,ke=0.5),
                            transform.par   = c(1,0,1),
                            covariate.model = t(c(0,1,0)))

saemix.fit3 <- saemix(saemix.model, saemix.data, saemix.options)
saemix.fit3@results
## Fixed effects
##  Parameter   Estimate   SE     CV(%) p-value
##  ka           1.5774  0.30678 19.45  -      
##  V           32.5546  1.35758  4.17  -      
##  beta_w70(V)  0.3406  0.13543 39.76  0.00595
##  ke           0.0879  0.00638  7.26  -      
##  a            0.7266  0.05498  7.57  -      
## 
## Variance of random effects
##  Parameter Estimate   SE    CV(%)
##  omega2.ka  0.404   0.1795 44.4  
##  omega2.V  14.281   7.7478 54.3  
##  omega2.ke  0.029   0.0225 77.4  
## 
## Statistical criteria
## Likelihood computed by linearisation
##       -2LL= 333.5542 
##        AIC= 349.5542 
##        BIC= 353.4335

Here, \(\hat\beta_{w70} = 0.34\) means that an increase in weight of 1 kg leads to an increase in the predicted volume of 0.34 L.

The p-value of the test \(H_0: \ \beta_{w70}=0\) versus \(H_1: \ \beta_{w70}\neq 0\) is 0.006. We can therefore reject the null hypothesis and conclude that the predicted volume significantly increases with the weight.

Imagine that we now use a log-normal distribution for the volume \(V_i\). It is now the log-volume which is a linear function of the transformed weight.

We can assume, for instance, that the log-volume is a linear function of the centered log-weight: \[\log(V_i) = \log(V_{\rm pop}) + \beta \, \log(w_i/w_{\rm pop}) + \eta_{V,i} \ .\] Or, equivalently, \[V_i = V_{\rm pop} \left(\frac{w_i}{w_{\rm pop}}\right)^{\beta} e^{\eta_{V,i}} \ .\] We see that, using this model, the predicted volume for a typical individual is \(V_{\rm pop}\).

The saemixData object now needs to include \(\log(w_i/70)\) as a covariate,

theo.data$lw70 <- log(theo.data$weight/70)
saemix.data<-saemixData(name.data=theo.data,
                        name.group=c("id"),
                        name.predictors=c("time"),
                        name.response=c("concentration"),
                        name.covariates=c("lw70"))

The covariate model is again encoded as the (row) vector (0,1,0), but the transformation is now encoded as 1 for each of the three parameters:

saemix.model<-saemixModel(model           = model1cpt,
                          psi0            = c(ka=1,V=20,ke=0.5),
                          transform.par   = c(1,1,1),
                          covariate.model = t(c(0,1,0)))

saemix.fit4 <- saemix(saemix.model, saemix.data, saemix.options)
saemix.fit4@results
## Fixed effects
##  Parameter    Estimate   SE     CV(%) p-value
##  ka            1.5857  0.30803 19.43  -      
##  V            32.4727  1.40745  4.33  -      
##  beta_lw70(V)  0.7638  0.29934 39.19  0.00536
##  ke            0.0876  0.00598  6.83  -      
##  a             0.7272  0.05499  7.56  -      
## 
## Variance of random effects
##  Parameter Estimate   SE     CV(%)
##  omega2.ka 0.4030   0.17880 44.4  
##  omega2.V  0.0151   0.00785 52.1  
##  omega2.ke 0.0216   0.01960 90.9  
## 
## Statistical criteria
## Likelihood computed by linearisation
##       -2LL= 333.241 
##        AIC= 349.241 
##        BIC= 353.1203


3.4.4 Correlations between random effects

Until now, the random effects were assumed to be uncorrelated, i.e. the variance-covariance matrix \(\Omega\) was a diagonal matrix (the default covariance model for saemix).

Correlations between random effects can be introduced with the input argument covariance.model, a square matrix of size equal to the number of parameters in the model, giving the variance-covariance structure of the model: 1s correspond to estimated variances (on the diagonal) or covariances (off-diagonal elements). The matrix \(\Omega\) should have a block-diagonal structure.

Consider, for instance, a model where \(\ka\) is fixed in the population, i.e. \(\omega_{\ka}=0\) (and thus \(\ka_i=\ka_{\rm pop}\) for all \(i\)), and where \(\log(V_i)\) and \(\log(\kel_i)\) are correlated, i.e. \(\eta_{V}\) and \(\eta_{\kel}\) are correlated:

saemix.model<-saemixModel(model           = model1cpt,
                          psi0            = c(ka=1,V=20,ke=0.5),
                          transform.par   = c(1,1,1),
                          covariate.model = t(c(0,1,0)),
                          covariance.model = matrix(c(0,0,0,0,1,1,0,1,1),nrow=3))

saemix.fit5 <- saemix(saemix.model, saemix.data, saemix.options)
saemix.fit5@results
## Fixed effects
##  Parameter    Estimate   SE     CV(%) p-value
##  ka            1.5346  0.13199   8.60 -      
##  V            33.1312  1.74154   5.26 -      
##  beta_lw70(V)  0.3301  0.33017 100.01 0.159  
##  ke            0.0827  0.00987  11.93 -      
##  a             1.0777  0.07729   7.17 -      
## 
## Variance of random effects
##  Parameter Estimate   SE     CV(%)
##  omega2.V  0.0172   0.00945 54.8  
##  omega2.ke 0.0924   0.05438 58.9  
## 
## Statistical criteria
## 
## Correlation matrix of random effects
##           omega2.V omega2.ke
## omega2.V   1.000   -0.289   
## omega2.ke -0.289    1.000   
## Likelihood computed by linearisation
##       -2LL= 393.1419 
##        AIC= 409.1419 
##        BIC= 413.0211