Monday, 21st of January (2019)
Flannick & Florez (2016)
Marullo, El-Sayed Moustafa, & Prokopenko (2014)
Scott et al. (2012)
Yaghootkar & Frayling (2013)
"In all cases, the glucose-raising allele was associated with increased risk of T2D, yet fasting glucose effect sizes and T2D ORs were weakly correlated"
Scott et al. (2012)
To identify biomarker relevant to a disease
To identify the effect of a treatment on a disease
(independently of the association between the disease and the biomarker)
For example:
(Extended) Cox model: \[\begin{align} \lambda_i(t)=\lambda_0(t) \exp(\beta Y_i(t) + \alpha Z_i + \eta W_i) \end{align}\]
Where:
coxph(Surv(Time, death) ~ drug, data = aids.id)
coxph(Surv(Time, death) ~ drug, data = aids.id)
#> Call: #> coxph(formula = Surv(Time, death) ~ drug, data = aids.id, x = TRUE) #> #> n= 467, number of events= 188 #> #> coef exp(coef) se(coef) z Pr(>|z|) #> drugddI 0.2102 1.2339 0.1462 1.437 0.151 #> #> exp(coef) exp(-coef) lower .95 upper .95 #> drugddI 1.234 0.8104 0.9264 1.643 #> #> Concordance= 0.531 (se = 0.019 ) #> Rsquare= 0.004 (max possible= 0.99 ) #> Likelihood ratio test= 2.07 on 1 df, p=0.2 #> Wald test = 2.07 on 1 df, p=0.2 #> Score (logrank) test = 2.07 on 1 df, p=0.1
(Generalised) linear mixed effect model: \[Y_{i}(t_{ij})=X_{i}(t_{ij})+\epsilon_{i}(t_{ij})\]
where:
lme(sqrt(CD4) ~ obstime*drug - drug, random = ~ 1|patient, data = aids)
lme(sqrt(CD4) ~ obstime*drug - drug, random = ~ 1|patient, data = aids)
#> Linear mixed-effects model fit by REML #> Data: aids #> AIC BIC logLik #> 2730.729 2756.957 -1360.364 #> #> Random effects: #> Formula: ~1 | patient #> (Intercept) Residual #> StdDev: 0.8719493 0.4106749 #> #> Fixed effects: sqrt(CD4) ~ obstime * drug - drug #> Value Std.Error DF t-value p-value #> (Intercept) 2.5087153 0.04306984 936 58.24761 0.0000 #> obstime -0.0338783 0.00352643 936 -9.60697 0.0000 #> obstime:drugddI 0.0055813 0.00498557 936 1.11949 0.2632 #> Correlation: #> (Intr) obstim #> obstime -0.153 #> obstime:drugddI 0.003 -0.691 #> #> Standardized Within-Group Residuals: #> Min Q1 Med Q3 Max #> -5.09514877 -0.46243667 0.01324625 0.48993786 5.17139372 #> #> Number of Observations: 1405 #> Number of Groups: 467
fitCOX <- coxph(Surv(Time, death) ~ drug, data = aids.id, x = TRUE) fitLME <- lme(sqrt(CD4) ~ obstime*drug - drug, random = ~ 1 | patient, data = aids) fitJOINT <- jointModel(fitLME, fitCOX, timeVar = "obstime", method = "piecewise-PH-aGH")
#> Variance Components: #> StdDev #> (Intercept) 0.8793445 #> Residual 0.4093577 #> #> Coefficients: #> Longitudinal Process #> Value Std.Err z-value p-value #> (Intercept) 2.5100 0.0434 57.8778 <0.0001 #> obstime -0.0362 0.0035 -10.3474 <0.0001 #> obstime:drugddI 0.0045 0.0049 0.9173 0.3590 #> #> Event Process #> Value Std.Err z-value p-value #> drugddI 0.3458 0.1523 2.2715 0.0231 #> Assoct -1.0866 0.1184 -9.1786 <0.0001 #> log(xi.1) -1.6560 0.2529 -6.5475 #> log(xi.*) ...... ..... ......
Test simultaneously an effect on:
A gain in statistical power to detect those effects
The standard (joint likelihood) formulation involves two components:
With:
(Generalised) linear mixed effect model: \[Y_{i}(t_{ij})=X_{i}(t_{ij})+\epsilon_{i}(t_{ij})\]
\(X_{ij}\) is the trajectory function, and could be defined: \[\begin{gather}X_{i}(t_{ij})=\theta_{0i} + \theta_{1i}t_{ij} + \cdots + \theta_{pi}t_{ij}^p &, & \boldsymbol\theta_p \sim \mathcal{N}(\boldsymbol\mu_, \boldsymbol\Sigma)\end{gather}\]
For simplicity here, we assume linearity over time (\(\theta_{0i}+\theta_{1i}t_{ij}\)): \[\begin{gather}Y_{i}(t_{ij})=\theta_{0i}+\theta_{1i}t_{ij}+\gamma Z_i+\delta W_i+\epsilon_{ij} &, & \boldsymbol{\theta} \sim \mathcal{N}_2 (\boldsymbol{\mu},\boldsymbol{\Sigma})\end{gather}\]
(Generalised) linear mixed effect model: \[Y_{i}(t_{ij})=\theta_{0i}+\theta_{1i}t_{ij}+\gamma Z_i+\delta W_i+\epsilon_{ij}\]
With:
(Extended) Cox model (proportional hazards): \[\begin{align} \lambda_i(t)&=\lim_{dt \to 0} \frac{P\{t\leq T_i<t+dt|T_i\geq t, \bar{Y_i}(t), Z_i, W_i\}}{dt}\\ &=\lambda_0(t) \exp\{\beta X_{i}(t) + \alpha Z_i + \eta W_i\} \end{align}\]
With:
Likelihood Ratio Test
\[LRT=-2\{\ell(\hat{\theta}_0)-\ell(\hat{\theta})\}\]
Wald Test
\[\begin{gather}
W=(\hat{\theta}-\theta_0)^\top \mathcal{I}(\hat{\theta})(\hat{\theta}-\theta_0)\\
\left(\text{Univariate: }(\hat{\theta}_j-\theta_{0j})/\widehat{\text{s.e.}}(\hat{\theta}_j)\right)
\end{gather}\]
Score Test
\[U=S^\top(\hat{\theta}_0)\{\mathcal{I}(\hat{\theta}_0)\}^{-1}S(\hat{\theta}_0)\]
What if, we split the job in two?
\(\Rightarrow\) "Two-Step"? (Tsiatis, DeGruttola, & Wulfsohn, 1995)
(Generalised) linear mixed effect model
\[\begin{align}
Y_{i}(t)&=X_i(t)+ \epsilon_{i}(t)\\
X^*_{i}(t)&=E\{X_{i}(t)|\bar{Y_i}(t), T_i\geq t\}
\end{align}\]
(Extended) Cox model (proportional hazards)
\[h_i(t)=h_0(t) \exp\{\beta X^*_{i}(t)\}\]
Let's keep it simple, i.e., without covariates:
the trajectory: \(Y_{i}(t)=\theta_{0i} + \theta_{1i}t + \gamma Z_i + \epsilon_{i}(t)\)
the event: \(\lambda_i(t)=\lambda_0(t) \exp\{\beta X_{i}(t) + \alpha Z_i\}\)
the time of event, e.g., the exponential distribution (Austin, 2012):
\[\begin{gather}
H_i(T_i)=\int_0^{T_i}\lambda_0(t) \exp(\beta X_i(t)+\alpha Z_i)dt , & \lambda_0(t)=\lambda\\
F_i(T_i)=1-exp(-H_i(T_i))=u , & u\sim\mathcal{U}(0, 1)
\end{gather}\] \[T_i=\frac{1}{\beta\theta_{1i}}\log\left(1-\frac{\beta\theta_{1i}\times \log(1-u)}{\lambda \exp(\beta\theta_{0i}+(\beta\gamma+\alpha)Z_i)}\right)\]
Scott et al. (2012)
Yaghootkar & Frayling (2013)
Parameters | Values |
---|---|
Number of participants (\(n\)) | 4,352 |
Number of measures (\(m\)) | 4 |
Diabetes incidence rate (\(d\)) | 0.0384 |
Minor allele frequency (\(f\)) | 0.244 |
Random effects (\(\theta\)) | \(\sim\mathcal{N}_2\left (\begin{bmatrix}4.55\\0.0108\end{bmatrix} , \begin{bmatrix} 0.143 & -0.00109 \\ -0.00109 & 6.8\times 10^{-04} \end{bmatrix} \right )\) |
SNP effect on \(Y_{ij}\) (\(\gamma\)) | 0.0229 |
SNP effect on \(T_i\) (\(\alpha\)) | 0.265 (OR=1.3) |
Association between \(Y_{ij}\) and \(T_i\) (\(\beta\)) | 3.17 |
Error term (\(\epsilon\)) | \(\sim\mathcal{N}(0,0.305^2)\) |
Let's do some power calculation (chen_sample_2011):
\[\begin{gather}
H_0:\ \beta\gamma+\alpha=0 \\
d=\frac{(z_{\tilde{\beta}}+z_{1-\tilde{\alpha}})^2}{f(1-f)(\beta\gamma+\alpha)^2} \\
z_{\tilde{\beta}}=\pm\sqrt{df(1-f)(\beta\gamma+\alpha)^2}+z_{1-\tilde{\alpha}}
\end{gather}\]
\(f\), the allele frequency;
\(\Rightarrow\tilde{\beta}_{\tilde{\alpha}}=46.56\%\) for TCF7L2 (rs17747324).