Context
Latent variables
Latent process model
Joint latent class model
Estimation
Application
Latent process model for multivariate heterogeneous longitudinal data: application to cognitive aging
Cécile Proust-Lima & Hélène Jacqmin-Gadda
Department of Biostatistics, INSERM U897, University of Bordeaux 2
INSERM workshop 205 - June 2010 - Saint Raphael
Conclusion
Cognitive aging in the elderly
Dementia: characterized by a progressive and continuous decline of cognitive functions
→ heterogeneous cognitive aging: normal / pathological
Cognition: latent process defined in continuous time
→ interest in the evolution of this quantity
Psychometric tests: noisy measures of cognitive functions
→ collected at discrete times
→ usually one test used as a reference marker of cognition
→ specific metrological properties (ceiling/floor effects, ...)
Objective
Describe the different profiles of cognitive decline associated with dementia in the elderly
2 statistical problems addressed:
1. multiple markers of cognition (& different properties) → nonlinear latent process model
2. heterogeneity of the declines & association with dementia → joint latent class model
⇒ Joint latent class model for multivariate longitudinal data
Latent variable modelling (LVM)
Interest in a latent variable ("construct") measured by observed outcomes
[Diagram: latent variable (Cognition) → observed outcomes (Test 1, ..., Test k, ..., Test K)]
ex1: cognition measured by psychometric tests
ex2: arithmetic reasoning measured by a multi-item questionnaire
Principle:
- Structural equations: latent variable described according to covariates, time, etc.
- Measurement model: link between the latent quantity and the outcomes
LVM in longitudinal settings
Latent process rather than latent variables defined at each time
→ linear mixed model for the latent process
Different types of outcomes:
- Quantitative outcomes:
  standard: Gaussian (Roy, 2000)
  asymmetric scales: non-Gaussian (Proust, 2006)
- Ordinal outcomes: threshold models (probit (Liu, 2006); proportional odds (Hambleton, 1991))
Symmetric quantitative outcome
[Figure: outcome Y against the latent process]
→ Same sensitivity at each level
Asymmetric quantitative outcome
[Figure: outcome Y against the latent process]
→ Varying sensitivity depending on the level
Ordinal outcome
[Figure: outcome Y (levels 0 to 5) against the latent variable]
→ Range of latent process values for a given test value
Structural model for the latent process
Notations: subject i, occasion j, outcome k
[Diagram: the latent process Λ (latent process model) is linked through the measurement model to the observed outcomes: ordinal outcomes $Y_1, \dots, Y_{K_1}$ and quantitative outcomes $Y_{K_1+1}, \dots, Y_K$]
$$\Lambda_i(t) = X_{1i}(t)^T \beta + Z_i(t)^T u_i, \quad t \ge 0$$
with $u_i \sim MVN(\mu, D)$ and identifiability constraint $u_{i0} \sim N(0, 1)$
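To make the structural equation concrete, here is a minimal simulation sketch for one subject. It assumes a linear trajectory, that is $X_{1i}(t) = Z_i(t) = (1, t)$, with $\mu = 0$ and a diagonal $D$. The function name, the slope standard deviation and the values of $\beta$ are illustrative assumptions, not quantities from the slides.

```python
import random


def simulate_latent_process(times, beta, rng, sd_slope=0.1):
    """Simulate Lambda_i(t) = (beta0 + u0) + (beta1 + u1) * t for one subject.

    Assumptions (illustration only): X_1i(t) = Z_i(t) = (1, t), mu = 0 and
    D = diag(1, sd_slope**2). The random intercept has unit variance, in the
    spirit of the identifiability constraint on the slide.
    """
    u0 = rng.gauss(0.0, 1.0)        # random intercept, N(0, 1)
    u1 = rng.gauss(0.0, sd_slope)   # random slope, hypothetical SD
    return [(beta[0] + u0) + (beta[1] + u1) * t for t in times]


# Usage: one subject observed at t = 0, 1, 2 (time since entry, arbitrary unit).
trajectory = simulate_latent_process([0.0, 1.0, 2.0], (0.0, -0.2), random.Random(1))
```

The trajectory is defined for any $t \ge 0$; the observation times only decide where it is evaluated.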
Measurement models for ordinal outcomes
Intermediate variable $\tilde y$ (with error, outcome-specific effects, ...):
$$\tilde y_{ijk} = \Lambda_i(t_{ijk}) + X_{2i}(t_{ijk})^T \gamma_k + \alpha_{ik} + \epsilon_{ijk}$$
with outcome-specific random intercept $\alpha_{ik} \sim N(0, \sigma^2_{\alpha_k})$
- Ordinal/binary outcome $Y_k$ with $C_k$ levels:
$$Y_{ijk} = c \Leftrightarrow \eta_{ck} \le \tilde y_{ijk} < \eta_{(c+1)k}, \quad c \in \{0, \dots, C_k - 1\}$$
→ constraints: $\eta_{0k} = -\infty$ and $\eta_{C_k k} = +\infty$
→ Cumulative probit model with Gaussian $\epsilon_{ijk}$ and proportional odds model with logistic $\epsilon_{ijk}$
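The threshold mechanism can be sketched as follows for the cumulative probit case. The sketch assumes the intermediate variable is Gaussian with a known mean and a single standard deviation `sd` that lumps together the random intercept and the error. The function name and the numeric values are hypothetical.

```python
from statistics import NormalDist


def ordinal_probabilities(latent_mean, thresholds, sd=1.0):
    """Return P(Y = c) for c = 0, ..., C-1 under a cumulative probit model.

    `thresholds` holds the finite, increasing cut-points eta_1 < ... < eta_{C-1};
    eta_0 = -inf and eta_C = +inf are handled by the 0.0 and 1.0 end values.
    """
    cdf = NormalDist(mu=latent_mean, sigma=sd).cdf
    cumulative = [0.0] + [cdf(eta) for eta in thresholds] + [1.0]
    return [cumulative[c + 1] - cumulative[c] for c in range(len(cumulative) - 1)]


# Usage: a 4-level outcome with three hypothetical thresholds.
probs = ordinal_probabilities(latent_mean=0.3, thresholds=[-1.0, 0.0, 1.5])
```

Replacing the Gaussian CDF by a logistic CDF would give the proportional odds version.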
Measurement models for quantitative outcomes
- Gaussian outcomes $Y_k$:
$$\frac{Y_{ijk} - \eta_{1k}}{\eta_{2k}} = \tilde y_{ijk}$$
- Non-Gaussian quantitative outcomes $Y_k$:
$$H_k(y_{ijk}; \eta) = \frac{h_k(y_{ijk}; \eta_{1k}, \eta_{2k}) - \eta_{3k}}{\eta_{4k}} = \tilde y_{ijk}$$
→ $h_k$ = Beta CDF (Proust, Bcs, 2006; Proust-Lima, CSDA 2009) ($h_k(\cdot; 1, 1)$ = identity ⇔ special case for Gaussian outcomes)
→ $h_k$ approximated by splines ...
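A small sketch of the Beta CDF transformation may help to see why $h_k(\cdot; 1, 1)$ reduces to the identity. It assumes that scores are first rescaled to (0, 1) and that $\eta_{1k}, \eta_{2k}$ are used directly as the two Beta shape parameters; the cited papers may use a different parameterization. The Beta CDF is approximated with a plain midpoint rule to stay within the standard library, and all names are hypothetical.

```python
import math


def beta_cdf(x, a, b, n_steps=2000):
    """Approximate the Beta(a, b) CDF at x in [0, 1] with the midpoint rule."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    width = x / n_steps
    area = 0.0
    for step in range(n_steps):
        t = (step + 0.5) * width                      # midpoint of the sub-interval
        area += t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)
    normalizer = math.gamma(a) * math.gamma(b) / math.gamma(a + b)  # B(a, b)
    return area * width / normalizer


def transform_score(y, y_min, y_max, eta1, eta2, eta3, eta4):
    """H_k(y) = (h_k(y; eta1, eta2) - eta3) / eta4 with h_k a Beta CDF."""
    rescaled = (y - y_min) / (y_max - y_min)          # assumption: rescale to (0, 1)
    return (beta_cdf(rescaled, eta1, eta2) - eta3) / eta4


# Usage: with eta1 = eta2 = 1 the Beta CDF is the identity on the rescaled score.
value = transform_score(15, 0, 30, 1.0, 1.0, 0.0, 1.0)
```

With shape parameters other than (1, 1) the curve bends, which is how the model captures varying sensitivity of a test along the latent scale.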
Estimated transformations for 4 psychometric tests (Proust-Lima et al., AJE, 2007)
[Figure: four panels (MMSE, IST15, BVRT, DSST), each plotting the test score against the common factor (0 to 100); legend: estimated $h_k$, 95% CI, y=x]
Joint latent class model (Lin et al., JASA, 2002)
[Diagram: latent class C (latent) linked to the longitudinal marker Y and to the event (T,E) (observed)]
With a single marker,
- Latent classes of subjects
→ latent class membership:
$$\pi_{ig} = P(c_i = g \mid X_{1i}) = \frac{e^{\xi_{0g} + X_{1i}^T \xi_{1g}}}{\sum_{l=1}^{G} e^{\xi_{0l} + X_{1i}^T \xi_{1l}}}$$
- Given class g,
→ specific marker evolution
→ specific risk of event
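The class-membership formula is a multinomial logistic (softmax) function of the covariates. A minimal sketch, with hypothetical parameter values and a reference class whose parameters are set to zero (a usual identifiability choice, assumed here):

```python
import math


def class_membership_probs(x, xi0, xi1):
    """Return [pi_i1, ..., pi_iG] for covariate vector x.

    xi0[g] is the intercept of class g and xi1[g] its covariate effects.
    """
    scores = [b0 + sum(xj * bj for xj, bj in zip(x, b1)) for b0, b1 in zip(xi0, xi1)]
    shift = max(scores)                         # subtract the max for numerical stability
    weights = [math.exp(s - shift) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Usage: 3 classes, 2 binary covariates, last class as reference (all zeros).
probs = class_membership_probs(
    x=[1.0, 0.0],
    xi0=[0.5, -1.0, 0.0],
    xi1=[[0.3, -0.2], [0.8, 0.1], [0.0, 0.0]],
)
```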
Extension to heterogeneous population: JLCM
Notations: subject i, occasion j, class g, marker k
[Diagram: latent class C (multinomial logistic model) linked to the latent process Λ (nonlinear latent process model, observed through markers $Y_1, \dots, Y_k, \dots, Y_K$) and to the event (T,E) (proportional hazard model)]
$$\Lambda_i(t) \mid_{c_i = g} = Z_i(t)^T u_{ig} + X_{2i}(t)^T \beta_g$$
← heterogeneous mixed model
$u_{0i1} \sim N(0, 1)$ ← constraints
$Y_{ijk} \mid \Lambda_i(t_{ijk}), c_i = g$ ← marker-specific observation equation
Individual contribution to the likelihood
For a given number of classes G, the individual contribution is:
$$L_i(\theta) = \sum_{g=1}^{G} \pi_{ig}(\theta) \times f(y_i \mid c_i = g; \theta) \times \lambda(T_i \mid c_i = g; \theta)^{E_i} \, S(T_i \mid c_i = g; \theta)$$
with $f(y_i \mid c_i = g; \theta)$:
- closed form for quantitative outcomes (Jacobian)
- multivariate numerical integral over $u_{ig}$ & $\alpha_{ik}$ for ordinal outcomes
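The mixture structure of the individual contribution can be sketched as below. The sketch assumes that the class-specific pieces (membership probability, marker density, hazard and survival at $T_i$) have already been evaluated for the subject; how they are computed depends on the model and is not shown. The function name and the numbers are hypothetical.

```python
def likelihood_contribution(pi, f_y, hazard, survival, event):
    """L_i = sum_g pi_g * f_g(y_i) * hazard_g(T_i)**E_i * S_g(T_i).

    All list arguments have one entry per latent class; `event` is E_i (0 or 1).
    """
    return sum(
        p * f * (lam ** event) * s
        for p, f, lam, s in zip(pi, f_y, hazard, survival)
    )


# Usage: 2 classes, subject with an observed event (E_i = 1).
contribution = likelihood_contribution(
    pi=[0.5, 0.5], f_y=[0.2, 0.4], hazard=[0.1, 0.3], survival=[0.9, 0.5], event=1
)
```

For a censored subject (`event=0`) the hazard term drops out and only the survival function contributes.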
Maximum likelihood estimators
- Log-likelihood $l(\psi) = \sum_{i=1}^{N} \ln(L_i)$ maximised by a Marquardt algorithm
- Estimation achieved for a fixed number of latent classes G; G selected using the Bayesian Information Criterion (BIC)
- Program in Fortran90 / R function in progress ...
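Selecting G by BIC amounts to fitting the model for several values of G and keeping the one with the smallest criterion. A minimal sketch, using the standard definition BIC = -2 log-likelihood + (number of parameters) × ln(N); the log-likelihoods and parameter counts in the usage example are made-up numbers, not results from the application.

```python
import math


def bic(log_likelihood, n_params, n_subjects):
    """Bayesian Information Criterion; smaller is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_subjects)


def select_number_of_classes(fits, n_subjects):
    """`fits` maps G to (maximised log-likelihood, number of parameters)."""
    return min(fits, key=lambda g: bic(fits[g][0], fits[g][1], n_subjects))


# Usage with hypothetical fits for G = 1, 2, 3 and N = 500 subjects.
best_g = select_number_of_classes(
    {1: (-1000.0, 5), 2: (-980.0, 9), 3: (-978.0, 13)}, n_subjects=500
)
```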
Posterior classification
2 posterior class-membership probabilities:
$$\hat\pi_{ig}^{y,T} = P(c_i = g \mid y_i, (T_i, E_i), x_i; \hat\theta)$$
→ used to assess the goodness-of-fit
$$\hat\pi_{ig}^{y} = P(c_i = g \mid y_i, x_i; \hat\theta) = \frac{P(c_i = g \mid x_i; \hat\theta) \, f(y_i \mid c_i = g, x_i; \hat\theta)}{\sum_{l=1}^{G} P(c_i = l \mid x_i; \hat\theta) \, f(y_i \mid c_i = l, x_i; \hat\theta)}$$
→ used for prognostic tools
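The second posterior probability is a direct application of Bayes' rule. A minimal sketch, assuming the prior class probabilities and the class-specific marker densities have already been evaluated at the estimated parameters; the names and numbers are hypothetical.

```python
def posterior_class_probs(prior, density):
    """P(c_i = g | y_i, x_i) from P(c_i = g | x_i) and f(y_i | c_i = g, x_i)."""
    joint = [p * f for p, f in zip(prior, density)]
    total = sum(joint)
    return [j / total for j in joint]


# Usage: 2 classes with equal prior probability.
posterior = posterior_class_probs(prior=[0.5, 0.5], density=[0.2, 0.6])
```

The first posterior probability follows the same pattern, with the density multiplied by the class-specific survival contribution.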
Prediction: prognostic / early detection tools
[Diagram: marker / latent process evolution along age; $H_i(s)$: marker information until time s; probability of event between s and s+t?]
Predicted probability of event in (s, s+t):
$$P(T_i \le s+t \mid T_i > s, H_i(s), X_i; \hat\theta) = \sum_{g=1}^{G} P(T_i \le s+t \mid c_i = g, T_i > s, X_i; \hat\theta) \times \underbrace{P(c_i = g \mid H_i(s), X_i, T_i > s; \hat\theta)}_{\hat\pi_{ig}^{y_s}}$$
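The predicted probability is a weighted average of class-specific conditional risks. The sketch below assumes the posterior class probabilities given the history up to s are available, and uses the identity $P(T \le s+t \mid c = g, T > s) = 1 - S_g(s+t)/S_g(s)$ with class-specific survival functions $S_g$. The function name and the numbers are hypothetical.

```python
def predicted_event_probability(posterior, surv_s, surv_s_plus_t):
    """Sum over classes of posterior_g * (1 - S_g(s + t) / S_g(s)).

    posterior[g]     : P(c_i = g | H_i(s), X_i, T_i > s)
    surv_s[g]        : class-specific survival at time s
    surv_s_plus_t[g] : class-specific survival at time s + t
    """
    return sum(
        w * (1.0 - s_later / s_now)
        for w, s_now, s_later in zip(posterior, surv_s, surv_s_plus_t)
    )


# Usage: 2 classes, one with low and one with high risk over (s, s + t).
risk = predicted_event_probability(
    posterior=[0.5, 0.5], surv_s=[1.0, 0.8], surv_s_plus_t=[0.9, 0.4]
)
```

Updating the prediction at a later time s only requires recomputing the posterior weights with the longer marker history.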
Profiles of semantic memory decline associated with onset of Alzheimer's disease (AD) in the elderly
- Longitudinal outcomes: 2 measures of semantic memory
→ 1 ordinal similarities test (WST, scale 0-10)
→ 1 discrete quantitative fluency test (IST, scale 0-40)
- Time-to-event: age at onset of AD
→ truncated data: entry in the cohort at age > 65
- Binary covariates: education, gender
- Subsample from a French cohort on aging (PAQUID): N=2484
→ followed up for 14 years
→ 417 (16.8%) incident AD
Distribution of the tests
[Figure: histograms of frequency (%) by score; panel A: WST (0 to 10); panel B: IST (0 to 40)]
→ Median of 3 (IQR=[1,5]) repeated measures for IST
→ Median of 4 (IQR=[2,6]) repeated measures for WST
Predicted mean evolution of the latent process and probability of being free of dementia
[Figure, panel A: predicted mean evolution of the latent process (latent semantic memory) in each class (class 1, class 2, class 3) against age (in years, 65 to 90); panel B: predicted probability of being free of AD in each class against age (in years, 65 to 90)]
Predicted transformations of the markers
[Figure: predicted transformations of WST (0 to 10) and IST (0 to 40) against latent semantic memory]
Goodness-of-fit: class-specific marginal predictions
- For an ordinal outcome ($k = 1, \dots, K_1$):
$$\hat y_{ijk} \mid_{c_i = g} = E(y_{ijk} \mid \hat\theta; c_i = g) = \sum_{l=0}^{C_k - 1} l \times P(\eta_{lk} \le \tilde y_{ijk} < \eta_{(l+1)k} \mid \hat\theta; c_i = g) = C_k - 1 - \sum_{l=0}^{C_k - 2} P(\tilde y_{ijk} < \eta_{(l+1)k} \mid \hat\theta; c_i = g)$$
- For a quantitative outcome ($k = K_1 + 1, \dots, K$):
$$\hat y_{ijk} \mid_{c_i = g} = E(H_k^{-1}(\tilde y_{ijk}; \hat\eta_k) \mid \hat\theta; c_i = g)$$
→ numerical integration of $H_k^{-1}(\tilde y_{ijk}; \hat\eta_k)$ over the multivariate Gaussian distribution of $\tilde y_{ik} \mid_{c_i = g}$.
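The two expressions for the ordinal prediction can be checked numerically. The sketch below assumes a probit model in which the marginal distribution of $\tilde y_{ijk}$ in class g is Gaussian with a given mean and standard deviation; the function names and values are hypothetical.

```python
from statistics import NormalDist


def expected_ordinal_direct(mean, sd, thresholds):
    """Sum over levels l of l * P(eta_l <= y_tilde < eta_{l+1})."""
    cdf = NormalDist(mu=mean, sigma=sd).cdf
    cumulative = [0.0] + [cdf(eta) for eta in thresholds] + [1.0]
    return sum(
        level * (cumulative[level + 1] - cumulative[level])
        for level in range(len(cumulative) - 1)
    )


def expected_ordinal_shortcut(mean, sd, thresholds):
    """C_k - 1 minus the sum of P(y_tilde < eta) over the finite thresholds."""
    cdf = NormalDist(mu=mean, sigma=sd).cdf
    n_levels = len(thresholds) + 1
    return n_levels - 1 - sum(cdf(eta) for eta in thresholds)


# Usage: both forms agree for a 4-level outcome.
direct = expected_ordinal_direct(0.3, 1.2, [-1.0, 0.0, 1.5])
shortcut = expected_ordinal_shortcut(0.3, 1.2, [-1.0, 0.0, 1.5])
```

The shortcut form needs one CDF evaluation per finite threshold and no differences of probabilities.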
Class-specific marginal predictions in the test scale
[Figure: six panels (WST and IST in classes 1, 2 and 3), each plotting the test score against age (year, 65 to 100)]
Goodness-of-fit: table of posterior classification
Final classif.   Number of subjects (%)   Mean of the class-membership probabilities (%) in class:
                                          1       2       3
1                2074 (83.5%)             82.9    2.3     14.8
2                142 (5.7%)               8.7     78.3    13.0
3                268 (10.8%)              18.5    7.7     73.8
→ unambiguous posterior classification
Dynamic predictive tool of AD
Probability of dementia in 5 years updated every 3 years
[Figure: for a subject diagnosed at 87 years old, IST (x) and WST (•) scores (left axis, 0 to 40) and predicted probability of AD (+, with 95% CI; right axis, 0 to 0.8) against age (years, 74 to 88)]
Concluding remarks
Advantages of the model:
- several markers (latent process part)
→ avoids biases due to nonlinearity + ordinal scales
→ increases the power of the analyses
- time-to-event (joint model part)
→ avoids the selection biases
- latent class approach
→ explicit interpretation of the association + heterogeneity
Possible applications:
- describe the natural history of a disease
- evaluate risk factors, treatments, ...
- develop tools for early detection/prognosis
References
- Hambleton R., Swaminathan H. & Rogers H. (1991). Fundamentals of item response theory. Newbury Park, CA: Sage.
- Lin H., Turnbull B.W. et al. (2002). Latent class models for joint analysis of longitudinal biomarker and event process data: application to longitudinal prostate-specific antigen readings and prostate cancer. Journal of the American Statistical Association, 97, 53-65.
- Proust C., Jacqmin-Gadda H. et al. (2006). A nonlinear model with latent process for cognitive evolution using multivariate longitudinal data. Biometrics, 62, 1014-24.
- Proust-Lima C., Amieva H. et al. (2007). Properties of 4 psychometric tests to measure cognitive changes in brain aging population-based studies. American Journal of Epidemiology, 165, 344-50.
- Proust-Lima C., Joly P. et al. (2008). Joint modelling of multivariate longitudinal outcomes & a time-to-event: a nonlinear latent class approach. Computational Statistics & Data Analysis, 53, 1142-54.
- Roy J. & Lin X. (2000). Latent variable models for longitudinal data with multiple continuous outcomes. Biometrics, 56, 1047-1054.