DIC and other model selection criteria in risk mapping

Ioana Molnar (1)
in collaboration with Sylvain Coly (1,2), Myriam Charras-Garrido (1), David Abrial (1) and Anne-Françoise Yao-Lafourcade (2)

(1) Unité d’Épidémiologie Animale (EPIA), Centre INRA Auvergne – Rhône-Alpes
(2) Laboratoire de Mathématiques, Université Blaise Pascal, Clermont-Ferrand

48èmes Journées de Statistique, 30 May to 3 June 2016
Montpellier, 31 May 2016
Disease risk mapping
- tool used in spatial statistics for the analysis of the risk underlying the observed incidence of a disease.
• Risk estimation: Bayesian methods
• A priori information:
  • case data
  • population data
  • parametric distribution of cases
  • risk structure
• Risk representation: smooth maps
[Figure: three example maps, with legend ranges 0 to 22, 1 to 4147 and 0.11 to 131.00]
Model selection in risk mapping
• Properties of a good model?
  • fits the data well
  • is parsimonious
  • performs data smoothing
    (parsimony and smoothing are frequently related)
  • has good explanatory (and predictive) power
• Properties of a good model selection criterion?
  • identifies the best models
  • filters out the unsuitable models
  • ranks the pertinent ones
  • easily implementable, easy-to-use
  • robust
The DIC in risk mapping
DIC = Deviance Information Criterion (Spiegelhalter et al., 2002)
- based on the Bayesian deviance $D(\theta) := -2 \ln p(Y \mid \theta)$
- measures the inadequacy, $\overline{D(\theta)}$, penalized by the complexity, $p_D := \overline{D(\theta)} - D(\bar{\theta})$ (effective number of parameters)
Definition: $\mathrm{DIC} := 2\,\overline{D(\theta)} - D(\bar{\theta}) = \overline{D(\theta)} + p_D$.

Advantages
• No need to compute the number of parameters
• Easy computation using MCMC samples
• Lack of a strong competitor

Shortcomings
• pD sometimes negative
• Lacks invariance to re-parametrization, and others
• Favors overfitted models
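As an illustration of the "easy computation using MCMC samples" point, here is a minimal Python sketch of the DIC and pD for independent Poisson counts. The function names, the toy counts and the conjugate gamma posterior used to fake MCMC draws are assumptions made for this example only; they are not the models or the data of the study.

```python
import math
import random


def poisson_deviance(y, lam):
    """Bayesian deviance D(theta) = -2 ln p(Y | theta) for independent Poisson counts."""
    loglik = sum(yi * math.log(li) - li - math.lgamma(yi + 1) for yi, li in zip(y, lam))
    return -2.0 * loglik


def dic_from_draws(y, draws):
    """Return (DIC, pD, mean deviance) from posterior draws of the Poisson means.

    draws is a list of posterior draws; each draw holds one mean per observation.
    """
    n_draws = len(draws)
    # Posterior mean of the deviance: the "inadequacy" term.
    mean_dev = sum(poisson_deviance(y, d) for d in draws) / n_draws
    # Deviance evaluated at the posterior mean of the parameters.
    post_mean = [sum(d[i] for d in draws) / n_draws for i in range(len(y))]
    dev_at_mean = poisson_deviance(y, post_mean)
    p_d = mean_dev - dev_at_mean  # effective number of parameters
    return mean_dev + p_d, p_d, mean_dev  # DIC = 2 * mean_dev - dev_at_mean


# Toy example (hypothetical counts). With a Gamma(1, 1) prior on each mean, the
# posterior is Gamma(shape = y + 1, rate = 2), i.e. scale 0.5 for gammavariate.
random.seed(0)
y = [3, 0, 5, 2]
draws = [[random.gammavariate(yi + 1, 0.5) for yi in y] for _ in range(2000)]
dic, p_d, mean_dev = dic_from_draws(y, draws)
print(f"DIC = {dic:.2f}, pD = {p_d:.2f}, mean deviance = {mean_dev:.2f}")
```

In this toy case the deviance is convex in the Poisson means, so pD cannot be negative; the negative values listed among the shortcomings arise in less regular settings.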
Context of study of the DIC: the data
Two types of data:
- real data of bovine tuberculosis in France: overdispersed
- simulated data from a known risk: generated by BN (negative binomial)
[Figure: two series of yearly maps, 2001 to 2010, with legend ranges 0.00 to 22.00 and 0.78 to 13.00]
Goal: select the models for which the risk maps:
- fit the data
- are smooth enough (consistency with Moran’s I?)
- show well-delimited structures
- are close to the real underlying risk (consistency with MSE / Spearman’s ρ?)
Context of study of the DIC: the models
- spatio-temporal, three-level hierarchical Bayesian models
Data: $Y_i^j$ = number of cases in region $i$ and period $j$, with $i = 1, \dots, n := 448$ and $j = 1, \dots, m := 10$.
Modeling:
• 1st level, the distribution of the cases:
  $Y_i^j \sim \mathcal{P}(\lambda_i^j)$
  or
  $Y_i^j \sim \mathcal{P}(\lambda_i^j)$, $\lambda_i^j \sim \gamma(\alpha_i^j, \beta_i^j)$
  Parameter of interest: $\lambda_i^j \approx$ relative risk $R_i^j$.
• 2nd level, relative risk structure:
  $$\ln R_i^j = \underbrace{a \cdot U_i^j}_{\text{spatial component}} + \underbrace{b \cdot T_i^j}_{\text{temporal component}} + \underbrace{c \cdot V_i^j}_{\text{spatio-temporal component}} + \underbrace{d \cdot \varepsilon_i^j}_{\text{white noise}}$$
  - $U_i^j, T_i^j, V_i^j$: CAR-type processes
  - $a, b, c, d$: weights, with $a, b, c \in \{0, 1\}$ or $a, b, c \sim \gamma(5, 5)$, and $d \in \{0, 1\}$.
• 3rd level, specification of the precision parameters of the normal distributions of $\varepsilon_i^j, U_i^j, T_i^j, V_i^j$: $\gamma(0.01, 0.01)$.
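A short worked derivation may help connect the Poisson-gamma option of the first level to the "BN" label used for the models. It assumes that $\gamma(\alpha, \beta)$ denotes the shape-rate parameterization (the BUGS convention; the slides do not state it), and the indices $i, j$ are dropped for readability. Integrating $\lambda$ out gives a negative binomial marginal:

```latex
\begin{aligned}
P(Y = y) &= \int_0^\infty \frac{\lambda^{y} e^{-\lambda}}{y!}\,
            \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,\lambda^{\alpha-1} e^{-\beta\lambda}\,d\lambda
          = \frac{\Gamma(y+\alpha)}{y!\,\Gamma(\alpha)}
            \left(\frac{\beta}{\beta+1}\right)^{\alpha}
            \left(\frac{1}{\beta+1}\right)^{y}, \\
\mathbb{E}[Y] &= \frac{\alpha}{\beta}, \qquad
\operatorname{Var}(Y) = \frac{\alpha}{\beta} + \frac{\alpha}{\beta^{2}} > \mathbb{E}[Y].
\end{aligned}
```

The variance exceeds the mean, which is why this option is a natural candidate for overdispersed counts.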
Estimation of the parameters
$Y_i^j \sim \mathcal{P}$ or $Y_i^j \sim \mathrm{BN}$
$\ln R_i^j = a U_i^j + b T_i^j + c V_i^j + d\,\varepsilon_i^j$
Number of models: 60, depending on the choice of
- the 1st level distribution: P or BN
- the random effects included: aU, bT, cV and / or ε.
Number of data sets: 1 (real data); 100 (simulated data).
Number of estimation replications: 100 (real data); 1 (simulated data).
Number of parameters: up to ≈ 22 400, depending on the model.
Estimation: by MCMC under OpenBUGS.
[Figure: graph of the hierarchical model linking the precisions τ1 to τ4, the effects U, T, V, ε, the weights a, b, c, the parameters α, β, the intensity λ (P or BN) and the data Y]
BUGS parameters:
- random seed: from 1 to 14 (real data)
- burn-in step: from 10 000 to 80 000 (real data)
- thinning step: 10
- sample length: 10 000.
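The slides give the number of models (60) but not the explicit list. The sketch below is one possible enumeration, offered as an assumption: two first-level distributions, any subset of the structured effects U, T, V (including none), noise on or off, and weights either fixed in {0, 1} or drawn from γ(5, 5), the latter only when at least one structured effect is present. This reading reproduces the count of 60, and it also gives 16 models with neither random weights nor noise, but the authors' actual list may differ.

```python
from itertools import combinations, product

EFFECTS = ("U", "T", "V")


def enumerate_models():
    """List candidate models as (distribution, effects, weight type, noise flag)."""
    subsets = [c for r in range(len(EFFECTS) + 1) for c in combinations(EFFECTS, r)]
    models = []
    for dist, subset, noise in product(("P", "BN"), subsets, (False, True)):
        # Fixed 0/1 weights are always available.
        models.append((dist, subset, "fixed", noise))
        # Assumption: random gamma(5, 5) weights need at least one structured effect.
        if subset:
            models.append((dist, subset, "random", noise))
    return models


models = enumerate_models()
no_weights_no_noise = [m for m in models if m[2] == "fixed" and not m[3]]
print(len(models), len(no_weights_no_noise))
```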
The DIC for the simulated data
Boxplots of the values of the DIC and of the effective number of parameters pD:
[Figure: boxplots of the DIC and of pD across the models, scenario 1]
Comments:
- Identical values for 16 of the models (those with no weights nor noise).
[Figure: yearly maps, 2001 to 2010, for the models (P, U)∗, (P, U, T, V)∗ and (BN, U, V)∗; ∗ = mean over replicates]
- Negative values for pD for several models, null values for many others.
Other classical comparison methods
• Accuracy measures (MSE related)
• Spearman’s rank correlation coefficient ρ
Both of these:
  - are applied to the relative risk (not the count data)
  - may be used only in simulation studies
  - measure the adequacy of the model
  - miss the notion of parsimony
• Spatio-temporal association indicators (Moran’s I, Geary’s c)
  - standard spatial association indices, extended to the spatio-temporal context
  - may be used on real data
  - evaluate the smoothness of the risk maps
  - lack the notion of goodness of fit
∗ Naturally, they won’t favor the same models as the DIC (they have different purposes).
∗ However, we can use them to evaluate some aspects of the DIC’s performance in choosing “good” models.
∗ They can be used to set up a better selection tool.
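To make the smoothness indicator concrete, here is a minimal Python sketch of the standard, purely spatial Moran's I with binary neighbor weights. The regular grid with rook adjacency and the helper names are assumptions for illustration; the spatio-temporal extension used in the study is not reproduced here.

```python
def morans_i(values, neighbors):
    """Moran's I with binary weights.

    values: one number per region.
    neighbors: dict mapping a region index to the list of its neighbor indices.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(len(nb) for nb in neighbors.values())  # sum of all weights
    cross = sum(dev[i] * dev[j] for i, nb in neighbors.items() for j in nb)
    return (n / s0) * cross / sum(d * d for d in dev)


def rook_neighbors(n_rows, n_cols):
    """Neighbors sharing an edge on a regular grid (hypothetical map layout)."""
    nb = {}
    for r in range(n_rows):
        for c in range(n_cols):
            cells = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            nb[r * n_cols + c] = [rr * n_cols + cc for rr, cc in cells
                                  if 0 <= rr < n_rows and 0 <= cc < n_cols]
    return nb


grid = rook_neighbors(4, 4)
smooth = [float(r) for r in range(4) for _ in range(4)]              # north-south gradient
checker = [float((r + c) % 2) for r in range(4) for c in range(4)]  # checkerboard
print(morans_i(smooth, grid), morans_i(checker, grid))
```

On this toy grid the smooth gradient gives I = 2/3 and the checkerboard gives I = -1, which matches the intended reading: values close to 1 indicate smooth maps.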
The DIC vs. Spearman’s ρ (simulated data)
Spearman’s rho:
- used to evaluate the performance of the DIC (even if they have different purposes)
[Figure: Spearman’s coefficient (vertical axis, 0.00 to 0.65) for each model of scenario 1, by included effects U, T, V, ε; markers for the smallest and second smallest DIC values; reference levels for “strong correlation” and “significantly non-null”]
∗ Models with good DIC values show highly variable ρ’s.
The DIC vs. Spearman’s ρ (simulated data)
Spearman’s rho:
- used to evaluate the performance of the DIC (even if they have different purposes)
[Figure: Spearman’s coefficient (vertical axis, 0.10 to 0.75) for each model of scenario 2, by included effects U, T, V, ε; markers for the smallest and second smallest DIC values; reference levels for “strong correlation” and “significantly non-null”]
∗ Models with good DIC values show highly variable ρ’s.
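For reference, Spearman's ρ between a known risk and an estimated risk can be computed as the Pearson correlation of their ranks. The sketch below uses only the standard library; the risk values are invented for the example and are not results from the study.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical relative risks for five regions.
true_risk = [0.5, 0.8, 1.0, 1.6, 3.0]
estimate_a = [0.6, 0.7, 1.1, 1.4, 2.2]  # preserves the ordering of the true risk
estimate_b = [1.2, 0.9, 1.0, 1.1, 0.8]  # reverses the extremes
print(spearman_rho(true_risk, estimate_a), spearman_rho(true_risk, estimate_b))
```

Because ρ depends only on ranks, an estimate can score well while being badly scaled, which is one reason it is paired with MSE-type accuracy measures.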
Proposal of new selection methods
• Criteria coupling: C(C1, C2)
  - Filters out models that are “bad” according to C1, and selects one good model according to C2
  - Order matters
  - Easy to apply, but needs some post-hoc analysis
• A new deviance criterion: Smoothness Deviance Criterion (SDC)
  - Replaces the penalization by the complexity measure pD with a penalization for the lack of smoothness:
    $$\mathrm{SDC} = \left(\overline{D(\theta)} + \xi\right)\left(1 + \frac{1}{I \cdot \mathbb{1}_{[0,1]}(I)}\right),$$
    where $I$ is the smoothness coefficient (Moran’s I) and $\xi > 0$ is such that $\overline{D(\theta)} + \xi > 0$ for all compared models.
  - Multiplicative penalization ⟹ the hierarchy defined by the goodness-of-fit measure $\overline{D(\theta)}$ is modified only if the models have a very poor smoothness coefficient
  - The parsimony idea is contained in I
  - Easy to implement
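The two proposals can be sketched in a few lines of Python. This is an interpretation under stated assumptions, not the authors' implementation: the indicator is read as giving an infinite penalty when Moran's I falls outside (0, 1], the filtering thresholds are hypothetical, and the three candidate models with their deviance, DIC and I values are invented for the example.

```python
import math


def sdc(mean_deviance, moran_i, xi):
    """Smoothness Deviance Criterion: (mean deviance + xi) * (1 + 1 / I)."""
    shifted = mean_deviance + xi
    if xi <= 0 or shifted <= 0:
        raise ValueError("xi must be positive and make the shifted deviance positive")
    # Assumed reading of the indicator: I outside (0, 1] gives an infinite penalty.
    if not 0.0 < moran_i <= 1.0:
        return math.inf
    return shifted * (1.0 + 1.0 / moran_i)


def couple(models, keep, score):
    """Criteria coupling C(C1, C2): filter with C1 (keep), then minimize C2 (score)."""
    kept = [m for m in models if keep(m)]
    return min(kept, key=score) if kept else None


# Invented candidates: mean deviance, DIC and Moran's I of the fitted risk map.
models = [
    {"name": "A", "dev": 100.0, "dic": 120.0, "I": 0.90},
    {"name": "B", "dev": 90.0, "dic": 95.0, "I": 0.05},
    {"name": "C", "dev": 110.0, "dic": 100.0, "I": -0.10},
]

best_sdc = min(models, key=lambda m: sdc(m["dev"], m["I"], xi=1.0))
# C(Moran's I, DIC): drop rough maps (hypothetical threshold), then take the smallest DIC.
best_i_dic = couple(models, keep=lambda m: m["I"] >= 0.3, score=lambda m: m["dic"])
# C(DIC, Moran's I): drop large DIC values (hypothetical threshold), then take the largest I.
best_dic_i = couple(models, keep=lambda m: m["dic"] <= 110.0, score=lambda m: -m["I"])
print(best_sdc["name"], best_i_dic["name"], best_dic_i["name"])
```

On these invented numbers the two coupling orders select different models, which illustrates the "order matters" remark.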
Application of the new methods on the simulated data
For both scenarios, the model selected via the coupling C(Moran’s I, DIC) and that selected via the SDC are the same.
• For scenario 1 (data generated by BN): model (BN, V).
[Figure: yearly maps, 2001 to 2010, for model (BN, V)]
• For scenario 2 (data generated by P): model (P, U).
[Figure: yearly maps, 2001 to 2010, for model (P, U)]
The DIC for the real data
Boxplots of the values of the DIC:
[Figure: boxplots of the DIC for the P models and for the BN models]
- Smallest DIC values obtained for several models (all of them P models). Example: (P, U, T, V).
[Figure: yearly maps, 2001 to 2010, for model (P, U, T, V); legend range 0.04 to 852.38]
Application of the new methods on the real data
The model selected via the coupling C(DIC, Moran’s I) and that selected via the SDC are different, but very similar.
• C(DIC, Moran’s I): model (BN, U, V).
[Figure: yearly maps, 2001 to 2010, for model (BN, U, V); legend range 0.10 to 186.60]
• SDC: model (BN, U).
[Figure: yearly maps, 2001 to 2010, for model (BN, U); legend range 0.11 to 177.69]
Conclusion and perspectives
Summary:
• Weaknesses of the DIC specific to disease mapping:
  • lack of smoothness of the risk maps
  • identical ranking of models producing very different risk maps
  • too much divergence between the DIC and the accuracy measures
• The systematic application of the DIC in disease mapping is not justified.
• Modifying the DIC to take smoothness into account (a crucial point in risk mapping), through Criteria Coupling and the SDC, shows promising results.
Some work to follow:
• confirm the utility of the proposed methods on different data sets and simulation scenarios
• investigate the properties of the proposed methods (robustness)

Thank you!