DIC and other model selection criteria in risk mapping - Ioana MOLNAR

DIC and other model selection criteria in risk mapping

Ioana Molnar (1), in collaboration with Sylvain Coly (1,2), Myriam Charras-Garrido (1), David Abrial (1) and Anne-Françoise Yao-Lafourcade (2)

1: Unité d'Épidémiologie Animale (EPIA), Centre INRA Auvergne – Rhône-Alpes
2: Laboratoire de Mathématiques, Université Blaise Pascal, Clermont-Ferrand

48èmes Journées de Statistique, 30 May – 3 June 2016

Montpellier, 31 May 2016



Disease risk mapping
→ a tool used in spatial statistics for the analysis of the risk underlying the observed incidence of a disease.
• Risk estimation: Bayesian methods
• A priori information:
  • case data
  • population data
  • parametric distribution of cases
  • risk structure
• Risk representation: smooth maps

[Figure: three maps, with color scales 0 to 22, 1 to 4147 and 0.11 to 131.00]


Model selection in risk mapping
• Properties of a good model?
  • fits the data well
  • is parsimonious
  • performs data smoothing
  (these properties are frequently related)
  • has good explanatory (and predictive) power
• Properties of a good model selection criterion?
  • identifies the best models
    • filters out the unsuitable models
    • ranks the pertinent ones
  • is easy to implement and easy to use
  • is robust

The DIC in risk mapping
DIC = Deviance Information Criterion (Spiegelhalter et al., 2002)
→ based on the Bayesian deviance $D(\theta) := -2 \ln p(Y \mid \theta)$
→ measures the inadequacy penalized by the complexity:
  • inadequacy: $\overline{D(\theta)}$, the posterior mean of the deviance
  • complexity: $p_D := \overline{D(\theta)} - D(\bar{\theta})$, the effective # of parameters
Definition: $\mathrm{DIC} := 2\,\overline{D(\theta)} - D(\bar{\theta})$.

Advantages
• No need to compute the number of parameters
• Easy computation using MCMC samples
• Lack of a strong competitor

Shortcomings
• $p_D$ sometimes negative
• Lacks invariance to re-parametrization, and others
• Favors overfitted models
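As an illustration of the "easy computation using MCMC samples" point, here is a minimal sketch of how the DIC and $p_D$ can be computed from posterior draws. It assumes independent Poisson counts and posterior draws of the Poisson means; the function names and the array layout are hypothetical choices made for this example, not the code used in the study.

```python
import math
import numpy as np


def poisson_deviance(y, lam):
    """Bayesian deviance D(theta) = -2 ln p(y | lam) for independent Poisson counts."""
    y = np.asarray(y, dtype=float)
    lam = np.asarray(lam, dtype=float)
    log_factorial = np.array([math.lgamma(v + 1.0) for v in y])
    log_lik = np.sum(y * np.log(lam) - lam - log_factorial)
    return -2.0 * log_lik


def dic_from_samples(y, lam_samples):
    """Compute the DIC from posterior draws (assumed layout: one row per draw)."""
    lam_samples = np.asarray(lam_samples, dtype=float)
    # Posterior mean of the deviance (the "inadequacy" term).
    d_bar = float(np.mean([poisson_deviance(y, lam) for lam in lam_samples]))
    # Deviance evaluated at the posterior mean of the parameters.
    d_at_mean = poisson_deviance(y, lam_samples.mean(axis=0))
    p_d = d_bar - d_at_mean  # effective number of parameters
    return {"D_bar": d_bar, "D_at_mean": d_at_mean, "pD": p_d, "DIC": d_bar + p_d}
```

In this mean parametrization the Poisson deviance is convex in the means, so $p_D$ cannot be negative here; negative values arise when the deviance at the posterior mean exceeds the mean deviance, which can happen with other parametrizations or hierarchical structures.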



Context of study of the DIC: the data
Two types of data:
→ real data of bovine tuberculosis in France: overdispersed
→ simulated data from a known risk: by BN

[Figure: two series of yearly maps, 2001 to 2010, with color scales 0.00 to 22.00 and 0.78 to 13.00]

Goal: select the models for which the risk maps:
→ fit the data
→ are smooth enough (consistency with Moran's I?)
→ show well-delimited structures
→ are close to the real underlying risk (consistency with MSE / Spearman's ρ?)


Context of study of the DIC: the models
→ spatio-temporal, three-level hierarchical Bayesian models

Data: $Y_i^j$ = # of cases in region $i$ and period $j$, with $i = 1, \dots, n := 448$ and $j = 1, \dots, m := 10$.

Modeling:
• 1st level – the distribution of the cases:

  $$Y_i^j \sim \mathcal{P}(\lambda_i^j) \quad \text{or} \quad \begin{cases} Y_i^j \sim \mathcal{P}(\lambda_i^j) \\ \lambda_i^j \sim \gamma(\alpha_i^j, \beta_i^j) \end{cases}$$

  Parameter of interest: $\lambda_i^j \approx$ relative risk $R_i^j$.

• 2nd level – relative risk structure:

  $$\ln R_i^j = \underbrace{a \cdot U_i^j}_{\text{spatial component}} + \underbrace{b \cdot T_i^j}_{\text{temporal component}} + \underbrace{c \cdot V_i^j}_{\text{spatio-temporal component}} + \underbrace{d \cdot \varepsilon_i^j}_{\text{white noise}}$$

  where
  → $U_i^j, T_i^j, V_i^j$: CAR-type processes
  → $a, b, c, d$: weights, with $a, b, c \in \{0, 1\}$ or $a, b, c \sim \gamma(5, 5)$, and $d \in \{0, 1\}$.

• 3rd level – specification of the precision parameters of the normal distributions of $\varepsilon_i^j, U_i^j, T_i^j, V_i^j$: $\gamma(0.01, 0.01)$.

[Figure (overlay): neighbourhood diagrams for the components U, T and V, showing neighbouring regions $i_{k_1}, \dots, i_{k_6}$ and neighbouring periods $j - 1$, $j + 1$]
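The first two levels of this hierarchy can be sketched in a few lines. The example below is only an illustration under stated assumptions: the random effects are passed in as ready-made arrays (no CAR process is simulated), and for the Poisson-gamma case the gamma distribution is given a hypothetical shape `shape` and rate `shape / R`, so that the mean of $\lambda_i^j$ equals $R_i^j$. The slides do not specify how $\alpha_i^j$ and $\beta_i^j$ are tied to the risk, so that link is an assumption of this sketch.

```python
import numpy as np


def log_relative_risk(U, T, V, eps, a=1.0, b=1.0, c=1.0, d=1):
    """2nd level: ln R = a*U + b*T + c*V + d*eps (arrays of shape (n, m))."""
    return a * U + b * T + c * V + d * eps


def simulate_counts(log_risk, family="P", shape=2.0, rng=None):
    """1st level: draw counts from a Poisson (P) or Poisson-gamma (BN) model."""
    rng = np.random.default_rng() if rng is None else rng
    risk = np.exp(np.asarray(log_risk, dtype=float))
    if family == "P":
        lam = risk
    elif family == "BN":
        # Assumed link: gamma with shape `shape` and scale risk/shape, mean = risk.
        lam = rng.gamma(shape, risk / shape)
    else:
        raise ValueError("family must be 'P' or 'BN'")
    return rng.poisson(lam)
```

Setting a weight to 0 drops the corresponding component, which is how the different candidate models of the study can be enumerated from a single formula.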



Estimation of the parameters

$Y_i^j \sim \mathcal{P}$ or $Y_i^j \sim \mathcal{BN}$, with $\ln R_i^j = a U_i^j + b T_i^j + c V_i^j + d \varepsilon_i^j$

# of models: 60, depending on the choice of
→ the 1st level distribution: P (Poisson) or BN (Poisson-gamma, i.e. negative binomial);
→ the random effects included: aU, bT, cV and/or ε.

# of data sets: 1 (real data); 100 (simulated data).
# of estimation replications: 100 (real data); 1 (simulated data).
# of parameters: up to ≈ 22 400, depending on the model.

Estimation: by MCMC under OpenBUGS.

BUGS parameters:
• random seed: from 1 to 14 (real data)
• burn-in step: from 10 000 to 80 000 (real data)
• thinning step: 10
• sample length: 10 000.

[Figure: directed graph of the model, with nodes τ1, τ2, τ3, τ4, U, T, V, ε, a, b, c, α, β, λ (P), λ (BN) and Y]
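To make the MCMC settings concrete, the following sketch shows how a burn-in, a thinning step and a sample length relate to the total number of iterations. It assumes that the burn-in is counted in raw iterations and that thinning is applied afterwards; OpenBUGS may count these differently depending on how the run is configured, so the helper names and the convention are assumptions of this example.

```python
import numpy as np


def required_iterations(burn_in, thin, sample_length):
    """Raw iterations needed to retain `sample_length` draws after burn-in and thinning."""
    return burn_in + thin * sample_length


def post_process(chain, burn_in, thin):
    """Discard the first `burn_in` draws, then keep one draw out of every `thin`."""
    return np.asarray(chain)[burn_in::thin]
```

Under this convention, a burn-in of 10 000 with a thinning step of 10 and a sample length of 10 000 corresponds to 110 000 raw iterations per chain, and a burn-in of 80 000 to 180 000.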

The DIC for the simulated data
Boxplots of the values of the DIC and of the effective number of parameters $p_D$:

[Figure: boxplots of the DIC and of $p_D$ for scenario 1 (sce. 1), one box per model]

Comments:
→ Identical values for 16 of the models (those with neither weights nor noise).

[Figure: yearly risk maps, 2001 to 2010, for the models (P, U), (P, U, T, V) and (BN, U, V); in the final overlay the models are marked ∗ (mean over replicates)]

→ Negative values for $p_D$ for several models, null values for many others.

Other classical comparison methods
• Accuracy measures (MSE related)
• Spearman's rank correlation coefficient ρ
  Both of the above:
  → are applied on the relative risk (not the count data)
  → may be used only in simulation studies
  → measure the adequacy of the model
  → miss the notion of parsimony
• Spatio-temporal association indicators (Moran's I, Geary's c)
  → standard spatial association indices, extended to the spatio-temporal context
  → may be used on real data
  → evaluate the smoothness of the risk maps
  → lack the idea of adjustment

∗ Naturally, they won't favor the same models as the DIC (they have different purposes).
∗ However, we can use them to evaluate some aspects of the DIC's performance in choosing "good" models.
∗ They can be used to set up a better selection tool.
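The three families of indicators listed above can be sketched as follows. This is a minimal illustration using the standard purely spatial form of Moran's I with a user-supplied weight matrix; the spatio-temporal extension used in the study is not detailed on the slides, so the weight matrix and the function names here are assumptions.

```python
import numpy as np


def mse(estimated, true):
    """Mean squared error between estimated and true relative risks."""
    e = np.asarray(estimated, dtype=float)
    t = np.asarray(true, dtype=float)
    return float(np.mean((e - t) ** 2))


def _average_ranks(x):
    """Ranks starting at 1, with tied values sharing their average rank."""
    x = np.asarray(x, dtype=float).ravel()
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x))
    ranks[order] = np.arange(1, len(x) + 1)
    for value in np.unique(x):
        tied = x == value
        ranks[tied] = ranks[tied].mean()
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return float(np.corrcoef(_average_ranks(x), _average_ranks(y))[0, 1])


def morans_i(values, weights):
    """Moran's I = (n / S0) * (z' W z) / (z' z), with z the centered values."""
    x = np.asarray(values, dtype=float).ravel()
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()
    return float(len(x) / w.sum() * (z @ w @ z) / (z @ z))
```

Values of Moran's I close to 1 indicate that neighbouring units carry similar risks (a smooth map), while negative values indicate that neighbours tend to differ.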

The DIC vs. Spearman's ρ (simulated data)
Spearman's rho:
→ used to evaluate the performance of the DIC (even if the two have different purposes)

[Figure: Spearman's coefficient per model, for scenario 1 (sce. 1) and scenario 2 (sce. 2). The models with the smallest and second smallest DIC values are highlighted. Reference levels: strong correlation; significantly non-null. Model components shown: U, T, V, ε]

∗ Models with good DIC values show highly variable ρ's.

Proposition of new selection methods
• Criteria coupling: $C(C_1, C_2)$
  → Filters out models that are "bad" according to $C_1$, and selects one good model according to $C_2$
  → Order matters
  → Easy to apply, but needs some post-hoc analysis

• A new deviance criterion: Smoothness Deviance Criterion (SDC)
  → Replaces the penalization by the complexity measure $p_D$ with a penalization for the lack of smoothness:

  $$\mathrm{SDC} = \left( \overline{D(\theta)} + \xi \right) \left( 1 + \frac{1}{I \cdot \mathbb{1}_{[0,1]}(I)} \right),$$

  where $\xi > 0$ s.t. $\overline{D(\theta)} + \xi > 0$ for all compared models.
  → Multiplicative penalization ⟹ the hierarchy defined by the goodness-of-fit measure $\overline{D(\theta)}$ is modified only if the models have a very poor smoothness coefficient
  → The parsimony idea is contained in $I$
  → Easy to implement
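Both proposals can be sketched in a few lines. The example below reads $I$ as Moran's I and treats the case where the indicator vanishes (or $I = 0$) as an infinite penalty. The filtering rule used in the coupling (a user-supplied predicate) is a hypothetical choice, since the slides do not state how "bad" models are identified.

```python
import math


def sdc(mean_deviance, morans_i, xi):
    """SDC = (mean deviance + xi) * (1 + 1 / (I * 1_[0,1](I)))."""
    if xi <= 0 or mean_deviance + xi <= 0:
        raise ValueError("xi must be positive and make mean_deviance + xi positive")
    if not (0.0 < morans_i <= 1.0):
        return math.inf  # zero denominator: the model is effectively excluded
    return (mean_deviance + xi) * (1.0 + 1.0 / morans_i)


def couple(models, is_acceptable, score):
    """Criteria coupling C(C1, C2): filter with C1, then pick the best model by C2.

    models: dict mapping a model name to its criteria values.
    is_acceptable: predicate implementing the C1 filter (hypothetical rule).
    score: function returning the C2 value to minimize.
    """
    kept = {name: crit for name, crit in models.items() if is_acceptable(crit)}
    if not kept:
        raise ValueError("no model passes the first criterion")
    return min(kept, key=lambda name: score(kept[name]))
```

For a coupling such as C(Moran's I, DIC), `is_acceptable` would test the smoothness of the map and `score` would return the DIC; swapping the two roles gives C(DIC, Moran's I), which is why the order matters.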

Application of the new methods on the simulated data
For both scenarios, the model selected via the coupling C(Moran's I, DIC) and that selected via the SDC are the same.
• For scenario 1 (data generated by BN): model (BN, V).
• For scenario 2 (data generated by P): model (P, U).

[Figure: yearly risk maps, 2001 to 2010, of the selected model for each scenario (color scale 1.00 to 4480.00)]

The DIC for the real data
Boxplots of the values of the DIC:

[Figure: boxplots of the DIC for the P models and for the BN models]

→ Smallest DIC values obtained for several models (all of them P models). Example: (P, U, T, V).

[Figure: yearly risk maps, 2001 to 2010, for the model (P, U, T, V) (color scale 0.04 to 852.38)]



Application of the new methods on the real data
The model selected via the coupling C(DIC, Moran's I) and that selected via the SDC are different, but very similar.
• C(DIC, Moran's I): model (BN, U, V).
  [Figure: yearly risk maps, 2001 to 2010 (color scale 0.10 to 186.60)]
• SDC: model (BN, U).
  [Figure: yearly risk maps, 2001 to 2010 (color scale 0.11 to 177.69)]

Conclusion and perspectives
Summary:
• Weaknesses of the DIC specific to disease mapping:
  • lack of smoothness of the risk maps
  • identical ranking of models producing very different risk maps
  • too much divergence between the DIC and the accuracy measures
• The systematic application of the DIC in disease mapping is not justified.
• The modification of the DIC to take into account the smoothness (crucial point in risk mapping) through Criteria Coupling and SDC shows promising results.

Some work to follow:
• confirm the utility of the proposed models on different data sets and simulation scenarios
• investigate the properties of the proposed method (robustness)

Thank you!