hSDM
AMAP – Montpellier – January 2012
hierarchical Species Distribution Models (hSDM)
Ghislain Vieilledent [1], Andrew Latimer [2], John Silander Jr. [3]
[1] Cirad BSEF, [2] UC Davis, [3] University of Connecticut
1. Introduction: Species distribution models; Statistical models
2. Improving SDM: Spatial autocorrelation; Observation errors
3. hierarchical SDM: Data; Three different statistical models; Model comparison
4. the hSDM R package
hSDM Introduction Species distribution models
Definition
Objectives
  Identifying the suitable habitat for species persistence
  Representing this habitat spatially
Reference: species niche (Hutchinson 1957)
Definition
Terminology
  Species distribution models (SDM)
  Niche models
  Habitat suitability models
Result of the modelling approach
  Neither the realized nor the fundamental niche
  Suitable habitat given the environmental factors of the model
Applications
Conservation biology
  Detecting unexplored areas for rare species
  Identifying priority protected areas
Ecology
  Studying the impact of climate change on biodiversity
  Assessing invasive species risk
Evolution
  Paleodistribution
  Exploring speciation mechanisms
Uroplatus sp. (Pearson et al. 2007)
Example
Conservation biology
  Baobab vulnerability to climate change
  Adapting the protected area network
hSDM Introduction Statistical models
Available algorithms
Profile techniques:
  BIOCLIM
  Ecological Niche Factor Analysis (ENFA)
Regression-based techniques:
  Generalized Linear Model (GLM)
  Generalized Additive Model (GAM)
  Multivariate Adaptive Regression Splines (MARS)
Machine learning techniques:
  Maximum Entropy (MAXENT)
  Artificial Neural Networks (ANN)
  Genetic Algorithm for Rule Set Production (GARP)
  Random Forest (RF)
  ...
Algorithm characteristics

Algorithm                      BIOCLIM   ENFA         GLM   GARP   MAXENT
Accept absence data            No        Background   Yes   Yes    Background
Accept categorical variables   No        No           Yes   No     Yes
SDM can be greatly improved
  ensemble forecasting
  spatial autocorrelation
  biotic interactions
  observation error: false absence
hSDM Improving SDM Spatial autocorrelation
Importance of spatial autocorrelation
Lichstein et al. 2002, Ecological Monographs : “ignoring space may lead to false conclusions about ecological relationships”
Importance of spatial autocorrelation
Data
  breeding bird survey for 3 species
  managed forests
  southern Appalachian Mountains, USA
  1177 sample points i (1997–1999)
  $\sqrt{\mathrm{count}_i} = \mathrm{Factors}_i \beta + \varepsilon_i$
Dendroica pensylvanica
Importance of spatial autocorrelation
Models
OLS:
  $\sqrt{\mathrm{count}_i} = Y_i = \mathrm{Factors}_i \beta + \varepsilon_i$, $\varepsilon_i \sim \mathrm{Normal}(0, \sigma^2)$
CAR:
  $\varepsilon \sim \mathrm{MVNormal}(0, V)$
  $V = (I - \rho W)^{-1} M$
  $\rho$: direction and magnitude of the spatial neighborhood effect
  $W$: matrix of weights $w_{ij} = 1/\mathrm{distance}_{ij}$
  $M$: diagonal matrix with variance $\sigma^2$
  $E(Y_i \mid \text{all } Y_j) = \mu_i + \rho \sum_j w_{ij} (Y_j - \mu_j)$
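The CAR conditional expectation above can be illustrated with a minimal Python sketch. It only evaluates E(Y_i | all Y_j) for given values; it does not fit the model. The function names, the toy coordinates and the values of y, mu and rho are hypothetical and are not taken from Lichstein et al. 2002.

```python
import math


def inverse_distance_weights(points):
    """Build W with w_ij = 1 / distance_ij and zeros on the diagonal."""
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                weights[i][j] = 1.0 / math.dist(points[i], points[j])
    return weights


def car_conditional_mean(i, y, mu, weights, rho):
    """Return mu_i + rho * sum_j w_ij * (y_j - mu_j), summing over j != i."""
    neighbour_term = sum(
        weights[i][j] * (y[j] - mu[j]) for j in range(len(y)) if j != i
    )
    return mu[i] + rho * neighbour_term


# Hypothetical toy data: three sample points on a line, one unit apart.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
y = [2.0, 1.0, 3.0]   # made-up sqrt(count) observations
mu = [1.5, 1.5, 1.5]  # made-up fitted means (Factors_i * beta)
W = inverse_distance_weights(points)

print(car_conditional_mean(1, y, mu, W, rho=0.2))
```

With a positive rho, a point whose neighbours lie above their own means is pulled upwards, which is the spatial neighborhood effect that OLS ignores.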
Importance of spatial autocorrelation
Results
  OLS: non-normal residuals
  $R^2_{\mathrm{CAR}} > R^2_{\mathrm{OLS}}$
  effects of many habitat factors were reduced with CAR
Importance of spatial autocorrelation
First scientific gap
  Lichstein: Gaussian model
  Most of the time for SDM: Binomial model with 0 and 1
  None of the SDM algorithms listed above handles spatial autocorrelation
hSDM Improving SDM Observation errors
Observation error: absence
Absences
  False absences:
    1. Species present but not detected
    2. Suitable habitat but species is not yet, or no longer, present
  True absences: habitat is actually not suitable
  Pseudo-absences: we assume that the species is absent (can lead to false absences)
Observation error: absence
Case 1: Presence-only data
  Ex: herbarium data
  Algorithms using presence-only data: BIOCLIM, ENFA, MAXENT
  Pseudo-absences + algorithms using presence-absence data: GLM, GAM, GARP
Case 2: Presence-absence data
  Ex: phytosociological sampling
  True absences are very informative
  Algorithms using presence-absence data: GLM, GAM, GARP
  Risks of false absences
Observation error: absence
Second scientific gap
  For presence-absence data or presence-(pseudo-)absence data, none of the algorithms listed above handles the risk of false absences.
hSDM hierarchical SDM Data
Latimer et al. 2006
Data
  Presence-absence data
  Explanatory variables: temperature, precipitation, elevation, soil fertility, etc.
  Species shown: Protea mundii, Protea punctata
hSDM hierarchical SDM Three different statistical models
A simple GLM
  $Y_i^+ = \sum_{s \in \mathrm{cell}_i} Y(s)$
  $Y_i^+ \sim \mathrm{Binomial}(n_i, p_i)$
  $\mathrm{logit}(p_i) = x_i' \beta$
Likelihood:
  $p(Y_i^+ = y) = \binom{n_i}{y} p_i^{y} (1 - p_i)^{n_i - y}$
Can be fitted with the classical glm() function in R
Here, Bayesian estimation with non-informative priors + ARMS
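As an illustration of the binomial likelihood with a logit link, here is a minimal Python sketch that evaluates the log-likelihood for a given coefficient vector. It is not the estimation procedure used in the talk (glm() or Bayesian sampling with ARMS); the function names and the toy data are hypothetical.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link: p = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def binomial_log_likelihood(beta, x, y, n):
    """Sum over cells of log[ C(n_i, y_i) * p_i^y_i * (1 - p_i)^(n_i - y_i) ],
    with logit(p_i) = x_i' beta."""
    total = 0.0
    for x_i, y_i, n_i in zip(x, y, n):
        eta = sum(b * v for b, v in zip(beta, x_i))
        p_i = inv_logit(eta)
        total += (
            math.log(math.comb(n_i, y_i))
            + y_i * math.log(p_i)
            + (n_i - y_i) * math.log(1.0 - p_i)
        )
    return total


# Hypothetical toy data: intercept plus one covariate, three cells.
x = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]
y = [0, 2, 4]  # number of presences per cell (made up)
n = [5, 5, 5]  # number of sites visited per cell (made up)

print(binomial_log_likelihood([0.0, 0.0], x, y, n))
print(binomial_log_likelihood([-0.5, 1.0], x, y, n))
```

A coefficient vector that tracks the increase of presences with the covariate should give a higher log-likelihood than the null vector.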
A model with spatial autocorrelation: model 2
  $Y_i^+ \sim \mathrm{Binomial}(n_i, p_i)$
  $\mathrm{logit}(p_i) = x_i' \beta + \rho_i$
  $p(\rho_i \mid \rho_j) = N\left( \frac{\sum_j \rho_j}{a_{i+}}, \frac{\sigma_\rho^2}{a_{i+}} \right)$
Hierarchical model: +1 level for spatial random effects
Can be fitted with WinBUGS if the number of cells is small
Hierarchical Bayesian estimation with non-informative priors + ARMS
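The conditional distribution of the spatial random effect can be sketched in a few lines of Python. The sketch assumes that the sum runs over the neighbours of cell i and that a_{i+} is the number of those neighbours; the slide does not state this explicitly. The neighbourhood structure and the values below are hypothetical.

```python
def car_conditional(i, rho, neighbours, sigma2_rho):
    """Return (mean, variance) of rho_i given the neighbouring rho_j.

    Assumption: a_i+ is the number of neighbours of cell i, so that
    mean = sum_j rho_j / a_i+ and variance = sigma2_rho / a_i+.
    """
    cell_neighbours = neighbours[i]
    a_i_plus = len(cell_neighbours)
    mean = sum(rho[j] for j in cell_neighbours) / a_i_plus
    variance = sigma2_rho / a_i_plus
    return mean, variance


# Hypothetical example: four cells in a row, each linked to adjacent cells.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
rho = [0.5, -0.2, 0.1, 0.4]  # made-up current values of the random effects

print(car_conditional(1, rho, neighbours, sigma2_rho=1.0))
```

Under this reading, the conditional variance shrinks for cells with more neighbours, because their random effect is informed by more surrounding cells.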
A hierarchical spatially explicit model: model 4
Suitability process
  $U_i$: proportion of transformed area in cell i (known)
  $p_i$: probability that the habitat is suitable in cell i
  $p(S_i = 1) = (1 - U_i) p_i$
  $\mathrm{logit}(p_i) = x_i' \beta + \rho_i$
  $p(\rho_i \mid \rho_j) = N\left( \frac{\sum_j \rho_j}{a_{i+}}, \frac{\sigma_\rho^2}{a_{i+}} \right)$
Observability process
  $q_i$: probability that the species is observed in cell i given that the habitat is suitable
  $p(Y_{ij} = 1 \mid S_i = 1) = q_i$
  $\mathrm{logit}(q_i) = w_i' \gamma$
Likelihood:
  $y > 0$: $p(Y_i^+ = y) = \binom{n_i}{y} q_i^{y} (1 - q_i)^{n_i - y} (1 - U_i) p_i$
  $y = 0$: $p(Y_i^+ = 0) = (1 - q_i)^{n_i} (1 - U_i) p_i + [1 - (1 - U_i) p_i]$
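The two branches of the model 4 likelihood can be checked numerically with a minimal Python sketch. It evaluates the likelihood for one cell with given p_i, q_i and U_i, and leaves out the regression and spatial levels. The function name and the numeric values are hypothetical.

```python
import math


def model4_likelihood(y, n, p, q, u):
    """Probability of y detections in n sites of a cell.

    p: probability that the habitat is suitable
    q: probability of observing the species given suitable habitat
    u: known proportion of transformed area in the cell
    """
    suitable = (1.0 - u) * p  # p(S_i = 1)
    if y > 0:
        # Detections are only possible when the habitat is suitable.
        return math.comb(n, y) * q**y * (1.0 - q) ** (n - y) * suitable
    # y = 0: suitable but never detected, or not suitable at all.
    return (1.0 - q) ** n * suitable + (1.0 - suitable)


# Made-up values for one cell.
n, p, q, u = 3, 0.8, 0.5, 0.25
probabilities = [model4_likelihood(y, n, p, q, u) for y in range(n + 1)]
print(probabilities)
print(sum(probabilities))
```

The probabilities over y = 0, ..., n sum to one, and the y = 0 term mixes false absences (suitable habitat, no detection) with true absences (unsuitable habitat).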
hSDM hierarchical SDM Model comparison
Parameter significance (figure: Protea mundii)
Probability of presence (figure: Protea punctata)
Model performance
  Area Under the ROC Curve (AUC): larger AUC reflects a better model
  Minimum Predicted Area (MPA): smaller MPA reflects a better model
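For reference, the AUC can be computed directly from predicted probabilities and observed presences or absences. The Python sketch below uses the pairwise definition (the probability that a randomly chosen presence receives a higher prediction than a randomly chosen absence, with ties counted as one half). It is a generic illustration, not the code used for the comparison in the talk, and the data are made up.

```python
def auc(scores, labels):
    """Area under the ROC curve from the pairwise comparison definition."""
    presences = [s for s, lab in zip(scores, labels) if lab == 1]
    absences = [s for s, lab in zip(scores, labels) if lab == 0]
    if not presences or not absences:
        raise ValueError("AUC needs at least one presence and one absence")
    wins = 0.0
    for s_pres in presences:
        for s_abs in absences:
            if s_pres > s_abs:
                wins += 1.0
            elif s_pres == s_abs:
                wins += 0.5  # ties count as half a win
    return wins / (len(presences) * len(absences))


# Made-up predictions and observations for four cells.
predicted = [0.9, 0.8, 0.3, 0.1]
observed = [1, 0, 1, 0]
print(auc(predicted, observed))
```

An AUC of 0.5 corresponds to predictions that rank presences and absences no better than chance, and 1.0 to a perfect ranking.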
hSDM the hSDM R package
hSDM R package
Characteristics
  Functions to estimate models 1, 2 and 4
  Adaptive Rejection Metropolis Sampling (ARMS)
  Algorithm developed in C code for speed and treatment of large data sets
Availability
  source code, examples and manual
  http://ghislain.vieilledent.free.fr/?page_id=454
. . . Thank you for your attention . . .