Parsimonious Gaussian process models for the spectral-spatial classification of high dimensional remote sensing images STATLEARN 2016
M. Fauvel¹, C. Bouveyron² and S. Girard³
¹ UMR 1201 DYNAFOR, INRA & Institut National Polytechnique de Toulouse
² Laboratoire MAP5, UMR CNRS 8145, Université Paris Descartes & Sorbonne Paris Cité
³ Equipe MISTIS, INRIA Grenoble Rhône-Alpes & LJK
Outline
1. Introduction
   - High dimensional remote sensing images
   - Classification of hyperspectral/hypertemporal images
   - Parametric kernel method for spectral-spatial classification
2. Parsimonious Gaussian process models
   - Gaussian process in the feature space
   - Parsimonious Gaussian process
   - Model inference
   - Link with existing models
3. Experimental results
   - Hyperspectral data
   - Hypertemporal data
4. Conclusions and perspectives
High dimensional remote sensing images
Nature of remote sensing images
A remote sensing image is a sampling of a spatial, spectral and temporal process.
[Figure: a reflectance spectrum (0–1) sampled over wavelengths 500–900 nm; the animation also shows sampling across the months of the year.]
Hyperspectral
High number of spectral measurements and one temporal measurement: x ∈ ℝ^{180}.
[Figure: reflectance spectrum x(λ) versus wavelength, 450–950 nm.]
Hypertemporal image
High number of temporal measurements for a few spectral measurements: x ∈ ℝ^{4×46}.
[Figure: temporal profiles x_b(t), x_g(t), x_r(t), x_ir(t) (numerical counts, 0–500) versus day of the year, 40–340.]
Some numbers

Hyperspectral sensors:

  Instrument   Range (nm)   # Bands   Spatial resolution (m)
  AVIRIS       400-2500       224          1-4
  ROSIS-03     400-900        115          1
  Hyspec       400-2500       427          1
  HyMAP        400-2500       126          5
  CASI         380-1050       288          1-2

Hypertemporal missions:

  Mission      Revisit time (days)   # Bands   Spatial resolution (m)
  Sentinel-2            5               13          10-20-60
  MODIS                 1                7          500
  Landsat-8            16                8          30
Thematic applications
- Land cover/use at the national scale
- Monitoring ecosystems & biodiversity
- Disaster management
Classification of hyperspectral/hypertemporal images
Image classification in high dimensional space
- High number of measurements but limited number of training samples: d ≈ n.
- Curse of dimensionality: statistical, geometrical and computational issues. Conventional methods fail [Jimenez and Landgrebe, 1998].
- Pixelwise classification is not suitable [Fauvel et al., 2013]: it is invariant to pixel locations and sensitive to mixels!
[Illustration: spectral classification applied before and after a random permutation of pixel locations gives the same results.]
Need to incorporate spatial information in the classification process: additional complexity.
Spatial-spectral classification in remote sensing

Extract spatial features using image processing filters:
- Morphological profile, adaptive neighborhood [Fauvel et al., 2013]
- Local statistical moments [Camps-Valls et al., 2006]
- Wavelets [Mercier and Girard-Ardhuin, 2006]
- Texture, Gabor, ...

Then combine the spectral and the spatial information:
- Stacked vector of all variables, or of a few extracted variables (PCA, ...), & kernel classifier
- Fusion of separate classifiers
- Composite kernels [Camps-Valls et al., 2006]

✓ Easy to implement.
✓ Easy to plug in different spatial features.
✗ Additional variables: statistical & computational issues.
✗ Most image processing tools are defined for ℝ¹-valued pixels.
[Figure: spatial features extracted from the original image: top-hat, local standard deviation, Gabor features.]
Model the spatial dependencies

Markov Random Field: the local neighborhood is used in the decision, p(y_i = c | x_i, N_i).
[Figure: first-order and second-order neighborhoods N_i of a pixel y_i.]
✓ Allows a fine modelization of inter-pixel dependencies.
✗ Requires estimation of the spectral energy term (through a GMM).

Mean kernel [Gurram and Kwon, 2013]:

K(x_i, x_j) = γ^{-2} Σ_{m ∈ N_i, n ∈ N_j} k(x_m, x_n)

✓ Good performance in terms of classification accuracy.
✗ Rough modelization of the inter-pixel dependencies.
✗ Computing time.
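The mean kernel above can be sketched in a few lines of Python. This is a toy illustration, not code from the cited work: the base kernel is assumed Gaussian, neighborhoods are given as index lists, and the γ^{-2} factor is interpreted as the neighborhood-size normalization (which it equals when all neighborhoods have the same size γ).

```python
import numpy as np

def gaussian_kernel(a, b, gamma=1.0):
    """Base kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    return float(np.exp(-gamma * np.sum((a - b) ** 2)))

def mean_kernel(X, Ni, Nj, gamma=1.0):
    """Average of k(x_m, x_n) over all pairs (m, n) drawn from the
    neighborhoods Ni and Nj of pixels i and j."""
    s = 0.0
    for m in Ni:
        for n in Nj:
            s += gaussian_kernel(X[m], X[n], gamma)
    return s / (len(Ni) * len(Nj))

# Toy usage: 5 pixels with 3 spectral bands, 3-pixel neighborhoods.
X = np.random.RandomState(0).rand(5, 3)
print(mean_kernel(X, [0, 1, 2], [2, 3, 4]))
```

Note the quadratic cost in the neighborhood sizes, which is the "computing time" drawback mentioned above.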
Parametric kernel method for spectral-spatial classification
SVM & MRF

Maximum a posteriori: max_Y P(Y | X).
When Y is an MRF: P(Y | X) ∝ exp(−U(Y | X)), where U(Y | X) = Σ_{i=1}^{n} U(y_i | x_i, N_i) with

U(y_i | x_i, N_i) = Ω(x_i, y_i) + ρ E(y_i, N_i)

Spectral term: Ω(x_i, y_i) = − log[p(x_i | y_i)]
- SVM outputs [Farag et al., 2005, Tarabalka et al., 2010, Moser and Serpico, 2013]
- Kernel-probabilistic model [Dundar and Landgrebe, 2004]

Spatial term:
- Potts model: E(y_i, N_i) = Σ_{j ∈ N_i} [1 − δ(y_i, y_j)]
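The local energy above is easy to evaluate once the spectral term is available. The sketch below assumes the spectral term is supplied as a table of −log p(x_i | y_i) values (e.g. from probabilistic classifier outputs); the function names are illustrative.

```python
import numpy as np

def potts_energy(y, i, neighbors):
    """E(y_i, N_i) = sum_{j in N_i} [1 - delta(y_i, y_j)]:
    the number of neighbors whose label differs from y_i."""
    return sum(1 for j in neighbors if y[j] != y[i])

def local_energy(y, i, neighbors, neg_log_p, rho):
    """U(y_i | x_i, N_i) = Omega(x_i, y_i) + rho * E(y_i, N_i)."""
    return neg_log_p[i, y[i]] + rho * potts_energy(y, i, neighbors)

# Toy usage: 4 pixels, 2 classes; pixel 1 has label 0 and two
# disagreeing neighbors, so U = 0.2 + 1.0 * 2.
neg_log_p = np.array([[0.1, 2.0], [0.2, 1.5], [1.8, 0.3], [1.0, 1.0]])
y = [0, 0, 1, 1]
print(local_energy(y, 1, [0, 2, 3], neg_log_p, rho=1.0))  # 2.2
```

An optimizer (ICM, Metropolis-Hastings, graph cuts, ...) would then sweep the pixels minimizing this local energy.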
Gaussian process in the feature space
Kernel induced feature space

[Figure: the feature map φ sends data from ℝ² to the feature space F.]
From Mercer's theorem: k(x_i, x_j) = ⟨φ(x_i), φ(x_j)⟩_F.
Gaussian kernel: k(x_i, x_j) = exp(−γ ‖x_i − x_j‖²_{ℝ^d}), with d_F = +∞.
Building a probabilistic model in F is not directly possible: work with subspace models.
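As a small sketch of the kernel trick at work, the Gaussian Gram matrix of a sample can be computed entirely from pairwise distances in ℝ^d, without ever forming the (infinite-dimensional) map φ. Names are illustrative.

```python
import numpy as np

def gaussian_kernel_matrix(X, gamma):
    """K[i, j] = exp(-gamma * ||x_i - x_j||^2): all inner products in F
    are obtained from squared distances in R^d, never from phi itself."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)

# Toy usage: the Gram matrix is symmetric with a unit diagonal.
X = np.random.RandomState(1).rand(6, 4)
K = gaussian_kernel_matrix(X, gamma=0.5)
print(K.shape, np.allclose(K, K.T), np.allclose(np.diag(K), 1.0))  # (6, 6) True True
```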
Gaussian process

Assume that φ(x), conditionally on y = c, is a Gaussian process with mean μ_c and covariance function Σ_c. The random vector [φ(x)_1, ..., φ(x)_r] ∈ ℝ^r is then, conditionally on y = c, a multivariate normal vector.

Gaussian mixture model (quadratic discriminant) decision rule, with r_c = min(n_c, r):

D_c(φ(x_i)) = Σ_{j=1}^{r_c} [ ⟨φ(x_i) − μ_c, q_cj⟩² / λ_cj + ln(λ_cj) ] + Σ_{j=r_c+1}^{r} [ ⟨φ(x_i) − μ_c, q_cj⟩² / λ_cj + ln(λ_cj) ] − 2 ln(π_c)

The second sum involves eigenvalues and eigenfunctions that cannot be estimated from the n_c training samples. We propose to enforce parsimony in the model to make it computable.
Parsimonious Gaussian process
Definitions

Definition (parsimonious Gaussian process with common noise). A pGP is a Gaussian process φ(x) for which, conditionally on y = c, the eigendecomposition of its covariance operator Σ_c is such that:
A1. There exists a dimension r < +∞ such that λ_cj = 0 for j ≥ r, for all c = 1, ..., C.
A2. There exists a dimension p_c < min(r, n_c) such that λ_cj = λ for p_c < j < r, for all c = 1, ..., C.

A1 is motivated by the quick decay of the eigenvalues of Gaussian kernels [Braun et al., 2008]. A2 expresses that the data of each class live in a specific subspace of size p_c. This model is referred to in the following as pGP_0.
[Figure: two class subspaces F_1 and F_2 with class-specific variances λ_11, λ_12 and λ_21, λ_22, and common noise level λ outside.]
Figure caption: visual illustration of the parsimonious model. The dimension of F_c is common to both classes; each class has specific variances inside F_c, and the noise level outside is common.
pGP models: list of sub-models

Variance outside F_c: common
  Model    Variance inside F_c                 q_cj    p_c
  pGP_0    Free                                Free    Free
  pGP_1    Free                                Free    Common
  pGP_2    Common within groups                Free    Free
  pGP_3    Common within groups                Free    Common
  pGP_4    Common between groups               Free    Common
  pGP_5    Common within and between groups    Free    Free
  pGP_6    Common within and between groups    Free    Common

Variance outside F_c: free
  npGP_0   Free                                Free    Free
  npGP_1   Free                                Free    Common
  npGP_2   Common within groups                Free    Free
  npGP_3   Common within groups                Free    Common
  npGP_4   Common between groups               Free    Common
Decision rules for pGP_0

Proposition. For pGP_0, the decision rule can be written:

D_c(φ(x_i)) = Σ_{j=1}^{p_c} [(λ − λ_cj)/(λ_cj λ)] ⟨φ(x_i) − μ_c, q_cj⟩² + ‖φ(x_i) − μ_c‖²/λ + Σ_{j=1}^{p_c} ln(λ_cj) + (p_M − p_c) ln(λ) − 2 ln(π_c) + γ

where γ is a constant term that does not depend on the class index c, μ_c is the mean function of the GP conditionally on y = c, (λ_cj, q_cj) are the first eigenvalues/eigenfunctions of its covariance operator, p_c is the dimension of F_c, and p_M denotes the largest of the p_c. Proofs are given in [Bouveyron et al., 2014].
Model inference
Estimation of the parameters

Centered Gaussian kernel function according to class c:

k̄_c(x_i, x_j) = k(x_i, x_j) + (1/n_c²) Σ_{l,l'=1; y_l,y_l'=c}^{n_c} k(x_l, x_l') − (1/n_c) Σ_{l=1; y_l=c}^{n_c} [k(x_i, x_l) + k(x_j, x_l)]

and K_c of size n_c × n_c: (K_c)_{l,l'} = k̄_c(x_l, x_l')/n_c.

λ̂_cj is the j-th largest eigenvalue of K_c and β_cj its associated normalized eigenvector.

λ̂ = [ Σ_{c=1}^{C} π̂_c (trace(K_c) − Σ_{j=1}^{p̂_c} λ̂_cj) ] / [ Σ_{c=1}^{C} π̂_c (r_c − p̂_c) ]

π̂_c = n_c/n; p̂_c is selected from the percentage of cumulative variance.
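The class-wise centering and eigendecomposition above can be sketched as follows. This is a simplified illustration under our own naming conventions (not the PGPDA implementation): `K` is the full precomputed kernel matrix and `y` the label vector.

```python
import numpy as np

def class_centered_gram(K, y, c):
    """K_c of the slide: the doubly centered class-c Gram matrix,
    divided by n_c."""
    idx = np.flatnonzero(np.asarray(y) == c)
    nc = idx.size
    Kcc = K[np.ix_(idx, idx)]
    J = np.full((nc, nc), 1.0 / nc)
    Kbar = Kcc - J @ Kcc - Kcc @ J + J @ Kcc @ J  # double centering
    return Kbar / nc

def class_eigen(K, y, c):
    """Eigenvalues (descending) and normalized eigenvectors of K_c:
    the estimates lambda_cj and beta_cj of the slide."""
    Kc = class_centered_gram(K, y, c)
    lam, beta = np.linalg.eigh(Kc)
    order = np.argsort(lam)[::-1]
    return lam[order], beta[:, order]
```

On the class submatrix, the centered kernel k̄_c reduces to the standard double-centering formula used above.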
Computable decision rule

Proposition. The decision rule can be computed as:

D_c(φ(x_i)) = (1/n_c) Σ_{j=1}^{p̂_c} [(λ̂ − λ̂_cj)/(λ̂_cj² λ̂)] ( Σ_{l=1; y_l=c}^{n_c} β_cjl k̄_c(x_i, x_l) )² + k̄_c(x_i, x_i)/λ̂ + Σ_{j=1}^{p̂_c} ln(λ̂_cj) + (p̂_M − p̂_c) ln(λ̂) − 2 ln(π̂_c)

It uses the property that each eigenfunction of the covariance operator is a linear combination of the φ(x_l) − μ_c, together with ⟨φ(x_i) − μ_c, φ(x_j) − μ_c⟩ = k̄_c(x_i, x_j).
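Putting the pieces together, a pGP_0-style decision score could be sketched as below. This is a rough illustration under stated assumptions, not the PGPDA implementation: the subspace size p_c, common noise λ̂ and class proportion π̂_c are taken as given, and the input conventions are ours.

```python
import numpy as np

def pgp0_score(k_test, k_self, Kcc, lam_c, beta_c, p_c, lam, pi_c, p_max):
    """Sketch of D_c for pGP_0 (lower score = more likely class).
    k_test[l] = k(x, x_l) for the class-c training samples,
    k_self = k(x, x), Kcc = uncentered class-c Gram matrix,
    (lam_c, beta_c) = eigenvalues/eigenvectors of K_c (descending)."""
    nc = Kcc.shape[0]
    # Centered kernel evaluations bar-k_c(x, x_l) and bar-k_c(x, x).
    kbar = k_test - k_test.mean() - Kcc.mean(axis=0) + Kcc.mean()
    kbar_self = k_self - 2.0 * k_test.mean() + Kcc.mean()
    proj = beta_c[:, :p_c].T @ kbar  # sum_l beta_cjl * bar-k_c(x, x_l)
    w = (lam - lam_c[:p_c]) / (lam_c[:p_c] ** 2 * lam)
    D = (w @ proj ** 2) / nc
    D += kbar_self / lam
    D += np.sum(np.log(lam_c[:p_c])) + (p_max - p_c) * np.log(lam)
    D -= 2.0 * np.log(pi_c)
    return float(D)
```

Classification then assigns a test sample to argmin_c D_c; only the largest p_c eigenpairs are ever touched.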
DYNAFOR - INRA 26 of 46
Introduction
Parsimonious Gaussian process models
Experimental results
Conclusions and perspectives
Model inference
Estimation of the hyperparameters

The parsimonious Gaussian process model and two hyperparameters have to be selected:
- the model (n)pGP_{0,...,6}
- the kernel parameter γ
- the size of the signal subspace p_c

This is done by k-fold cross-validation. For a given kernel hyperparameter value, all the candidate p_c values can be tested at no extra cost, since the model parameters are derived from the eigendecomposition of K_c.
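The "no extra cost" remark is easy to see: one eigendecomposition of K_c yields the cumulative-variance criterion for every candidate subspace size at once. A toy sketch (illustrative names):

```python
import numpy as np

def cumulative_variance(lam_c, pc_grid):
    """Percentage of cumulative variance reached by each candidate
    subspace size p_c, from a single set of eigenvalues of K_c
    (assumed sorted in descending order)."""
    lam = np.clip(np.asarray(lam_c, dtype=float), 0.0, None)
    cum = np.cumsum(lam) / lam.sum()
    return {p: float(cum[p - 1]) for p in pc_grid}

# Toy usage: eigenvalues summing to 10.
lam_c = [5.0, 3.0, 1.0, 1.0]
print(cumulative_variance(lam_c, [1, 2, 3]))  # {1: 0.5, 2: 0.8, 3: 0.9}
```

Only the kernel parameter γ forces a new eigendecomposition inside the cross-validation loop.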
Link with existing models
Existing models
- [Dundar and Landgrebe, 2004]: equal covariance matrix assumption and ridge regularization. Complexity O(n³). Similar to pGP_4 with equal eigenvectors.
- [Pekalska and Haasdonk, 2009]: ridge regularization, per class. Complexity O(n_c³).
- [Xu et al., 2009]: the last n_c − p − 1 eigenvalues are set equal to λ_cp. Complexity O(n_c³). Similar to npGP_1.
[Figure: eigenvalue decay λ_ci (log scale, 10¹ down to 10⁻⁷) for ridge regularization, pGP and Z. Xu et al.]
✗ Ridge and [Xu et al., 2009] use eigenvectors associated with (very) small eigenvalues.
✓ pGP models use only the eigenvectors associated with the largest eigenvalues.
Protocol [Fauvel et al., 2015]
- 50 training pixels per class were randomly selected; the remaining pixels were used for validation to compute the correct classification rate. Repeated 20 times.
- Variables were scaled between 0 and 1.
- Competing methods: SVM, RF, Kernel-DA [Dundar and Landgrebe, 2004].
- Hyperparameters learned by 5-fold cross-validation.
- MRF: Metropolis-Hastings.
Hyperspectral data
Data sets
- University of Pavia: uint16, 103 spectral bands, 9 classes, 42,776 referenced pixels.
- Kennedy Space Center: uint16, 224 spectral bands, 13 classes, 4,561 referenced pixels.
- Heves: uint16, 252 spectral bands, 16 classes, 360,953 pixels.
Samples
[Figure: four panels of class sample spectra, digital numbers (0–12000) versus spectral band index (50–250).]
Classification accuracy and processing time

             Kappa coefficient             Processing time (s)
  Model    University   KSC    Heves     University   KSC   Heves
  pGP_0      0.768     0.920   0.664         18         31    148
  pGP_1      0.793     0.922   0.671         18         33    151
  pGP_2      0.617     0.844   0.588         18         31    148
  pGP_3      0.603     0.842   0.594         19         33    152
  pGP_4      0.661     0.870   0.595         19         34    152
  pGP_5      0.567     0.820   0.582         18         32    148
  pGP_6      0.610     0.845   0.583         19         34    152
  npGP_0     0.730     0.911   0.640         17         31    148
  npGP_1     0.792     0.921   0.677         18         33    151
  npGP_2     0.599     0.838   0.573         18         31    148
  npGP_3     0.578     0.817   0.585         19         33    152
  npGP_4     0.578     0.817   0.585         19         33    152
  KDC        0.786     0.924   0.666         98        253    695
  RF         0.646     0.853   0.585          3          3     18
  SVM        0.799     0.928   0.658         10         28    171
DYNAFOR - INRA 35 of 46
Introduction
Parsimonious Gaussian process models
Experimental results
Conclusions and perspectives
Hyperspectral data
pGP-MRF
[Figure: classification maps obtained with the pGP model combined with the MRF spatial prior.]
Hypertemporal data
Data set
Formosat-2 SITS: uint8, 4 spectral bands, 43 dates in 2006, 13 woody classes.
[Figure: six images of the time series, t1 to t6.]
Samples
[Figure: four panels of class temporal profiles, numerical counts (0–500) versus date index (20–160).]
Classification accuracy and processing time

  Model    Kappa coefficient   Processing time (s)
  pGP_0         0.950                 13
  pGP_1         0.955                  9
  pGP_2         0.887                 13
  pGP_3         0.887                  9
  pGP_4         0.932                  9
  pGP_5         0.846                 13
  pGP_6         0.891                  8
  npGP_0        0.941                 12
  npGP_1        0.943                  8
  npGP_2        0.883                 13
  npGP_3        0.871                  9
  npGP_4        0.871                  9
  KDA           0.942                 69
  RF            0.896                  1
  SVM           0.944                 10
pGP-MRF
[Figure: classification maps obtained with the pGP-MRF model on the hypertemporal data.]
Conclusions and perspectives
- A family of parsimonious Gaussian process models has been presented.
- Good performance w.r.t. SVM and KDA.
- Faster computation than previous kernel discriminant analysis methods.
- (n)pGP_1 performs the best.
- MRF extension.
- Code: https://github.com/mfauvel/PGPDA
- Extensions: non-numerical data [Bouveyron et al., 2014], binary data [Sylla et al., 2015], unsupervised learning [Bouveyron et al., 2014].
References

[Bouveyron et al., 2014] Bouveyron, C., Fauvel, M., and Girard, S. (2014). Kernel discriminant analysis and clustering with parsimonious Gaussian process models. Statistics and Computing, pages 1–20.
[Braun et al., 2008] Braun, M. L., Buhmann, J. M., and Müller, K.-R. (2008). On relevant dimensions in kernel feature spaces. Journal of Machine Learning Research, 9:1875–1908.
[Camps-Valls et al., 2006] Camps-Valls, G., Gomez-Chova, L., Muñoz-Marí, J., Vila-Francés, J., and Calpe-Maravilla, J. (2006). Composite kernels for hyperspectral image classification. IEEE Geoscience and Remote Sensing Letters, 3(1):93–97.
[Dundar and Landgrebe, 2004] Dundar, M. and Landgrebe, D. A. (2004). Toward an optimal supervised classifier for the analysis of hyperspectral data. IEEE Transactions on Geoscience and Remote Sensing, 42(1):271–277.
[Farag et al., 2005] Farag, A., Mohamed, R., and El-Baz, A. (2005). A unified framework for MAP estimation in remote sensing image segmentation. IEEE Transactions on Geoscience and Remote Sensing, 43(7):1617–1634.
[Fauvel et al., 2015] Fauvel, M., Bouveyron, C., and Girard, S. (2015). Parsimonious Gaussian process models for the classification of hyperspectral remote sensing images. IEEE Geoscience and Remote Sensing Letters, 12(12):2423–2427.
[Fauvel et al., 2013] Fauvel, M., Tarabalka, Y., Benediktsson, J. A., Chanussot, J., and Tilton, J. (2013). Advances in spectral-spatial classification of hyperspectral images. Proceedings of the IEEE, 101(3):652–675.
[Gurram and Kwon, 2013] Gurram, P. and Kwon, H. (2013). Contextual SVM using Hilbert space embedding for hyperspectral classification. IEEE Geoscience and Remote Sensing Letters, 10(5):1031–1035.
[Jimenez and Landgrebe, 1998] Jimenez, L. and Landgrebe, D. (1998). Supervised classification in high-dimensional space: geometrical, statistical, and asymptotical properties of multivariate data. IEEE Transactions on Systems, Man, and Cybernetics, Part C, 28(1):39–54.
[Mercier and Girard-Ardhuin, 2006] Mercier, G. and Girard-Ardhuin, F. (2006). Partially supervised oil-slick detection by SAR imagery using kernel expansion. IEEE Transactions on Geoscience and Remote Sensing, 44(10):2839–2846.
[Moser and Serpico, 2013] Moser, G. and Serpico, S. (2013). Combining support vector machines and Markov random fields in an integrated framework for contextual image classification. IEEE Transactions on Geoscience and Remote Sensing, 51(5):2734–2752.
[Pekalska and Haasdonk, 2009] Pekalska, E. and Haasdonk, B. (2009). Kernel discriminant analysis for positive definite and indefinite kernels. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(6):1017–1032.
[Sylla et al., 2015] Sylla, S. N., Girard, S., Diongue, A. K., Diallo, A., and Sokhna, C. (2015). A classification method for binary predictors combining similarity measures and mixture models. Dependence Modeling, 3:240–255.
[Tarabalka et al., 2010] Tarabalka, Y., Fauvel, M., Chanussot, J., and Benediktsson, J. (2010). SVM- and MRF-based method for accurate classification of hyperspectral images. IEEE Geoscience and Remote Sensing Letters, 7(4):736–740.
[Xu et al., 2009] Xu, Z., Huang, K., Zhu, J., King, I., and Lyu, M. R. (2009). A novel kernel-based maximum a posteriori classification method. Neural Networks, 22(7):977–987.