Parsimonious Gaussian process models for the spectral-spatial classification of hyperspectral remote sensing images Seminar MIAT

M. Fauvel (1), C. Bouveyron (2) and S. Girard (3)

(1) UMR 1201 DYNAFOR, INRA & Institut National Polytechnique de Toulouse
(2) Laboratoire MAP5, UMR CNRS 8145, Université Paris Descartes & Sorbonne Paris Cité
(3) Equipe MISTIS, INRIA Grenoble Rhône-Alpes & LJK

Outline

Introduction
  Remote Sensing
  Classification of hyperspectral imagery
  Spatial-spectral classification
Parsimonious Gaussian process models
  Gaussian process in the feature space
  Parsimonious Gaussian process
  Model inference
  Link with existing models
Experimental results
  Data sets and protocol
  Results
Conclusions and perspectives

M. Fauvel, DYNAFOR - INRA Parsimonious Gaussian process models

DYNAFOR - INRA 2 of 41


Remote Sensing

Nature of remote sensing images

A remote sensing image is a sampling of a spatial, spectral and temporal process.

(Figure: reflectance spectrum versus wavelength, 500-900 nm.)


(Figure: reflectance spectra sampled over the year, January to December, illustrating the temporal dimension.)


Classification of hyperspectral imagery

Hyperspectral Imagery 1/3

(Figure: pixel spectra, reflectance versus wavelength from 450 to 950 nm.)

Pixels are represented by a random vector x ∈ R^d with d large, associated to a random variable y that represents the class/label. Classification: predict the membership y of x, y = f(x).


Hyperspectral Imagery 2/3

Instrument   Range (nm)   # Bands   Bandwidth (nm)   Spatial resolution (m)
AVIRIS       400-2500     224       10               20/1-4
HYDICE       400-2500     210       10               1-4
ROSIS-03     400-900      115       4                1
Hyspec       400-2500     427       3                1
HyMAP        400-2500     126       10-20            5
CASI         380-1050     288       2.4              1-2
HYPERION     400-2500     200       10               30


Hyperspectral Imagery 3/3

Finer spectral resolution allows the definition of more classes: Ash, Maple, Birch, Walnut, Oak, Lime, Hazel, Black Locust.

(Figure: radiance spectra of the tree species, 400-900 nm.)


Image classification in high dimensional space

- High number of measurements but limited number of training samples.
- Curse of dimensionality: statistical, geometrical and computational issues. Conventional methods fail [Jimenez and Landgrebe, 1998].
- Kernel methods have shown great potential in many situations.
- Pixelwise classification is not adapted [Fauvel et al., 2013].

Need to incorporate spatial information in the classification process: additional complexity.


Spatial-spectral classification

Kernel methods vs parametric methods

1. Kernel methods [Camps-Valls and Bruzzone, 2009]:
   - Good abilities for classification,
   - Spatial information included through the kernel function or additional features:
     ks(xi, xj) = Σ_{m∼i, n∼j} k(xm, xn)
2. Parametric methods [Solberg et al., 1996]:
   - Markov Random Field: able to model spatial relationships between pixels,
   - Problem of the estimation of the spectral energy term.
3. Parametric kernel methods: probabilistic models in the kernel feature space.
   - Provide probability memberships, with a robust classifier,
   - Allow the use of MRF modeling.


Kernel methods and MRF

Maximum a posteriori: max_Y P(Y|X).
When Y is a MRF: P(Y|X) ∝ exp(−U(Y|X)), where U(Y|X) = Σ_{i=1}^{n} U(yi|xi, Ni) with U(yi|xi, Ni) = Ω(xi, yi) + ρ E(yi, Ni).

Spectral term: Ω(xi, yi) = −log[p(xi|yi)]
- SVM outputs [Farag et al., 2005, Tarabalka et al., 2010, Moser and Serpico, 2013]
- Kernel-probabilistic model [Dundar and Landgrebe, 2004]

Spatial term
- Potts model: E(yi, Ni) = Σ_{j∈Ni} [1 − δ(yi, yj)]

(Figure: 8-neighborhood y1, ..., y8 around the pixel yi.)
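As an illustration (our own sketch, not code from the talk), the Potts spatial energy of one pixel on the 8-neighborhood shown above can be computed directly:

```python
import numpy as np

def potts_energy(labels, i, j):
    """E(y_i, N_i) = sum_{j in N_i} [1 - delta(y_i, y_j)],
    counted on the 8-neighborhood of pixel (i, j)."""
    h, w = labels.shape
    e = 0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue  # skip the pixel itself
            ni, nj = i + di, j + dj
            if 0 <= ni < h and 0 <= nj < w:
                e += int(labels[ni, nj] != labels[i, j])
    return e

labels = np.array([[0, 0, 1],
                   [0, 1, 1],
                   [0, 1, 1]])
print(potts_energy(labels, 1, 1))  # 4 of the 8 neighbours disagree -> 4
```

Multiplied by ρ, this energy penalizes labelings that disagree with their neighborhood, which is what drives the spatial regularization.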


(Kernel) Gaussian mixture models

Quadratic decision rule in the input space:

Dc(xi) = (xi − µc)ᵀ Σc⁻¹ (xi − µc) + log(det(Σc)) − 2 ln(πc)

Quadratic decision rule in the feature space [Dundar and Landgrebe, 2004]:

Dc(φ(xi)) = φ̄c(xi)ᵀ Kc⁻¹ φ̄c(xi) + log(det(Kc)) − 2 ln(πc)

Problem: Kc is badly conditioned (and non-invertible). Unlike SVM, there is no regularization for Kc⁻¹ and log(det(Kc)) in the estimation process, so it needs to be included in the model.

Enforce parsimony in the model.


Gaussian process in the feature space

Kernel induced feature space

Gaussian kernel: k(xi, xj) = exp(−γ ‖xi − xj‖²_{Rd}).

From Mercer's theorem, k(xi, xj) = ⟨φ(xi), φ(xj)⟩_F, which can be written

k(xi, xj) = Σ_{m=1}^{dF} λm qm(xi) qm(xj),

where dF = dim(F) and φ : x ↦ [..., √λm qm(x), ...], m = 1, 2, ..., dF. For the Gaussian kernel, dF = +∞.
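A small numerical sketch (ours, with arbitrary data and γ) of the finite-sample analogue of this expansion: the Gram matrix of the Gaussian kernel is exactly reconstructed from its eigenvalues and eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))      # 20 points in R^5
gamma = 0.5

# Gaussian kernel Gram matrix: K_ij = exp(-gamma * ||x_i - x_j||^2)
sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K = np.exp(-gamma * sq)

# Finite-sample analogue of Mercer's expansion:
# K = sum_m lam_m q_m q_m^T, with (lam_m, q_m) the eigenpairs of K.
lam, Q = np.linalg.eigh(K)
K_rec = (Q * lam) @ Q.T
print(np.allclose(K, K_rec))      # True: the expansion reconstructs the kernel
```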


Gaussian process

Let us assume that φ(x), conditionally on y = c, is a Gaussian process with mean µc and covariance function Σc. The projection of φ(x) on the eigenfunction qcj is noted φ(x)j:

⟨φ(x), qcj⟩ = ∫_J φ(x)(t) qcj(t) dt.

The random vector [φ(x)1, ..., φ(x)r] ∈ R^r is, conditionally on y = c, a multivariate normal vector. With rc = min(nc, r), the Gaussian mixture model (quadratic discriminant) decision rule is:

Dc(φ(xi)) = Σ_{j=1}^{rc} [ ⟨φ(xi) − µc, qcj⟩² / λcj + ln(λcj) ] + Σ_{j=rc+1}^{r} [ ⟨φ(xi) − µc, qcj⟩² / λcj + ln(λcj) ] − 2 ln(πc)


Parsimonious Gaussian process

Definitions

Definition (Parsimonious Gaussian process with common noise). A pGP is a Gaussian process φ(x) for which, conditionally on y = c, the eigendecomposition of its covariance operator Σc is such that:
A1. There exists a dimension r < +∞ such that λcj = 0 for j ≥ r and for all c = 1, ..., C.
A2. There exists a dimension pc < min(r, nc) such that λcj = λ for pc < j < r and for all c = 1, ..., C.

Definition (Parsimonious Gaussian process with class-specific noise).
A3. There exists a dimension rc < r such that λcj = 0 for all j > rc and for all c = 1, ..., C. When r = +∞, it is assumed that rc = nc − 1.
A4. There exists a dimension pc < rc such that λcj = λc for pc < j ≤ rc, and for all c = 1, ..., C.

A1 and A3 are motivated by the quick decay of the eigenvalues of Gaussian kernels. A2 and A4 express that the data of each class live in a specific subspace of size pc.


pGP models: list of sub-models

Model    Variance inside Fc                   qcj    pc

Variance outside Fc: common
pGP0     Free                                 Free   Free
pGP1     Free                                 Free   Common
pGP2     Common within groups                 Free   Free
pGP3     Common within groups                 Free   Common
pGP4     Common between groups                Free   Common
pGP5     Common within and between groups     Free   Free
pGP6     Common within and between groups     Free   Common

Variance outside Fc: free
npGP0    Free                                 Free   Free
npGP1    Free                                 Free   Common
npGP2    Common within groups                 Free   Free
npGP3    Common within groups                 Free   Common
npGP4    Common between groups                Free   Common


(Figure: visual illustration of model npGP1, with subspaces F1 and F2. The dimension of Fc is common to both classes; each class has a specific variance inside Fc (λ11, λ12 and λ21, λ22) and a specific noise level (λ1, λ2).)


Decision rules for pGP0

Proposition. For pGP0, the decision rule can be written:

Dc(φ(xi)) = Σ_{j=1}^{pc} [ (λ − λcj) / (λcj λ) ] ⟨φ(xi) − µc, qcj⟩² + ‖φ(xi) − µc‖² / λ + Σ_{j=1}^{pc} ln(λcj) + (pM − pc) ln(λ) − 2 ln(πc) + γ

where γ is a constant term that does not depend on the class index c. Proofs are given in [Bouveyron et al., 2014]; the two key steps are:
- Decompose the sum over j into Σ_{j=1}^{pc} and Σ_{j=pc+1}^{r}, where λcj = λ.
- Use the property Σ_{j=1}^{r} ⟨φ(x) − µc, qcj⟩² = ‖φ(x) − µc‖².


Model inference

Estimation of the parameters

Centered Gaussian kernel function according to class c:

k̄c(xi, xj) = k(xi, xj) + (1/nc²) Σ_{l,l'=1, yl=yl'=c}^{nc} k(xl, xl') − (1/nc) Σ_{l=1, yl=c}^{nc} [ k(xi, xl) + k(xj, xl) ]

and Kc of size nc × nc:

(Kc)_{l,l'} = k̄c(xl, xl') / nc.

λ̂cj is the j-th largest eigenvalue of Kc and βcj its associated normalized eigenvector.

λ̂ = [ Σ_{c=1}^{C} π̂c ( trace(Kc) − Σ_{j=1}^{p̂c} λ̂cj ) ] / [ Σ_{c=1}^{C} π̂c (rc − p̂c) ]

π̂c = nc/n; p̂c is selected from the percentage of cumulative variance.
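In matrix form, the centering above is k̄c = H G H with H = I − (1/nc)11ᵀ; a sketch (ours, with toy data and an arbitrary γ) checking one entry against the double-sum formula:

```python
import numpy as np

rng = np.random.default_rng(1)
Xc = rng.normal(size=(8, 4))          # n_c = 8 samples of class c
gamma = 0.3
nc = Xc.shape[0]

sq = np.sum((Xc[:, None, :] - Xc[None, :, :]) ** 2, axis=-1)
G = np.exp(-gamma * sq)               # raw Gram matrix of class c

# Centering in feature space: kbar_c = H G H with H = I - (1/nc) 11^T,
# equivalent to the double-sum formula on the slide.
H = np.eye(nc) - np.ones((nc, nc)) / nc
Kbar = H @ G @ H
Kc = Kbar / nc                        # (K_c)_{l,l'} = kbar_c(x_l, x_l') / nc

# Check one entry against the explicit formula
i, j = 2, 5
val = G[i, j] + G.mean() - G[i, :].mean() - G[j, :].mean()
print(np.isclose(Kbar[i, j], val))    # True
```

The rows and columns of the centered matrix sum to zero, which reflects that the feature-space samples have been shifted to their empirical mean.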


Computable decision rule

Proposition. The decision rule can be computed as:

Dc(φ(xi)) = (1/nc) Σ_{j=1}^{p̂c} [ (λ̂ − λ̂cj) / (λ̂ λ̂cj²) ] ( Σ_{l, yl=c} βcjl k̄c(xi, xl) )² + k̄c(xi, xi)/λ̂ + Σ_{j=1}^{p̂c} ln(λ̂cj) + (p̂M − p̂c) ln(λ̂) − 2 ln(π̂c)

Proofs are given in [Bouveyron et al., 2014]. They use the property that the eigenfunctions of the covariance function are linear combinations of the φ(xi) − µc, together with ⟨φ(xi) − µc, φ(xj) − µc⟩ = k̄c(xi, xj).
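A compact sketch of this computable rule (our own illustration: the variable names, the fixed toy value of λ̂, the priors and the kernel parameter are assumptions, not the slides' estimates):

```python
import numpy as np

def gauss_kernel(A, B, gamma):
    sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma * sq)

def pgp0_score(x, Xc, gamma=0.3, pc=3, lam_noise=1e-2, prior=0.5, p_max=3):
    """Sketch of the pGP0 computable decision rule for one class:
    a smaller score means the point is more likely to belong to class c.
    lam_noise plays the role of the estimated common noise level."""
    nc = Xc.shape[0]
    G = gauss_kernel(Xc, Xc, gamma)
    H = np.eye(nc) - np.ones((nc, nc)) / nc
    Kc = (H @ G @ H) / nc                      # centered, scaled Gram matrix
    lam, B = np.linalg.eigh(Kc)
    lam, B = lam[::-1], B[:, ::-1]             # eigenvalues in decreasing order

    # centered kernel evaluations kbar_c(x, x_l) and kbar_c(x, x)
    kx = gauss_kernel(x[None, :], Xc, gamma).ravel()
    kbar_x = kx - kx.mean() - G.mean(axis=1) + G.mean()
    kbar_xx = 1.0 - 2 * kx.mean() + G.mean()   # k(x, x) = 1 for a Gaussian kernel

    proj = B[:, :pc].T @ kbar_x                # sum_l beta_cjl * kbar_c(x, x_l)
    score = np.sum((lam_noise - lam[:pc]) / (lam_noise * lam[:pc] ** 2)
                   * proj ** 2) / nc
    score += kbar_xx / lam_noise
    score += np.sum(np.log(lam[:pc])) + (p_max - pc) * np.log(lam_noise)
    score -= 2 * np.log(prior)
    return score

# toy usage: the score is lower for the class the point belongs to
rng = np.random.default_rng(2)
Xa = rng.normal(0.0, 0.3, size=(15, 4))
Xb = rng.normal(5.0, 0.3, size=(15, 4))
x = Xa[0]
print(pgp0_score(x, Xa) < pgp0_score(x, Xb))   # True
```

Only the leading p̂c eigenpairs of the small nc × nc matrix are needed, which is what makes the rule cheap and numerically safe.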


Numerical considerations

The proposed models allow a safe computation of Kc⁻¹ and log det(Kc), which appear in the kernel quadratic decision rule:
- Only the first pc eigenvectors/eigenvalues are used,
- Eigenvectors corresponding to small eigenvalues are not used,
- If the pc are not too large, log(λ̂) is stable.

Proof: Kc is positive definite, so it can be decomposed into Qc Λc Qcᵀ:

Kc⁻¹ = Qc Λc⁻¹ Qcᵀ
     = Σ_{j=1}^{r} λcj⁻¹ qcj qcjᵀ
     = Σ_{j=1}^{pc} λcj⁻¹ qcj qcjᵀ + λ⁻¹ Σ_{j=pc+1}^{r} qcj qcjᵀ
     = Σ_{j=1}^{pc} λcj⁻¹ qcj qcjᵀ + λ⁻¹ ( I_nc − Σ_{j=1}^{pc} qcj qcjᵀ )
     = Σ_{j=1}^{pc} [ (λ − λcj) / (λ λcj) ] qcj qcjᵀ + λ⁻¹ I_nc

log det(Kc) = Σ_{j=1}^{pc} log(λcj) + (r − pc) log(λ)
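This truncated inverse can be checked numerically; the sketch below (ours) builds a matrix whose trailing eigenvalues all equal λ, as the model assumes, and compares against the exact inverse and log-determinant:

```python
import numpy as np

rng = np.random.default_rng(3)
n, pc, lam_noise = 10, 3, 0.05

# Build a covariance-like matrix that satisfies the pGP assumption exactly:
# pc leading eigenvalues, all remaining ones equal to lam_noise.
A = rng.normal(size=(n, n))
Q, _ = np.linalg.qr(A)                        # random orthonormal basis
eigvals = np.concatenate([np.array([3.0, 1.5, 0.8]),
                          np.full(n - pc, lam_noise)])
Kc = (Q * eigvals) @ Q.T

# Parsimonious inverse:
# sum_{j<=pc} (lam - lam_cj)/(lam * lam_cj) q_j q_j^T + I/lam
Kinv = np.eye(n) / lam_noise
for j in range(pc):
    q = Q[:, j]
    Kinv += (lam_noise - eigvals[j]) / (lam_noise * eigvals[j]) * np.outer(q, q)

print(np.allclose(Kinv, np.linalg.inv(Kc)))   # True

# log det via the same truncation
logdet = np.sum(np.log(eigvals[:pc])) + (n - pc) * np.log(lam_noise)
print(np.isclose(logdet, np.linalg.slogdet(Kc)[1]))  # True
```

Only pc rank-one updates of a scaled identity are needed, so no full matrix inversion is performed and small eigenvalues never appear in a denominator.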


Link with existing models

Existing models

- [Dundar and Landgrebe, 2004]: equal covariance matrix assumption and ridge regularization. Complexity: O(n³). Similar to pGP4 with equal eigenvectors.
- [Pekalska and Haasdonk, 2009]: ridge regularization, per class. Complexity: O(nc³).
- [Xu et al., 2009]: the last nc − p − 1 eigenvalues are equal to λcp. Complexity: O(nc³). Similar to pGP1.

(Figure: regularized eigenvalue profiles λci, log scale, versus eigenvalue index for the ridge, pGP and Z. Xu et al. approaches.)


Data sets and protocol

Data sets

- University of Pavia: 103 spectral bands, 9 classes and 42,776 referenced pixels.
- Kennedy Space Center: 224 spectral bands, 13 classes and 4,561 referenced pixels.
- Heves: 252 spectral bands, 16 classes and 360,953 pixels.


Protocol [Fauvel et al., 2015]

- 50 training pixels per class randomly selected from the samples; the remaining pixels used for validation to compute the correct classification rate. Repeated 20 times.
- Variables scaled between 0 and 1.
- Competing methods: SVM, RF, Kernel-DA [Dundar and Landgrebe, 2004].
- Hyperparameters learned by 5-fold cross-validation.


Results

Classification accuracy

             Kappa coefficient              Processing time (s)
Model      University   KSC     Heves     University   KSC    Heves
pGP0       0.768        0.920   0.664     18           31     148
pGP1       0.793        0.922   0.671     18           33     151
pGP2       0.617        0.844   0.588     18           31     148
pGP3       0.603        0.842   0.594     19           33     152
pGP4       0.661        0.870   0.595     19           34     152
pGP5       0.567        0.820   0.582     18           32     148
pGP6       0.610        0.845   0.583     19           34     152
npGP0      0.730        0.911   0.640     17           31     148
npGP1      0.792        0.921   0.677     18           33     151
npGP2      0.599        0.838   0.573     18           31     148
npGP3      0.578        0.817   0.585     19           33     152
npGP4      0.578        0.817   0.585     19           33     152
KDC        0.786        0.924   0.666     98           253    695
RF         0.646        0.853   0.585     3            3      18
SVM        0.799        0.928   0.658     10           28     171

pGP + MRF

(Figure: classification maps obtained when combining pGP with the MRF spatial term.)


Conclusions and perspectives

- A family of parsimonious Gaussian process models.
- Good performances with respect to SVM and KDA.
- Faster computation than previous KDA.
- (n)pGP1 performs the best.
- MRF extension.
- Code: https://github.com/mfauvel/PGPDA
- Extensions: non-numerical data, binary data, unsupervised learning.


References I

[Bouveyron et al., 2014] Bouveyron, C., Fauvel, M., and Girard, S. (2014). Kernel discriminant analysis and clustering with parsimonious Gaussian process models. Statistics and Computing, pages 1-20.
[Camps-Valls and Bruzzone, 2009] Camps-Valls, G. and Bruzzone, L., editors (2009). Kernel Methods for Remote Sensing Data Analysis. Wiley.
[Dundar and Landgrebe, 2004] Dundar, M. and Landgrebe, D. A. (2004). Toward an optimal supervised classifier for the analysis of hyperspectral data. IEEE Trans. on Geoscience and Remote Sensing, 42(1):271-277.
[Farag et al., 2005] Farag, A., Mohamed, R., and El-Baz, A. (2005). A unified framework for MAP estimation in remote sensing image segmentation. IEEE Trans. on Geoscience and Remote Sensing, 43(7):1617-1634.
[Fauvel et al., 2015] Fauvel, M., Bouveyron, C., and Girard, S. (2015). Parsimonious Gaussian process models for the classification of hyperspectral remote sensing images. IEEE Geoscience and Remote Sensing Letters, 12(12):2423-2427.


References II

[Fauvel et al., 2013] Fauvel, M., Tarabalka, Y., Benediktsson, J. A., Chanussot, J., and Tilton, J. (2013). Advances in spectral-spatial classification of hyperspectral images. Proceedings of the IEEE, 101(3):652-675.
[Jimenez and Landgrebe, 1998] Jimenez, L. and Landgrebe, D. (1998). Supervised classification in high-dimensional space: geometrical, statistical, and asymptotical properties of multivariate data. IEEE Transactions on Systems, Man, and Cybernetics, Part C, 28(1):39-54.
[Moser and Serpico, 2013] Moser, G. and Serpico, S. (2013). Combining support vector machines and Markov random fields in an integrated framework for contextual image classification. IEEE Trans. on Geoscience and Remote Sensing, 51(5):2734-2752.
[Pekalska and Haasdonk, 2009] Pekalska, E. and Haasdonk, B. (2009). Kernel discriminant analysis for positive definite and indefinite kernels. IEEE Trans. Pattern Anal. Mach. Intell., 31(6):1017-1032.
[Solberg et al., 1996] Solberg, A., Taxt, T., and Jain, A. (1996). A Markov random field model for classification of multisource satellite imagery. IEEE Transactions on Geoscience and Remote Sensing, 34(1):100-113.


[Tarabalka et al., 2010] Tarabalka, Y., Fauvel, M., Chanussot, J., and Benediktsson, J. (2010). Svm- and mrf-based method for accurate classification of hyperspectral images. IEEE Geoscience and Remote Sensing Letters, 7(4):736–740. [Xu et al., 2009] Xu, Z., Huang, K., Zhu, J., King, I., and Lyu, M. R. (2009). A novel kernel-based maximum a posteriori classification method. Neural Netw., 22(7):977–987.
