>>> Signal and Image processing in Remote Sensing
>>> Classification


Mathieu Fauvel (UMR 1201 DYNAFOR INRA - INPT/ENSAT) 2016


>>> Classification of remote sensing images

[Figure: from the original data and the ground-truth to the resulting thematic map]

>>> Maximum A Posteriori

* MAP classification: assign the pixel to the class with the highest posterior probability:
  $\hat{y} = \arg\max_{y=1,\dots,C} p(y \mid \mathbf{x})$
* Bayes rule:
  $p(y \mid \mathbf{x}) = \dfrac{p(\mathbf{x} \mid y)\, p(y)}{p(\mathbf{x})}$
* The decision rule becomes:
  $\hat{y} = \arg\max_{y=1,\dots,C} p(\mathbf{x} \mid y)\, p(y)$
* A Gaussian model is conventionally used:
  $p(\mathbf{x} \mid y) = (2\pi)^{-d/2}\, |\Sigma_y|^{-1/2} \exp\!\big(-0.5\, (\mathbf{x}-\mu_y)^t \Sigma_y^{-1} (\mathbf{x}-\mu_y)\big)$
* $p(y)$ is usually approximated by the uniform distribution, i.e., $p(y) = 1/C$, or by the proportion of each class in the training set, $p(y) = n_y/n$.
* Taking the logarithm and multiplying by $-2$, the decision rule becomes (the pixel is assigned to the class with the smallest $k_y(\mathbf{x})$):
  $k_y(\mathbf{x}) = (\mathbf{x}-\mu_y)^t \Sigma_y^{-1} (\mathbf{x}-\mu_y) + \ln(|\Sigma_y|) - 2\ln\!\big(p(y)\big)$

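For illustration only (not part of the original slides), a minimal NumPy sketch of the quadratic decision rule $k_y(\mathbf{x})$ above, assuming the class means, covariances, and priors are already estimated; the function and variable names are hypothetical.

```python
import numpy as np

def map_gaussian_classify(X, means, covs, priors):
    """Assign each row of X (n x d) to the class minimizing
    k_y(x) = (x - mu_y)^t Sigma_y^{-1} (x - mu_y) + ln|Sigma_y| - 2 ln p(y)."""
    n = X.shape[0]
    C = len(means)
    scores = np.empty((n, C))
    for c in range(C):
        diff = X - means[c]                                    # (n, d)
        inv_cov = np.linalg.inv(covs[c])                       # Sigma_y^{-1}
        maha = np.einsum('ij,jk,ik->i', diff, inv_cov, diff)   # Mahalanobis terms
        _, logdet = np.linalg.slogdet(covs[c])                 # ln|Sigma_y|
        scores[:, c] = maha + logdet - 2.0 * np.log(priors[c])
    return np.argmin(scores, axis=1)   # smallest k_y(x) = highest posterior
```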

>>> Estimation of the parameters

* Maximization of the log-likelihood:
  $\ell = -2\ln(L) \propto n \ln\det(\Sigma) + \sum_{i=1}^{n} (\mathbf{x}_i - \mu)^t \Sigma^{-1} (\mathbf{x}_i - \mu)$
* Differentiating w.r.t. $\mu$:
  $\dfrac{\partial \ell}{\partial \mu} \propto \sum_{i=1}^{n} \Sigma^{-1} (\mathbf{x}_i - \mu) \;\Rightarrow\; \hat{\mu} = \dfrac{1}{n} \sum_{i=1}^{n} \mathbf{x}_i$
* Differentiating w.r.t. $\Sigma$:
  $\dfrac{\partial \ell}{\partial \Sigma} \propto n\Sigma^{-1} - \sum_{i=1}^{n} \Sigma^{-1} (\mathbf{x}_i - \mu)(\mathbf{x}_i - \mu)^t \Sigma^{-1} \;\Rightarrow\; \hat{\Sigma}_c = \dfrac{1}{n_c} \sum_{i:\, y_i = c} (\mathbf{x}_i - \hat{\mu}_c)(\mathbf{x}_i - \hat{\mu}_c)^t$

Details of the matrix derivatives can be found in the Matrix Cookbook: http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=3274

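A minimal sketch (not from the slides) of these per-class maximum-likelihood estimates in NumPy; the function name and arguments are hypothetical. The resulting means, covariances, and priors plug directly into the MAP rule sketched above.

```python
import numpy as np

def estimate_class_parameters(X, y):
    """Per-class ML estimates: mu_c = sample mean, Sigma_c = (1/n_c) sum of
    outer products of the centered samples, prior p(y=c) = n_c / n."""
    classes = np.unique(y)
    means, covs, priors = [], [], []
    for c in classes:
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        diff = Xc - mu
        means.append(mu)
        covs.append(diff.T @ diff / len(Xc))   # ML estimate (divides by n_c, not n_c - 1)
        priors.append(len(Xc) / len(X))
    return classes, means, covs, priors
```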

>>> Covariance matrix inversion

* Number of parameters to estimate for a $d$-dimensional Gaussian distribution: $d$ for the mean and $d(d+1)/2$ for the covariance matrix, hence $d(d+3)/2$ in total.
  $\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \dots & \sigma_{1d} \\ & \sigma_{22} & \dots & \sigma_{2d} \\ & & \ddots & \vdots \\ & & & \sigma_{dd} \end{pmatrix}$
* Orthogonal matrix $Q$: $\Sigma = Q\Lambda Q^t$ with $QQ^t = I$, hence
  $\Sigma^{-1} = Q\Lambda^{-1} Q^t = \sum_{i=1}^{d} \dfrac{1}{\lambda_i}\, \mathbf{q}_i \mathbf{q}_i^t$

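For illustration only (my addition, not the slides'), a quick NumPy check that the eigendecomposition reproduces the inverse; the data are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
Sigma = A @ A.T + 5 * np.eye(5)          # a well-conditioned covariance-like matrix

lam, Q = np.linalg.eigh(Sigma)           # Sigma = Q diag(lam) Q^t
Sigma_inv = (Q / lam) @ Q.T              # sum_i (1/lambda_i) q_i q_i^t

print(np.allclose(Sigma_inv, np.linalg.inv(Sigma)))  # True
```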

>>> Tikhonov regularization

* Problem: find $A$ such that $A\hat{\Sigma} = I$, with $\hat{\Sigma} = \Sigma + \epsilon \;\Rightarrow\; A\Sigma = I + \epsilon'$ (smoothness condition).
* Minimization problem with penalization of non-smooth solutions:
  $\hat{A} = \arg\min_{A} f(A)$ with $f(A) = \big\|\hat{\Sigma} A - I\big\|^2 + \big\|\Gamma A\big\|^2$
* Computing the derivative of $f$ w.r.t. $A$:
  $\dfrac{\partial f}{\partial A} \propto \hat{\Sigma}^t \hat{\Sigma} A - \hat{\Sigma}^t + \Gamma^t \Gamma A$
* At the optimum, the derivative vanishes:
  $\hat{A} = \big(\hat{\Sigma}^t \hat{\Sigma} + \Gamma^t \Gamma\big)^{-1} \hat{\Sigma}^t$
* Tikhonov: $\Gamma = \alpha I \;\Rightarrow\; \hat{A} = \big(\hat{\Sigma}^2 + \alpha^2 I\big)^{-1} \hat{\Sigma}$
* Ridge: $\Gamma = \alpha \hat{\Sigma}^{1/2} \;\Rightarrow\; \hat{A} = \big(\hat{\Sigma} + \alpha^2 I\big)^{-1}$

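As a sketch (not from the slides), the two closed-form regularized inverses in NumPy; `S_hat` stands for an estimated covariance matrix and `alpha` for a hypothetical regularization parameter.

```python
import numpy as np

def tikhonov_inverse(S_hat, alpha):
    """Tikhonov: A = (S^2 + alpha^2 I)^{-1} S."""
    d = S_hat.shape[0]
    return np.linalg.solve(S_hat @ S_hat + alpha**2 * np.eye(d), S_hat)

def ridge_inverse(S_hat, alpha):
    """Ridge: A = (S + alpha^2 I)^{-1}."""
    d = S_hat.shape[0]
    return np.linalg.inv(S_hat + alpha**2 * np.eye(d))
```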

>>> Tikhonov regularization

* Tikhonov: $\tilde{\lambda}_i^{-1} = \dfrac{\lambda_i}{\lambda_i^2 + \alpha^2}$
* Ridge: $\tilde{\lambda}_i^{-1} = \dfrac{1}{\lambda_i + \alpha^2}$

[Figure: regularized inverse eigenvalues $\tilde{\lambda}_i^{-1}$ as a function of $\lambda_i \in [0, 5]$ for the Tikhonov and Ridge filters]
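To connect the two views (my addition, not the slides'), the same regularized inverses can be built from the eigendecomposition by replacing $1/\lambda_i$ with the filter factors above; a small NumPy check on made-up data:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 6))
S_hat = A @ A.T / 6                       # an estimated covariance matrix
alpha = 0.1

lam, Q = np.linalg.eigh(S_hat)            # S_hat = Q diag(lam) Q^t

# Filter factors replacing 1/lambda_i
tik = (Q * (lam / (lam**2 + alpha**2))) @ Q.T   # Tikhonov filter
rid = (Q * (1.0 / (lam + alpha**2))) @ Q.T      # Ridge filter

# The Ridge filter matches the closed form (S_hat + alpha^2 I)^{-1}
print(np.allclose(rid, np.linalg.inv(S_hat + alpha**2 * np.eye(6))))  # True
```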

>>> Support Vectors Machines

* Supervised method: $S = \{(\mathbf{x}_i, y_i)\}_{i=1}^{n}$, $\mathbf{x}_i \in \mathbb{R}^d$ and $y_i \in \{-1, 1\}$:
  $h(\mathbf{z}) = \operatorname{sign}\!\big(f(\mathbf{z})\big)$ with $f(\mathbf{z}) = \sum_{i=1}^{n} \alpha_i k(\mathbf{z}, \mathbf{x}_i) + b$
* Hyperparameters $\{\alpha_i\}_{i=1}^{n}$ and $b$ are learned by solving:
  $\min_{\alpha, b} \; \dfrac{1}{2}\|f\|^2 + C \sum_{i=1}^{n} L\big(y_i, f(\mathbf{x}_i)\big)$
* $\|f\|^2 = \sum_{i,j=1}^{n} \alpha_i \alpha_j k(\mathbf{x}_i, \mathbf{x}_j)$
* $L\big(y_i, f(\mathbf{x}_i)\big) = \max\!\big(0, 1 - y_i f(\mathbf{x}_i)\big)$ (hinge loss)

[Figure: margin hyperplanes $f(\mathbf{x}) = 1$, $f(\mathbf{x}) = 0$, $f(\mathbf{x}) = -1$, and the hinge loss $L$ as a function of $y_i f(\mathbf{x}_i)$]

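For context (my addition, not the slides'), a minimal scikit-learn sketch of training such a kernel SVM; the RBF kernel and the toy data are arbitrary choices.

```python
import numpy as np
from sklearn.svm import SVC

# Toy two-class data in R^2 (made up for illustration)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])

# C controls the trade-off between the margin term and the hinge-loss term
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)

print(clf.decision_function(X[:3]))   # f(z)
print(clf.predict(X[:3]))             # sign(f(z))
```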

>>> Support Vectors Machines

* Dual problem:
  $\max_{\alpha}\; g(\alpha) = \sum_{i=1}^{\ell} \alpha_i - \dfrac{1}{2} \sum_{i,j=1}^{\ell} \alpha_i \alpha_j y_i y_j K(\mathbf{x}_i, \mathbf{x}_j)$
  subject to $0 \le \alpha_i \le C$ and $\sum_{i=1}^{\ell} \alpha_i y_i = 0$
* Or, in matrix form:
  $\max_{\alpha}\; g(\alpha) = \alpha^t \mathbf{1} - \dfrac{1}{2}\, \alpha^t \big(K \circ (\mathbf{y}\mathbf{y}^t)\big)\, \alpha$
  subject to $0 \le \alpha \le \mathbf{1} \cdot C$ and $\alpha^t \mathbf{y} = 0$

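As a final illustrative sketch (not from the slides), this dual can be solved with scikit-learn through a precomputed Gram matrix $K$, after which the nonzero $\alpha_i y_i$ of the support vectors can be read back; the data and the RBF kernel are arbitrary choices.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])

K = rbf_kernel(X, X, gamma=0.5)                  # Gram matrix K(x_i, x_j)
clf = SVC(kernel="precomputed", C=1.0).fit(K, y)

alpha_y = clf.dual_coef_.ravel()                 # alpha_i * y_i for the support vectors
print(len(clf.support_), "support vectors")
print(np.isclose(alpha_y.sum(), 0.0))            # equality constraint: sum alpha_i y_i = 0
print(np.all(np.abs(alpha_y) <= 1.0 + 1e-9))     # box constraint: 0 <= alpha_i <= C
```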