Signal and Image Processing in Remote Sensing ... - Mathieu Fauvel
>>> Maximum A Posteriori

? MAP classification: assign the pixel to the class with the highest posterior probability:
  $\hat{y} = \arg\max_{c=1,\dots,C} p(y_c|x)$
? Bayes' rule:
  $p(y_c|x) = p(x|y_c)p(y_c)/p(x)$
? Since $p(x)$ does not depend on the class, the decision rule becomes:
  $\arg\max_{c=1,\dots,C} p(x|y_c)p(y_c)$
? A Gaussian model is conventionally used:
  $p(x|y_c) = (2\pi)^{-d/2}|\Sigma_c|^{-1/2}\exp\big(-0.5\,(x-\mu_c)^t\Sigma_c^{-1}(x-\mu_c)\big)$
? $p(y_c)$ is usually approximated by the uniform distribution, i.e., $p(y_c) = 1/C$, or by the proportion of each class in the training set, $p(y_c) = n_c/n$.
? Taking the logarithm and multiplying by $-2$, the decision rule becomes: assign $x$ to the class that minimizes
  $k_c(x) = (x-\mu_c)^t\Sigma_c^{-1}(x-\mu_c) + \ln(|\Sigma_c|) - 2\ln(p(y_c))$
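As an illustration, here is a minimal NumPy sketch of this Gaussian MAP rule (a quadratic discriminant); the function and variable names are mine, not from the slides, and the priors are taken as the class proportions $n_c/n$.

```python
import numpy as np

def fit_gaussian_map(X, y):
    """Estimate per-class means, covariances and priors p(y_c) = n_c / n."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        Sigma = np.cov(Xc, rowvar=False, bias=True)  # ML estimate, 1/n_c normalization
        params[c] = (mu, Sigma, len(Xc) / len(X))
    return params

def predict_map(x, params):
    """Assign x to the class minimizing k_c(x)."""
    scores = {}
    for c, (mu, Sigma, prior) in params.items():
        diff = x - mu
        scores[c] = (diff @ np.linalg.solve(Sigma, diff)
                     + np.log(np.linalg.det(Sigma)) - 2 * np.log(prior))
    return min(scores, key=scores.get)
```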
>>> Estimation of the parameters

? Maximization of the log-likelihood:
  $\ell = -2\ln(L) \propto n\ln\det(\Sigma) + \sum_{i=1}^{n}(x_i-\mu)^t\Sigma^{-1}(x_i-\mu)$
? Derivative w.r.t. $\mu$:
  $\frac{\partial \ell}{\partial \mu} \propto \Sigma^{-1}\sum_{i=1}^{n}(x_i-\mu) \;\Rightarrow\; \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n}x_i$
? Derivative w.r.t. $\Sigma$:
  $\frac{\partial \ell}{\partial \Sigma} \propto n\Sigma^{-1} - \sum_{i=1}^{n}\Sigma^{-1}(x_i-\mu)(x_i-\mu)^t\Sigma^{-1} \;\Rightarrow\; \hat{\Sigma}_c = \frac{1}{n_c}\sum_{i:\,y_i=c}(x_i-\hat{\mu}_c)(x_i-\hat{\mu}_c)^t$
Details of the matrix derivatives can be found in The Matrix Cookbook: http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=3274
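A short NumPy sketch of these closed-form estimates (illustrative names; the rows of X are the samples of one class):

```python
import numpy as np

def ml_estimates(X):
    """Closed-form ML estimates for a Gaussian model from samples X (n x d)."""
    n = X.shape[0]
    mu_hat = X.mean(axis=0)                  # (1/n) * sum_i x_i
    centered = X - mu_hat
    Sigma_hat = (centered.T @ centered) / n  # (1/n) * sum_i (x_i - mu)(x_i - mu)^t
    return mu_hat, Sigma_hat
```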
>>> Covariance matrix inversion
? Number of parameters to estimate for a $d$-dimensional Gaussian distribution: $d$ for the mean and, the covariance matrix being symmetric, $d(d+1)/2$ for the covariance, so $d(d+3)/2$ in total:
  $\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \dots & \sigma_{1d} \\ & \sigma_{22} & \dots & \sigma_{2d} \\ & & \ddots & \vdots \\ & & & \sigma_{dd} \end{pmatrix}$
? Eigendecomposition with $Q$ orthogonal ($QQ^t = I$): $\Sigma = Q\Lambda Q^t$, hence
  $\Sigma^{-1} = Q\Lambda^{-1}Q^t = \sum_{i=1}^{d}\frac{1}{\lambda_i}\,q_i q_i^t$
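A minimal NumPy check of this inversion identity (the test matrix is an arbitrary symmetric positive-definite example):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
Sigma = A @ A.T + 5 * np.eye(5)   # symmetric positive definite

lam, Q = np.linalg.eigh(Sigma)    # Sigma = Q diag(lam) Q^t
Sigma_inv = (Q / lam) @ Q.T       # sum_i (1/lambda_i) q_i q_i^t

assert np.allclose(Sigma_inv, np.linalg.inv(Sigma))
```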
>>> Tikhonov regularization

? Problem: find $A$ such that $A\hat{\Sigma} = I$, with the smoothness condition that a small perturbation $\hat{\Sigma} + \epsilon$ leads to a small perturbation $A + \epsilon'$.
? Minimization problem with penalization of non-smooth solutions:
  $\hat{A} = \arg\min_A f(A)$ with $f(A) = \|\hat{\Sigma}A - I\|^2 + \|\Gamma A\|^2$
? Computing the derivative of $f$ w.r.t. $A$:
  $\frac{\partial f}{\partial A} \propto \hat{\Sigma}^t(\hat{\Sigma}A - I) + \Gamma^t\Gamma A$
? At the optimum the derivative vanishes:
  $\hat{A} = (\hat{\Sigma}^t\hat{\Sigma} + \Gamma^t\Gamma)^{-1}\hat{\Sigma}^t$
? Tikhonov: $\Gamma = \alpha I \;\Rightarrow\; \hat{A} = (\hat{\Sigma}^2 + \alpha^2 I)^{-1}\hat{\Sigma}$
? Ridge: $\Gamma = \alpha\hat{\Sigma}^{1/2} \;\Rightarrow\; \hat{A} = (\hat{\Sigma} + \alpha^2 I)^{-1}$
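A brief NumPy sketch of the two regularized inverses (function names are mine):

```python
import numpy as np

def tikhonov_inverse(Sigma, alpha):
    """A_hat = (Sigma^2 + alpha^2 I)^{-1} Sigma, i.e. Gamma = alpha * I."""
    d = Sigma.shape[0]
    return np.linalg.solve(Sigma @ Sigma + alpha**2 * np.eye(d), Sigma)

def ridge_inverse(Sigma, alpha):
    """A_hat = (Sigma + alpha^2 I)^{-1}, i.e. Gamma = alpha * Sigma^{1/2}."""
    d = Sigma.shape[0]
    return np.linalg.inv(Sigma + alpha**2 * np.eye(d))
```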
>>> Tikhonov regularization

? Tikhonov: the eigenvalues of the regularized inverse are $\tilde{\lambda}^{-1} = \frac{\lambda}{\lambda^2 + \alpha^2}$
? Ridge: $\tilde{\lambda}^{-1} = \frac{1}{\lambda + \alpha^2}$

[Figure: $\tilde{\lambda}^{-1}$ as a function of $\lambda \in [0, 5]$ for the Tikhonov and Ridge filters; unlike $1/\lambda$, both remain bounded as $\lambda \to 0$.]
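A tiny numeric illustration of these filter factors (the eigenvalues and $\alpha$ are arbitrary):

```python
import numpy as np

lam = np.array([0.01, 0.1, 1.0, 5.0])  # eigenvalues of Sigma_hat
alpha = 0.5

plain = 1 / lam                        # unregularized inverse blows up for small lambda
tikhonov = lam / (lam**2 + alpha**2)   # stays bounded near lambda = 0
ridge = 1 / (lam + alpha**2)

print(plain)     # [100.    10.     1.     0.2  ]
print(tikhonov)  # [~0.04  ~0.38   0.8   ~0.198]
print(ridge)     # [~3.85  ~2.86   0.8   ~0.19 ]
```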
>>> Support Vector Machines

? Supervised method: $S = \{(x_i, y_i)\}_{i=1}^{n}$, with $x_i \in \mathbb{R}^d$ and $y_i \in \{-1, 1\}$
? Decision function: $h(z) = \operatorname{sign}\big(f(z)\big)$ with
  $f(z) = \sum_{i=1}^{n}\alpha_i k(z, x_i) + b$
? $\|f\|^2 = \sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j k(x_i, x_j)$
? Hyperparameters $\{\alpha_i\}_{i=1}^{n}$ and $b$ are learned by solving:
  $\min_{\alpha, b}\ \|f\|^2 + \frac{1}{C}\sum_{i=1}^{n}L\big(y_i, f(x_i)\big)$
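A minimal sketch of this kernel decision function (assuming the $\alpha_i$ and $b$ have already been obtained from some solver; the RBF kernel and all names are illustrative):

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """k(a, b) = exp(-gamma * ||a - b||^2)."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def predict(z, X, alpha, b, kernel=rbf_kernel):
    """h(z) = sign(f(z)) with f(z) = sum_i alpha_i k(z, x_i) + b."""
    f = sum(a_i * kernel(z, x_i) for a_i, x_i in zip(alpha, X)) + b
    return np.sign(f)
```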