International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering
DECOMPOSITION OF A CHEMICAL SPECTRUM USING A MARKED POINT PROCESS AND A CONSTANT DIMENSION MODEL
V. Mazet, D. Brie, J. Idier
CRAN UMR 7039, Nancy University, CNRS, BP 239, 54506 Vandœuvre-lès-Nancy Cedex, France
[email protected]
IRCCyN UMR 6597, École Centrale de Nantes, CNRS, 1 rue de la Noë, BP 92101, 44321 Nantes Cedex 3, France
[email protected]
Introduction

[Figure: example spectrum, intensity (arbitrary unit) versus wavenumber (cm−1)]

Goal: estimating the peak parameters (locations, amplitudes and widths) in a spectrum.
➜ Provide an interpretation for physico-chemists.
➜ Bayesian approach + MCMC method.
MaxEnt 2006, Paris
Summary
Introduction
1. Problem Formulation
2. Model Definition
   – A Constant Dimension Model
   – Prior Distributions
   – Conditional Posterior Distributions
   – Peak Location Simulation
3. Label Switching
4. Application
Conclusion
1. Problem Formulation

Marked point process: a finite set of objects lying in a bounded space, each characterized by its location and some marks.

➜ Blind sparse spike train deconvolution

[Figure: a sparse spike train convolved (⋆) with a peak shape gives the signal]

→ Bernoulli-Gaussian process (a widespread model for sparse spike trains)
Drawbacks:
• common implementations with MCMC methods are not efficient
• peaks are located on discrete positions
• only one peak shape

➜ Decomposition into elementary patterns

[Figure: the signal y written as a sum of elementary patterns plus noise]

$y = \sum_{k=1}^{K} f(n_k, w_k, s_k) + e$
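The decomposition above can be illustrated with a minimal sketch that builds a signal as a sum of elementary patterns. The slides do not specify the pattern f, so the Lorentzian shape used here is an assumption, and the names `peak` and `synthesize` are hypothetical.

```python
import random

def peak(x, n, w, s):
    # Assumed Lorentzian pattern: location n, amplitude w, width s.
    # The slides do not state which shape f is used.
    return w * s**2 / ((x - n)**2 + s**2)

def synthesize(N, peaks, noise_std=0.0, rng=None):
    # y[i] = sum_k f(n_k, w_k, s_k) evaluated at x = i + 1, plus Gaussian noise e
    rng = rng or random.Random(0)
    y = []
    for x in range(1, N + 1):
        clean = sum(peak(x, n, w, s) for (n, w, s) in peaks)
        y.append(clean + rng.gauss(0.0, noise_std))
    return y

# Locations are real-valued: peaks need not sit on the sampling grid.
peaks = [(30.5, 2.0, 4.0), (70.25, 1.0, 6.0)]
y = synthesize(100, peaks)
```

Because each pattern carries its own width, peaks of different shapes can coexist in the same signal, which a single-kernel deconvolution model does not allow.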
2. Proposed Model
2.1 A Constant Dimension Model

$y = \sum_{k=1}^{K} f(n_k, w_k, s_k) + e$

Problem: the number of peaks K is unknown ⇒ the system order is likely to change!
➜ MCMC techniques for model uncertainty (RJMCMC algorithm, ...)
➜ Constant Dimension Model: the number of peaks is set to a constant $K_{\max}$ (an upper bound fixed by the user).

Bernoulli-Gaussian model → $q \sim \mathrm{Ber}(\lambda)$ codes the peak occurrences:
• $q_k = 1$: the kth peak is present (at $n_k$, with amplitude $w_k$ and width $s_k$)
• $q_k = 0$: the kth peak is not present

⇒ $y = \sum_{k=1}^{K_{\max}} f(n_k, w_k, s_k) + e$

➜ fewer variables than in a common BG implementation ($3K_{\max}$ vs. $N$).
➜ allows the use of a Gibbs sampler.
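As a rough illustration of the constant dimension idea, the sketch below keeps a fixed number of slots and lets the indicator q_k switch peaks on and off. The Lorentzian shape, the slot layout `(q, n, w, s)` and the numerical values are assumptions made for the example only.

```python
K_MAX = 5  # upper bound on the number of peaks, chosen by the user

def peak(x, n, w, s):
    # Assumed Lorentzian pattern (f is not specified in the slides)
    return w * s**2 / ((x - n)**2 + s**2)

# One slot per potential peak: (q_k, n_k, w_k, s_k).
# An absent peak (q_k = 0) has zero amplitude.
slots = [
    (1, 30.5, 2.0, 4.0),
    (0, 12.0, 0.0, 5.0),
    (1, 70.25, 1.0, 6.0),
    (0, 88.0, 0.0, 7.0),
    (0, 55.0, 0.0, 6.5),
]

def model(x, slots):
    # Sum over all K_MAX slots; absent peaks contribute nothing
    return sum(peak(x, n, w, s) for (_, n, w, s) in slots)

def model_present_only(x, slots):
    # Sum restricted to the peaks that are actually present
    return sum(peak(x, n, w, s) for (q, n, w, s) in slots if q == 1)

K = sum(q for (q, _, _, _) in slots)   # current number of present peaks
n_unknowns = 3 * K_MAX                 # locations, amplitudes and widths
```

The parameter vector always has the same length whatever the value of K, which is what makes a plain Gibbs sampler applicable.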
2. Proposed Model
2.2 Prior distributions

Noise: white, Gaussian and i.i.d.
  $e \sim \mathcal{N}(0, r_e I)$
Peak Location: uniformly distributed on [1, N]
  $n_k \sim \mathcal{U}_{[1,N]}$
Peak Amplitude: BG process + positive amplitudes
  $q_k \sim \mathrm{Ber}(\lambda)$
  $w_k | q_k \sim \delta_0(w_k)$ if $q_k = 0$
  $w_k | q_k \sim \mathcal{N}^+(0, r_w)$ if $q_k = 1$
Peak Width: inverse gamma with mean 6 cm−1 and variance 2.5 cm−1
  $s_k \sim \mathcal{IG}(\alpha_s, \beta_s)$
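The priors above can be sampled with a short sketch. The slides give the mean and variance of the width prior but not $(\alpha_s, \beta_s)$; the values below are obtained by moment matching, which is an assumption about how they were chosen. For an inverse gamma law, mean = β/(α − 1) and variance = mean²/(α − 2), so α = 2 + mean²/variance and β = mean (α − 1). The function names and the values of `lam` and `r_w` in the usage line are hypothetical.

```python
import math
import random

def ig_params(mean, var):
    # Moment matching for an inverse gamma distribution (requires alpha > 2)
    alpha = 2.0 + mean**2 / var
    beta = mean * (alpha - 1.0)
    return alpha, beta

def sample_inverse_gamma(alpha, beta, rng):
    # If X ~ Gamma(shape=alpha, scale=1/beta), then 1/X ~ IG(alpha, beta)
    return 1.0 / rng.gammavariate(alpha, 1.0 / beta)

def sample_prior(k_max, N, lam, r_w, alpha_s, beta_s, rng):
    slots = []
    for _ in range(k_max):
        n = rng.uniform(1.0, N)                  # n_k ~ U[1, N]
        q = 1 if rng.random() < lam else 0       # q_k ~ Ber(lambda)
        # w_k is 0 when absent, half-normal N+(0, r_w) when present
        w = abs(rng.gauss(0.0, math.sqrt(r_w))) if q == 1 else 0.0
        s = sample_inverse_gamma(alpha_s, beta_s, rng)  # s_k ~ IG(alpha_s, beta_s)
        slots.append((q, n, w, s))
    return slots

alpha_s, beta_s = ig_params(6.0, 2.5)   # roughly alpha_s = 16.4, beta_s = 92.4
rng = random.Random(1)
slots = sample_prior(10, 500, 0.3, 10.0, alpha_s, beta_s, rng)
```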
2. Proposed Model
2.2 Prior distributions

➜ Hyperparameters:

Bernoulli parameter: conjugate prior chosen to penalize high values
  $\lambda \sim \mathcal{B}e(1, K_{\max} + 1)$
Peak Amplitude Variance: conjugate prior, as uninformative as possible
  $r_w \sim \mathcal{IG}(\alpha_w, \beta_w)$
Noise variance: Jeffreys prior
  $p(r_e) \propto 1/r_e$
2. Proposed Model
2.3 Conditional Posterior distributions

Peak Location:
  $n_k | \ldots \propto \exp\left( -\frac{1}{2 r_e} \left\| y - \sum_{l=1}^{K_{\max}} f(n_l, w_l, s_l) \right\|^2 \right) \mathbb{1}_{[1,N]}(n_k)$
Peak Amplitude:
  $q_k | \ldots \sim \mathrm{Ber}(\lambda_k)$
  $w_k | \ldots \sim \delta_0(w_k)$ if $q_k = 0$
  $w_k | \ldots \sim \mathcal{N}^+(\mu_k, \rho_k)$ if $q_k = 1$
Peak Width:
  $s_k | \ldots \propto \exp\left( -\frac{1}{2 r_e} \left\| y - \sum_{l=1}^{K_{\max}} f(n_l, w_l, s_l) \right\|^2 - \frac{\beta_s}{s_k} \right) \frac{1}{s_k^{\alpha_s + 1}} \mathbb{1}_{\mathbb{R}^+}(s_k)$
Bernoulli parameter:
  $\lambda | \ldots \sim \mathcal{B}e(K + 1, 2K_{\max} - K + 1)$
Peak Amplitude Variance:
  $r_w | \ldots \sim \mathcal{IG}\left( \frac{K}{2} + \alpha_w, \frac{w^T w}{2} + \beta_w \right)$
Noise variance:
  $r_e | \ldots \sim \mathcal{IG}\left( \frac{N}{2}, \frac{1}{2} \left\| y - \sum_{l=1}^{K_{\max}} f(n_l, w_l, s_l) \right\|^2 \right)$
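The three hyperparameter conditionals have standard forms and can be drawn directly. The sketch below is one possible way to do so with the Python standard library; the function names are hypothetical, and the values of $\alpha_w$ and $\beta_w$ are not given in the slides, so they are left as arguments.

```python
import random

def sample_inverse_gamma(alpha, beta, rng):
    # If X ~ Gamma(shape=alpha, scale=1/beta), then 1/X ~ IG(alpha, beta)
    return 1.0 / rng.gammavariate(alpha, 1.0 / beta)

def update_lambda(q, rng):
    # lambda | ... ~ Be(K + 1, 2*Kmax - K + 1), K = number of present peaks
    k_max = len(q)
    K = sum(q)
    return rng.betavariate(K + 1, 2 * k_max - K + 1)

def update_r_w(w, q, alpha_w, beta_w, rng):
    # r_w | ... ~ IG(K/2 + alpha_w, w'w/2 + beta_w)
    K = sum(q)
    wtw = sum(wk * wk for wk in w)
    return sample_inverse_gamma(K / 2.0 + alpha_w, wtw / 2.0 + beta_w, rng)

def update_r_e(residual, rng):
    # r_e | ... ~ IG(N/2, ||y - sum_l f(n_l, w_l, s_l)||^2 / 2)
    # `residual` holds the N values of y minus the current model
    N = len(residual)
    sse = sum(r * r for r in residual)
    return sample_inverse_gamma(N / 2.0, sse / 2.0, rng)
```

The Beta parameters follow from combining the Be(1, Kmax + 1) prior with K successes among Kmax Bernoulli draws: 1 + K and (Kmax + 1) + (Kmax − K).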
2. Proposed Model
2.4 Peak Location Simulation

$n_k | \ldots \propto \exp\left( -\frac{1}{2 r_e} \left\| y - \sum_{l=1}^{K_{\max}} f(n_l, w_l, s_l) \right\|^2 \right) \mathbb{1}_{[1,N]}(n_k)$

Metropolis-Hastings algorithm → which proposal distribution?
➜ If the peak is present ($q_k = 1$), define its location precisely: $\mathcal{N}_{[1,N]}(n_k^{(i-1)}, r_n)$
➜ If the peak is absent ($q_k = 0$), explore the entire space: $\mathcal{U}_{[1,N]}$

⇒ $q(\tilde{n}_k) = \delta_0(q_k) \, \mathcal{U}_{[1,N]} + \delta_1(q_k) \, \mathcal{N}_{[1,N]}(n_k^{(i-1)}, r_n)$.
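A minimal sketch of one Metropolis-Hastings update with this mixed proposal is given below. Several points are assumptions: the Lorentzian shape for f, the reading of $r_n$ as a variance, and the rejection loop used to draw from the truncated normal. Since a normal law truncated to [1, N] is not a symmetric proposal, the sketch includes the ratio of truncation masses in the acceptance test. All function names are hypothetical.

```python
import math
import random

def peak(x, n, w, s):
    # Assumed Lorentzian pattern (f is not specified in the slides)
    return w * s**2 / ((x - n)**2 + s**2)

def log_target(n_k, k, slots, y, r_e):
    # -||y - sum_l f(n_l, w_l, s_l)||^2 / (2 r_e), with slot k moved to n_k
    sse = 0.0
    for i, y_i in enumerate(y):
        x = i + 1
        fit = 0.0
        for l, (q, n, w, s) in enumerate(slots):
            fit += peak(x, n_k if l == k else n, w, s)
        sse += (y_i - fit) ** 2
    return -sse / (2.0 * r_e)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def trunc_mass(center, sigma, N):
    # Probability that N(center, sigma^2) falls inside [1, N]
    return norm_cdf((N - center) / sigma) - norm_cdf((1.0 - center) / sigma)

def mh_location_step(k, slots, y, r_e, r_n, rng):
    # Updates slots[k] in place; returns True if the proposal was accepted
    q, n_old, w, s = slots[k]
    N = len(y)
    sigma = math.sqrt(r_n)
    if q == 0:
        # Absent peak: independent uniform proposal over the whole support
        n_new = rng.uniform(1.0, N)
        log_hastings = 0.0
    else:
        # Present peak: normal centred on the current value, truncated to [1, N]
        while True:
            n_new = rng.gauss(n_old, sigma)
            if 1.0 <= n_new <= N:
                break
        log_hastings = (math.log(trunc_mass(n_old, sigma, N))
                        - math.log(trunc_mass(n_new, sigma, N)))
    log_ratio = (log_target(n_new, k, slots, y, r_e)
                 - log_target(n_old, k, slots, y, r_e) + log_hastings)
    if math.log(1.0 - rng.random()) <= log_ratio:
        slots[k] = (q, n_new, w, s)
        return True
    return False
```

When the peak is absent its amplitude is zero, so the target does not depend on the location and the uniform proposal is always accepted; this lets absent peaks roam the whole axis until they are switched on.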
3. Label Switching

The label switching problem is due to two phenomena:
• the posterior is the same for all permutations of k:
  $p(\theta_1, \theta_2, \theta_3 | y) = p(\theta_2, \theta_3, \theta_1 | y)$
• the Gibbs sampler is able to explore the k! possible permutations

[Figure: θ versus iterations; resulting estimates $\hat{\theta}_1 = 4.26$, $\hat{\theta}_2 = 4.34$, $\hat{\theta}_3 = 2.41$]
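Both phenomena can be reproduced on a toy example. The sketch below first checks that the data misfit is unchanged when the peaks are reordered, then shows how averaging a chain whose labels have switched gives estimates that match no actual peak. The Lorentzian shape and all numerical values are assumptions made for the illustration.

```python
import itertools

def peak(x, n, w, s):
    # Assumed Lorentzian pattern (f is not specified in the slides)
    return w * s**2 / ((x - n)**2 + s**2)

def sse(y, peaks):
    # Squared misfit ||y - sum_k f(n_k, w_k, s_k)||^2
    total = 0.0
    for i, y_i in enumerate(y):
        fit = sum(peak(i + 1, n, w, s) for (n, w, s) in peaks)
        total += (y_i - fit) ** 2
    return total

peaks = [(20.0, 1.0, 3.0), (50.0, 2.0, 5.0), (75.0, 0.5, 4.0)]
y = [sum(peak(x, n, w, s) for (n, w, s) in peaks) for x in range(1, 101)]

# 1) The misfit, hence the likelihood, is the same for all 3! orderings
costs = [sse(y, list(p)) for p in itertools.permutations(peaks)]

# 2) A chain whose two labels swap halfway through: the per-label means
#    fall between the true locations 20 and 50
chain = [(20.0, 50.0)] * 50 + [(50.0, 20.0)] * 50
mean_n1 = sum(n1 for (n1, _) in chain) / len(chain)
mean_n2 = sum(n2 for (_, n2) in chain) / len(chain)
```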
3. Label Switching

Proposed Method
Minimizing the following cost function (see [Stephens 1997]):

$L_0(n, w, s, \mu_n, \rho_n, \mu_w, \rho_w, \mu_s, \rho_s) = -\ln \left[ \prod_{k=1}^{K_{\max}} \mathcal{N}(n_k | \mu_{n_k}, \rho_{n_k}) \, \mathcal{N}(w_k | \mu_{w_k}, \rho_{w_k}) \, \mathcal{N}(s_k | \mu_{s_k}, \rho_{s_k}) \right]$

Major differences from general relabelling algorithms:
• the initialization is obtained by selecting the maximum in the histogram of $(\mu_n, \mu_w, \mu_s)$ (closer to the global optimum than a simple identity permutation)
• the $(n_l, w_l, s_l)$ are relabelled one after the other (no permutation)
• the fact that the number of peaks is expected to change is taken into account
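The cost function can be evaluated as in the sketch below, assuming that each ρ denotes a variance and that one sample is stored as a list of triples (n_k, w_k, s_k). The names `cost_L0`, `in_order` and `swapped`, and all numerical values, are hypothetical.

```python
import math

def log_normal_pdf(x, mu, rho):
    # Log density of N(mu, rho), with rho read as a variance
    return -0.5 * math.log(2.0 * math.pi * rho) - (x - mu) ** 2 / (2.0 * rho)

def cost_L0(sample, mu, rho):
    # sample, mu, rho: one (location, amplitude, width) triple per label k
    total = 0.0
    for theta_k, mu_k, rho_k in zip(sample, mu, rho):
        for x, m, r in zip(theta_k, mu_k, rho_k):
            total += log_normal_pdf(x, m, r)
    return -total

mu = [(20.0, 1.0, 3.0), (50.0, 2.0, 5.0)]
rho = [(1.0, 0.1, 0.5), (1.0, 0.1, 0.5)]

in_order = [(20.3, 1.1, 3.2), (49.6, 1.9, 5.1)]
swapped = in_order[::-1]

cost_ok = cost_L0(in_order, mu, rho)
cost_swapped = cost_L0(swapped, mu, rho)
```

A labelling that agrees with the current $(\mu, \rho)$ yields a much lower cost than the swapped one, which is what the relabelling step exploits.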
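3. Label Switching

[Flowchart of the relabelling algorithm]
• estimate $\hat{K}_{\mathrm{MMAP}}$
• for $l = 1, \ldots, \hat{K}_{\mathrm{MMAP}}$:
  – histogram → $\theta = (\mu_n, \mu_w, \mu_s)$
  – while the selection changes:
    for $i = 1, \ldots, I$: selection of $\theta_l^{(i)}$
    update $(\mu_\theta, \rho_\theta)$
  – $\hat{\theta} = \mu_\theta$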
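The loop for a single label can be sketched as follows, in a deliberately simplified form. It handles only the scalar location, whereas the slides use the full triple (n, w, s), and it assumes that the peak being relabelled appears in every sample. With a single Gaussian term, minimizing the cost over the candidates reduces to picking the value closest to the current mean. The function names and the data are hypothetical.

```python
def histogram_mode(values, bin_width):
    # Centre of the most populated histogram bin, used as initialization
    counts = {}
    for v in values:
        b = int(v // bin_width)
        counts[b] = counts.get(b, 0) + 1
    best = max(counts, key=counts.get)
    return (best + 0.5) * bin_width

def relabel_one(samples, mu, max_sweeps=50):
    # samples[i]: unordered peak locations at iteration i
    selection, rho = None, None
    for _ in range(max_sweeps):
        # Selection step: in each sample, pick the value closest to mu
        new_selection = [
            min(range(len(s)), key=lambda j: (s[j] - mu) ** 2) for s in samples
        ]
        if new_selection == selection:
            break  # the selection no longer changes
        selection = new_selection
        chosen = [s[j] for s, j in zip(samples, selection)]
        # Update step: refresh the mean and variance of the selected values
        mu = sum(chosen) / len(chosen)
        rho = sum((c - mu) ** 2 for c in chosen) / len(chosen)
    return mu, rho, selection

# The number of peaks differs between iterations, and labels are mixed up
samples = [[21.0, 52.0], [52.5, 20.5], [21.5], [20.8, 51.7]]
all_values = [v for s in samples for v in s]
mu0 = histogram_mode(all_values, 5.0)
mu_hat, rho_hat, selection = relabel_one(samples, mu0)
```

Only one value is picked per sample, so no permutation of the whole label set is ever enumerated.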
4. Application

[Figure: Raman spectrum of gibbsite Al(OH)3, intensity (arbitrary unit) versus wavenumber (cm−1)]

➜ 10,000 iterations (burn-in period of 5,000 iterations).
➜ Initialization: spectrum with no peak, $\lambda^{(0)} = 0.5$, $r_w^{(0)} = 10$, $r_e^{(0)} = 0.1$.
Conclusion
➜ Signal decomposition into elementary patterns (marked point process)
Alternative to blind sparse spike train deconvolution
• more efficient than a common implementation with the BG model
• peaks located on a continuous space
• peaks with different shapes
➜ Constant dimension model
Alternative to RJMCMC
➜ New method for label switching
• initialization close to the global optimum using a histogram
• relabelling with no permutation
• the number of variables may change