decomposition of a chemical spectrum using a marked point process


International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering

DECOMPOSITION OF A CHEMICAL SPECTRUM USING A MARKED POINT PROCESS AND A CONSTANT DIMENSION MODEL

V. Mazet, D. Brie, J. Idier

CRAN UMR 7039, Nancy University, CNRS, BP 239, 54506 Vandœuvre-lès-Nancy Cedex, France [email protected]

IRCCyN UMR 6597, École Centrale de Nantes, CNRS, 1 rue de la Noë, BP 92101, 44321 Nantes Cedex 3, France [email protected]

Introduction

[Figure: example spectrum, intensity (arbitrary unit) versus wavenumber (cm−1).]

Goal: estimating the peak parameters (locations, amplitudes and widths) in a spectrum.
➜ Provide an interpretation for physico-chemists.
➜ Bayesian approach + MCMC method.

MaxEnt 2006, Paris


Summary

Introduction
1. Problem Formulation
2. Model Definition
– A Constant Dimension Model
– Prior Distributions
– Conditional Posterior Distributions
– Peak Location Simulation
3. Label Switching
4. Application
Conclusion


1. Problem Formulation

Marked point process: finite set of objects lying in a bounded space and characterized by their locations and some marks.

➜ Blind sparse spike train deconvolution
[Figure: illustration of the convolution (⋆) that produces the spectrum.]
→ Bernoulli-Gaussian process (widespread model for sparse spike trains)
Drawbacks:
• common implementation with MCMC methods not efficient
• peaks located on discrete positions
• one peak shape

➜ Decomposition into elementary patterns
[Figure: the spectrum written as a sum of elementary peaks.]
→ $y = \sum_{k=1}^{K} f(n_k, w_k, s_k) + e$

2. Proposed Model
2.1 A Constant Dimension Model

$y = \sum_{k=1}^{K} f(n_k, w_k, s_k) + e$

Problem: peak number unknown ⇒ system order likely to change!
➜ MCMC techniques for model uncertainty (RJMCMC algorithm, ...)
➜ Constant Dimension Model: the peak number is set to a constant Kmax (upper bound fixed by the user).
Bernoulli-Gaussian model → q ∼ Ber(λ) codes the peak occurrences:
• qk = 1: the kth peak is present (at nk with amplitude wk and width sk)
• qk = 0: the kth peak is not present
⇒ $y = \sum_{k=1}^{K_{\max}} f(n_k, w_k, s_k) + e$

➜ fewer variables than in a common BG implementation (3Kmax vs. N).
➜ allows the use of a Gibbs sampler
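The constant dimension model can be illustrated with a minimal sketch. The slides do not specify the peak shape f, so the Lorentzian used here is an assumption, as are the function names `lorentzian` and `synthesize` and all numerical values. The point shown is that absent peaks (qk = 0) have zero amplitude and therefore leave the sum over Kmax terms equal to the sum over the present peaks only.

```python
import numpy as np

def lorentzian(x, n, w, s):
    """Assumed peak shape: Lorentzian with location n, amplitude w, width s."""
    return w * s**2 / ((x - n) ** 2 + s**2)

def synthesize(x, n, w, s, q):
    """Noise-free model: sum over the K_max candidate peaks.

    Absent peaks (q_k = 0) get zero amplitude, so they add nothing.
    """
    y = np.zeros_like(x, dtype=float)
    for nk, wk, sk, qk in zip(n, w, s, q):
        y += lorentzian(x, nk, wk * qk, sk)
    return y

x = np.arange(1, 201)                  # N = 200 sample positions (hypothetical)
n = np.array([50.0, 120.0, 170.0])     # K_max = 3 candidate locations
w = np.array([10.0, 4.0, 7.0])
s = np.array([6.0, 5.0, 8.0])
q = np.array([1, 0, 1])                # the second peak is absent

y_full = synthesize(x, n, w, s, q)
present = q == 1
y_present = synthesize(x, n[present], w[present], s[present], q[present])
```

The dimension of the parameter vector stays at 3Kmax whatever the number of present peaks, which is what lets a fixed-dimension sampler be used.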

2. Proposed Model
2.2 Prior distributions

Noise: white, Gaussian and i.i.d.
$e \sim \mathcal{N}(0, r_e I)$

Peak Location: uniformly distributed on [1, N]
$n_k \sim \mathcal{U}_{[1,N]}$

Peak Amplitude: BG process + positive amplitudes
$q_k \sim \mathrm{Ber}(\lambda), \qquad w_k \mid q_k \sim \begin{cases} \delta_0(w_k) & \text{if } q_k = 0 \\ \mathcal{N}^+(0, r_w) & \text{if } q_k = 1 \end{cases}$

Peak Width: inverse gamma with mean 6 cm−1 and variance 2.5 cm−1
$s_k \sim \mathcal{IG}(\alpha_s, \beta_s)$
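As a worked example of these priors, the sketch below converts the stated mean and variance of the width prior into inverse gamma parameters and draws one configuration from the priors. The helper names, the sample size N = 1000, and the values of λ and rw are assumptions chosen for illustration. The numbers 6 and 2.5 are taken from the slide and used as given.

```python
import numpy as np

def inverse_gamma_params(mean, var):
    """Shape and scale of an inverse gamma with the given mean and variance.

    Uses mean = beta / (alpha - 1) and var = mean**2 / (alpha - 2).
    """
    alpha = mean**2 / var + 2.0
    beta = mean * (alpha - 1.0)
    return alpha, beta

def sample_prior(k_max, n_points, lam, r_w, alpha_s, beta_s, rng):
    """Draw one configuration (n, q, w, s) from the priors."""
    n = rng.uniform(1.0, n_points, size=k_max)        # n_k ~ U[1, N]
    q = rng.binomial(1, lam, size=k_max)              # q_k ~ Ber(lambda)
    # Folding a zero-mean normal gives N+(0, r_w); absent peaks get w_k = 0.
    w = q * np.abs(rng.normal(0.0, np.sqrt(r_w), size=k_max))
    s = beta_s / rng.gamma(alpha_s, 1.0, size=k_max)  # s_k ~ IG(alpha_s, beta_s)
    return n, q, w, s

alpha_s, beta_s = inverse_gamma_params(mean=6.0, var=2.5)  # 16.4 and 92.4
rng = np.random.default_rng(0)
n, q, w, s = sample_prior(k_max=10, n_points=1000, lam=0.3, r_w=10.0,
                          alpha_s=alpha_s, beta_s=beta_s, rng=rng)
```

With mean 6 and variance 2.5, the shape is 36/2.5 + 2 = 16.4 and the scale is 6 × 15.4 = 92.4.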

2. Proposed Model
2.2 Prior distributions

➜ Hyperparameters:

Bernoulli parameter: conjugate prior to penalize high values
$\lambda \sim \mathcal{B}e(1, K_{\max} + 1)$

Peak Amplitude Variance: conjugate prior, as uninformative as possible
$r_w \sim \mathcal{IG}(\alpha_w, \beta_w)$

Noise variance: Jeffreys prior
$p(r_e) \propto 1/r_e$

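2. Proposed Model
2.3 Conditional Posterior distributions

Peak Location:
$n_k \mid \ldots \sim \exp\left( -\frac{1}{2 r_e} \left\| y - \sum_{l=1}^{K_{\max}} f(n_l, w_l, s_l) \right\|^2 \right) \mathbb{1}_{[1,N]}(n_k)$

Peak Amplitude:
$q_k \mid \ldots \sim \mathrm{Ber}(\lambda_k), \qquad w_k \mid \ldots \sim \begin{cases} \delta_0(w_k) & \text{if } q_k = 0 \\ \mathcal{N}^+(\mu_k, \rho_k) & \text{if } q_k = 1 \end{cases}$

Peak Width:
$s_k \mid \ldots \sim \exp\left( -\frac{1}{2 r_e} \left\| y - \sum_{l=1}^{K_{\max}} f(n_l, w_l, s_l) \right\|^2 - \frac{\beta_s}{s_k} \right) \frac{1}{s_k^{\alpha_s + 1}} \mathbb{1}_{\mathbb{R}^+}(s_k)$

Bernoulli parameter:
$\lambda \mid \ldots \sim \mathcal{B}e(K + 1,\ 2K_{\max} - K + 1)$

Peak Amplitude Variance:
$r_w \mid \ldots \sim \mathcal{IG}\left( \frac{K}{2} + \alpha_w,\ \frac{w^T w}{2} + \beta_w \right)$

Noise variance:
$r_e \mid \ldots \sim \mathcal{IG}\left( \frac{N}{2},\ \frac{1}{2} \left\| y - \sum_{l=1}^{K_{\max}} f(n_l, w_l, s_l) \right\|^2 \right)$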
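The three hyperparameter conditionals are standard distributions and can be sampled directly. The following is a minimal sketch of that part of a Gibbs sweep, assuming K is the number of present peaks (the sum of the qk); the function names and the values αw = βw = 1 are hypothetical. The conditionals for qk and wk are not sketched because the slides do not give λk, μk and ρk.

```python
import numpy as np

def sample_inverse_gamma(shape, scale, rng):
    """Draw from IG(shape, scale) as scale / Gamma(shape, 1)."""
    return scale / rng.gamma(shape, 1.0)

def update_hyperparameters(q, w, residual, alpha_w, beta_w, rng):
    """One Gibbs update of (lambda, r_w, r_e) given the current peaks.

    q, w     : occurrence indicators and amplitudes of the K_max peaks
    residual : y minus the sum of the K_max peaks
    """
    k_max = len(q)
    k = int(np.sum(q))                       # assumed meaning of K: present peaks
    lam = rng.beta(k + 1, 2 * k_max - k + 1)
    r_w = sample_inverse_gamma(k / 2 + alpha_w, w @ w / 2 + beta_w, rng)
    r_e = sample_inverse_gamma(len(residual) / 2, residual @ residual / 2, rng)
    return lam, r_w, r_e

rng = np.random.default_rng(1)
q = np.array([1, 0, 1, 0, 0])
w = np.array([3.0, 0.0, 1.5, 0.0, 0.0])
residual = rng.normal(0.0, 0.3, size=500)    # stand-in for y minus the model
lam, r_w, r_e = update_hyperparameters(q, w, residual,
                                       alpha_w=1.0, beta_w=1.0, rng=rng)
```

With 500 residual samples of standard deviation 0.3, the draw of re concentrates near the residual variance of about 0.09.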

2. Proposed Model
2.4 Peak Location Simulation

$n_k \mid \ldots \sim \exp\left( -\frac{1}{2 r_e} \left\| y - \sum_{l=1}^{K_{\max}} f(n_l, w_l, s_l) \right\|^2 \right) \mathbb{1}_{[1,N]}(n_k)$

Metropolis-Hastings algorithm → which proposal distribution?
➜ If the peak is present (qk = 1), define precisely its location: $\mathcal{N}_{[1,N]}(n_k^{(i-1)}, r_n)$
➜ If the peak is absent (qk = 0), explore the entire space: $\mathcal{U}_{[1,N]}$

⇒ $q(\tilde{n}_k) = \delta_0(q_k)\, \mathcal{U}_{[1,N]} + \delta_1(q_k)\, \mathcal{N}_{[1,N]}(n_k^{(i-1)}, r_n).$
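A minimal sketch of one Metropolis-Hastings update with this mixed proposal follows. It rests on several assumptions that are not in the slides: a Lorentzian peak shape, rn read as a variance, the truncated normal sampled by rejection, and the acceptance ratio corrected by the normalizing constants of the truncated normal (which differ between the forward and reverse moves). All function names are hypothetical.

```python
import math
import numpy as np

def lorentzian(x, n, w, s):
    """Assumed peak shape (the slides leave f unspecified)."""
    return w * s**2 / ((x - n) ** 2 + s**2)

def trunc_mass(center, sigma, lo, hi):
    """Probability mass that N(center, sigma^2) puts on [lo, hi]."""
    def cdf(t):
        return 0.5 * (1.0 + math.erf((t - center) / (sigma * math.sqrt(2.0))))
    return cdf(hi) - cdf(lo)

def propose_location(n_old, q_k, r_n, n_points, rng):
    """Mixed proposal: uniform if the peak is absent, truncated normal otherwise."""
    if q_k == 0:
        return rng.uniform(1.0, n_points)
    while True:  # rejection sampling of the normal truncated to [1, N]
        cand = rng.normal(n_old, math.sqrt(r_n))
        if 1.0 <= cand <= n_points:
            return cand

def mh_location_step(x, y, others, n_old, w_k, s_k, q_k, r_e, r_n, rng):
    """One update of n_k; `others` is the sum of all the other peaks."""
    n_points = len(x)
    n_new = propose_location(n_old, q_k, r_n, n_points, rng)
    if q_k == 0:
        # w_k = 0, so the target is flat in n_k and the move is always accepted.
        return n_new

    def sse(n):
        return np.sum((y - others - lorentzian(x, n, w_k, s_k)) ** 2)

    log_ratio = -(sse(n_new) - sse(n_old)) / (2.0 * r_e)
    # The truncated normal is not symmetric: correct by its normalizing constants.
    sigma = math.sqrt(r_n)
    log_ratio += math.log(trunc_mass(n_old, sigma, 1.0, n_points))
    log_ratio -= math.log(trunc_mass(n_new, sigma, 1.0, n_points))
    accept = rng.uniform() < math.exp(min(0.0, log_ratio))
    return n_new if accept else n_old
```

The local proposal refines a location that already explains part of the data, while the uniform proposal lets an absent peak jump anywhere in [1, N] before it is switched on.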

3. Label Switching

The label switching problem is due to 2 phenomena:
• same posterior for all permutations of k:
$p(\theta_1, \theta_2, \theta_3 \mid y) = p(\theta_2, \theta_3, \theta_1 \mid y)$
• Gibbs sampler able to explore the k! permutation possibilities

[Figure: θ (0 to 6) versus iterations (1 to 10).]

$\hat{\theta}_1 = 4.26, \quad \hat{\theta}_2 = 4.34, \quad \hat{\theta}_3 = 2.41$
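The first phenomenon can be checked numerically: since the model is a sum over peaks, relabelling the peaks leaves the spectrum, and hence the likelihood, unchanged. The sketch below verifies this for three peaks, assuming a Lorentzian shape for f; the names and values are hypothetical.

```python
import numpy as np
from itertools import permutations

def lorentzian(x, n, w, s):
    """Assumed peak shape (hypothetical choice of f)."""
    return w * s**2 / ((x - n) ** 2 + s**2)

def model(x, theta):
    """Sum of peaks; theta has one row (n_k, w_k, s_k) per peak."""
    return sum(lorentzian(x, n, w, s) for n, w, s in theta)

x = np.arange(1, 101, dtype=float)
theta = np.array([[20.0, 5.0, 3.0],
                  [50.0, 2.0, 4.0],
                  [80.0, 7.0, 2.0]])

reference = model(x, theta)
# Every relabelling of the three peaks gives the same spectrum, hence the
# same likelihood; with identical priors on each peak the posterior matches too.
invariant = all(np.allclose(model(x, theta[list(p)]), reference)
                for p in permutations(range(3)))
n_labelings = len(list(permutations(range(3))))   # 3! = 6
```

Because the sampler can move between these equivalent labelings, averaging the chain of a given index k mixes values that belong to different peaks.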

3. Label Switching

Proposed Method
Minimizing the following cost function (see [Stephens 1997]):

$L_0(n, w, s, \mu_n, \rho_n, \mu_w, \rho_w, \mu_s, \rho_s) = -\ln \left[ \prod_{k=1}^{K_{\max}} \mathcal{N}(n_k \mid \mu_{n_k}, \rho_{n_k})\, \mathcal{N}(w_k \mid \mu_{w_k}, \rho_{w_k})\, \mathcal{N}(s_k \mid \mu_{s_k}, \rho_{s_k}) \right]$

Major differences to general relabelling algorithms:
• initialization obtained by selecting the maximum in the histogram of (µn, µw, µs) (closer to the global optimum than a simple identity permutation)
• relabelling (nl, wl, sl) one after the other (no permutation)
• taking into account the fact that the peak number is expected to change
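The cost function can be evaluated directly from its definition. The sketch below assumes that the ρ parameters are variances, in line with the N(0, r) notation used for the priors; the function names and the numerical values are hypothetical.

```python
import numpy as np

def log_normal_pdf(x, mean, var):
    """Log density of N(mean, var), evaluated elementwise."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def relabelling_cost(n, w, s, mu_n, rho_n, mu_w, rho_w, mu_s, rho_s):
    """L0: minus the log of the product of normal densities over the peaks."""
    return -np.sum(log_normal_pdf(n, mu_n, rho_n)
                   + log_normal_pdf(w, mu_w, rho_w)
                   + log_normal_pdf(s, mu_s, rho_s))

# Two clusters (hypothetical values): means and variances per parameter.
mu_n, rho_n = np.array([100.0, 300.0]), np.array([4.0, 4.0])
mu_w, rho_w = np.array([10.0, 3.0]), np.array([1.0, 1.0])
mu_s, rho_s = np.array([6.0, 8.0]), np.array([1.0, 1.0])

# One MCMC sample whose two peaks come out with switched labels.
n = np.array([301.0, 99.0])
w = np.array([3.2, 9.5])
s = np.array([8.1, 6.3])

cost_switched = relabelling_cost(n, w, s, mu_n, rho_n, mu_w, rho_w, mu_s, rho_s)
cost_matched = relabelling_cost(n[::-1], w[::-1], s[::-1],
                                mu_n, rho_n, mu_w, rho_w, mu_s, rho_s)
```

The matched labelling has the lower cost, so minimizing L0 over the labels of each sample undoes the switch.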

3. Label Switching

[Flowchart of the relabelling algorithm:]
1. $\hat{K}_{\mathrm{MMAP}}$
2. histogram → θ = (µn, µw, µs)
3. while the selection changes: for i = 1, ..., I and for l = 1, ..., $\hat{K}_{\mathrm{MMAP}}$: selection of $\theta_l^{(i)}$; update (µθ, ρθ)
4. $\hat{\theta} = \mu_\theta$
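The loop in the flowchart can be sketched as an alternation between selecting, for every sampled peak, the cluster with the lowest normal cost, and updating the cluster means and variances. The version below is a simplified reading, not the authors' exact algorithm: it uses the locations only, takes the initial means as an argument instead of computing them from a histogram, and assumes a fixed number of clusters. The name `relabel` and the synthetic data are hypothetical.

```python
import numpy as np

def relabel(samples, mu_init, max_sweeps=50):
    """Assign each sampled location to a cluster, one value at a time.

    samples : array (I, K) of sampled locations, labels possibly switched
    mu_init : initial cluster means (e.g. taken from histogram maxima)
    """
    mu = np.asarray(mu_init, dtype=float).copy()
    rho = np.full_like(mu, np.var(samples))   # broad initial variances
    labels = None
    for _ in range(max_sweeps):
        # Cost -ln N(x | mu_l, rho_l) of every value against every cluster.
        cost = 0.5 * (np.log(2.0 * np.pi * rho)
                      + (samples[..., None] - mu) ** 2 / rho)
        new_labels = np.argmin(cost, axis=-1)  # no permutation constraint
        if labels is not None and np.array_equal(new_labels, labels):
            break                              # the selection no longer changes
        labels = new_labels
        for l in range(len(mu)):
            members = samples[labels == l]
            if members.size > 1:
                mu[l] = members.mean()
                rho[l] = max(members.var(), 1e-12)
    return labels, mu, rho

# Synthetic chain: two peaks near 100 and 300, labels swapped at random.
rng = np.random.default_rng(0)
clean = np.column_stack([rng.normal(100.0, 1.0, 200),
                         rng.normal(300.0, 1.0, 200)])
swap = rng.random(200) < 0.5
samples = np.where(swap[:, None], clean[:, ::-1], clean)

labels, mu, rho = relabel(samples, mu_init=[90.0, 310.0])
```

The final means play the role of the estimate θ̂ = µθ. Because each value is assigned independently, two values from the same iteration may land in the same cluster, which is one way to read "no permutation".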

4. Application

[Figure: Raman spectrum of gibbsite Al(OH)3, intensity (arbitrary unit) versus wavenumber (cm−1), from 700 to 1100 cm−1.]

➜ 10,000 iterations (burn-in period of 5,000 iterations).
➜ Initialization: spectrum with no peak, $\lambda^{(0)} = 0.5$, $r_w^{(0)} = 10$, $r_e^{(0)} = 0.1$.

Conclusion

➜ Signal decomposition into elementary patterns (marked point process)
Alternative to blind sparse spike train deconvolution
• more efficient than a common implementation with BG model
• peaks located on a continuous space
• peaks with different shapes

➜ Constant dimension model
Alternative to RJMCMC

➜ New method for label switching
• initialization close to the global optimum using a histogram
• relabelling with no permutation
• the variable number may change
