Nested sampling with demons

Michael Habeck
Max Planck Institute for Biophysical Chemistry and Institute for Mathematical Stochastics
Göttingen, Germany

Amboise, September 23, 2014

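Bayesian inference

Probability rules:

posterior × evidence = likelihood × prior

$$\Pr(\theta \mid D, M) \times \Pr(D \mid M) = \Pr(D \mid \theta, M) \times \Pr(\theta \mid M)$$

$$p(\theta) \times Z = L(\theta) \times \pi(\theta)$$

Inference:

• Evidence: $Z = \int L(\theta)\, \pi(\theta)\, d\theta$

• Posterior: $p(\theta) = \frac{1}{Z}\, L(\theta)\, \pi(\theta)$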

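Nested sampling

• The evidence reduces to a one-dimensional integral:

$$Z = \int L(\theta)\, \pi(\theta)\, d\theta = \int_0^1 L(X)\, dX$$

summing over prior mass

$$X(\lambda) = \int_{L(\theta) \ge \lambda} \pi(\theta)\, d\theta, \qquad X(0) = 1, \quad X(\infty) = 0.$$

• Prior masses can be ordered: $X(\lambda) < X(\lambda')$ if $\lambda > \lambda'$

• Idea: We can evaluate $L$ exactly and estimate $X$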
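The idea of evaluating $L$ exactly while only estimating $X$ can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the toy problem (uniform prior on [0, 1], likelihood exp(-θ/s)), the function names, and the common deterministic rule X_k = exp(-k/N) for the estimated prior mass. The toy is chosen so that the truncated prior can be sampled exactly.

```python
import math
import random

def nested_sampling(log_like, sample_above, n_live=100, n_iter=1500, seed=0):
    """Estimate Z = integral of L(X) dX using the rule X_k = exp(-k / N)."""
    rng = random.Random(seed)
    live = [sample_above(-math.inf, rng) for _ in range(n_live)]
    logl = [log_like(t) for t in live]
    z, x_prev = 0.0, 1.0
    for k in range(1, n_iter + 1):
        worst = min(range(n_live), key=logl.__getitem__)  # lowest likelihood
        x_k = math.exp(-k / n_live)                       # estimated prior mass
        z += math.exp(logl[worst]) * (x_prev - x_k)       # L exact, X estimated
        x_prev = x_k
        live[worst] = sample_above(logl[worst], rng)      # truncated prior draw
        logl[worst] = log_like(live[worst])
    # Contribution of the live points that remain at the end
    z += x_prev * sum(math.exp(l) for l in logl) / n_live
    return z

# Toy problem (assumption): uniform prior on [0, 1], L(theta) = exp(-theta / S).
S = 0.1

def toy_log_like(theta):
    return -theta / S

def toy_sample_above(logl_min, rng):
    # L(theta) > lambda  <=>  theta < -S * ln(lambda), capped by the prior support
    theta_max = min(1.0, -S * logl_min)
    return rng.random() * theta_max

z_est = nested_sampling(toy_log_like, toy_sample_above)
z_true = S * (1.0 - math.exp(-1.0 / S))  # analytic evidence of the toy problem
```

Comparing `z_est` with `z_true` gives a feel for the statistical error, which shrinks as the number of live points `n_live` grows.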

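Estimation of prior masses

• Nested sequence of truncated priors:

$$p(\theta \mid \lambda) = \frac{\Theta[L(\theta) - \lambda]\, \pi(\theta)}{X(\lambda)} \quad \text{where} \quad \Theta(x) = \begin{cases} 0; & x < 0 \\ 1; & x \ge 0 \end{cases}$$

• Distribution of prior masses at likelihood contour $\lambda$: $X \sim \mathrm{Uniform}(0, X(\lambda))$

• Order statistics:

$$X_{\max} \sim N\, \frac{X^{N-1}}{X(\lambda)^N}$$

where $X_{\max}$ is the maximum of $N$ uniformly distributed $X_n \sim \mathrm{Uniform}(0, X(\lambda))$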
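The order-statistics result can be checked numerically. The sketch below (function and variable names are illustrative) draws the shrinkage ratio t = X_max / X(λ), which has density N t^(N-1) on [0, 1], and compares its sample moments with the standard values E[t] = N/(N+1) and E[ln t] = -1/N.

```python
import math
import random

def shrinkage_samples(n_live, n_rep, seed=0):
    """Draw t = X_max / X(lambda): the maximum of n_live Uniform(0, 1) variables."""
    rng = random.Random(seed)
    return [max(rng.random() for _ in range(n_live)) for _ in range(n_rep)]

N = 20
t = shrinkage_samples(N, 20000)
mean_t = sum(t) / len(t)                            # theory: N / (N + 1)
mean_log_t = sum(math.log(x) for x in t) / len(t)   # theory: -1 / N
```

The second moment is the one nested sampling relies on: each iteration compresses ln X by 1/N on average.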

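likelihood contour $\lambda$ ↔ energy bound $E(\theta) < \epsilon$

prior mass $X(\lambda)$ ↔ cumulative DOS $X(\epsilon) = \int_{-\infty}^{\epsilon} g(E)\, dE$

evidence $Z = \int L(X)\, dX$ ↔ partition function $Z(\beta) = \int e^{-\beta E}\, g(E)\, dE$

truncated prior ↔ microcanonical ensemble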

Microcanonical ensemble

• Density of states (DOS):

$$g(E) = \int \delta[E - E(\theta)]\, \pi(\theta)\, d\theta = \partial_E X(E)$$

• Microcanonical entropy and temperature:

$$S(E) = \ln X(E), \qquad T(E) = 1 / \partial_E S(E)$$

• Compression:

$$H(\epsilon' \to \epsilon) = S(\epsilon') - S(\epsilon) = \int_{\epsilon}^{\epsilon'} \beta(E)\, dE$$

where the inverse temperature $\beta = 1/T$ measures the entropy production
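As a worked example of these definitions, consider a hypothetical toy system that is not part of the talk: a prior uniform on the $d$-dimensional ball of radius $\sqrt{2}$ with energy $E(\theta) = \tfrac{1}{2}\|\theta\|^2$, so that $X(E) = E^{d/2}$ for $0 \le E \le 1$. All quantities then follow in closed form:

```latex
\begin{align*}
g(E) &= \partial_E X(E) = \tfrac{d}{2}\, E^{d/2 - 1}, \\
S(E) &= \ln X(E) = \tfrac{d}{2} \ln E, \\
\beta(E) &= \partial_E S(E) = \frac{d}{2E}, \qquad T(E) = \frac{2E}{d}, \\
H(\epsilon' \to \epsilon) &= \int_{\epsilon}^{\epsilon'} \frac{d}{2E}\, dE
  = \tfrac{d}{2} \ln \frac{\epsilon'}{\epsilon} = S(\epsilon') - S(\epsilon).
\end{align*}
```

In this toy case, halving the energy bound always costs the same compression, $(d/2) \ln 2$, regardless of the starting energy.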

Enter the demon

• Implement the truncated prior as a microcanonical ensemble with an additional demon absorbing energy $D$:

$$p(\theta, D \mid \epsilon) = \frac{1}{X(\epsilon)}\, \delta[\epsilon - D - E(\theta)]\, \Theta(D)\, \pi(\theta)$$

and explore constant energy shells

• Creutz algorithm:

Require: ϵ (upper bound on total energy)
θ ∼ π(θ) with energy E = E(θ) ≤ ϵ, D = ϵ − E        ▷ Initialize
while not converged do
    θ′ ∼ π(θ) with energy E′ = E(θ′)                ▷ Generate a candidate
    D′ = D − ∆E where ∆E = E′ − E                   ▷ Update demon's state
    if D′ ≥ 0 then
        (θ, D) ← (θ′, D′)                           ▷ Accept
    end if
end while
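The Creutz algorithm above can be sketched in a few lines. The example makes several assumptions that are not in the talk: the system is a small ring of ±1 spins with energy E = -Σ s_i s_{i+1} (sign convention chosen for the toy), and candidates are single-spin flips, a symmetric local proposal, rather than fresh draws from the prior.

```python
import random

def ring_energy(spins):
    """Toy energy of a ring of +/-1 spins: E = -sum_i s_i * s_{i+1} (assumed)."""
    n = len(spins)
    return -sum(spins[i] * spins[(i + 1) % n] for i in range(n))

def creutz(spins, eps, n_steps, seed=0):
    """Explore the shell E + D = eps with a single demon of energy D >= 0."""
    rng = random.Random(seed)
    spins = list(spins)
    energy = ring_energy(spins)
    demon = eps - energy                      # initialize: D = eps - E
    if demon < 0:
        raise ValueError("initial configuration violates E <= eps")
    demon_trace = []
    n = len(spins)
    for _ in range(n_steps):
        i = rng.randrange(n)                  # candidate: flip spin i
        delta = 2 * spins[i] * (spins[i - 1] + spins[(i + 1) % n])
        if demon - delta >= 0:                # accept only if the demon can pay
            spins[i] = -spins[i]
            energy += delta
            demon -= delta
        demon_trace.append(demon)
    return spins, energy, demon_trace

eps = -16
final_spins, final_energy, trace = creutz([1] * 32, eps, n_steps=5000)
```

By construction the total `final_energy + trace[-1]` stays equal to `eps`, and the demon's energy never becomes negative, which is what keeps the configuration inside the truncated prior.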

Sampling the Ising model with a single demon

[Figure, panel A: Gibbs entropy $S_G(E)$ and estimated $\ln X_k$ versus energy $E$]

• Nearest-neighbor interaction on a 64 × 64 lattice: $E(\theta) = \sum_{\langle i,j \rangle} \theta_i \theta_j$ where $\theta_i = \pm 1$

• Nested sampling provides a very accurate estimate of the volume entropy $S = \ln X$

Sampling the Ising model with a single demon

[Figure, panel B: inverse temperature $\beta_G(E)$ and heat capacity versus energy $E$, with the reference value $\ln(1 + \sqrt{2})/2$]

• $H(\epsilon' \to \epsilon) = S(\epsilon') - S(\epsilon) = \int_{\epsilon}^{\epsilon'} \beta(E)\, dE$

• $\langle H(\epsilon' \to \epsilon) \rangle = 1/N$, therefore $\beta(\epsilon_k)\, (\epsilon_k - \epsilon_{k+1}) \approx 1/N$

• histogram of energy bounds $\epsilon_k$ matches the inverse temperature / entropy production $\beta(E)$

Sampling the Ising model with a single demon

[Figure, panel C: inverse temperature $\beta_G(E)$ and estimated $\beta_B$ versus energy $E$]

The demon's energy distribution is

$$p(D \mid \epsilon) = \int p(\theta, D \mid \epsilon)\, d\theta = \frac{g(\epsilon - D)}{X(\epsilon)} \approx \frac{1}{T(\epsilon)} \exp\{-D / T(\epsilon)\}$$

The demon may serve as a thermometer: $D \approx T$
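The thermometer property can be checked on a toy system with a known density of states. The sketch assumes X(E) = E^(d/2) on [0, ϵ] (an illustrative choice, not the Ising model of the talk), for which T(ϵ) = 2ϵ/d. The system energy is drawn by inverse-cdf sampling from g(E)/X(ϵ) and the demon holds the remainder.

```python
import random

def demon_energies(eps, d, n_samples, seed=0):
    """Sample D = eps - E where E has density g(E) / X(eps), X(E) = E**(d/2)."""
    rng = random.Random(seed)
    # cdf of E is (E / eps)**(d/2), so E = eps * U**(2/d) for uniform U
    return [eps - eps * rng.random() ** (2.0 / d) for _ in range(n_samples)]

eps, d = 1.0, 100
demon = demon_energies(eps, d, 50000)
mean_demon = sum(demon) / len(demon)
temperature = 2.0 * eps / d       # T(eps) = 1 / dS/dE for S = (d/2) ln E
```

For this toy system the exact mean demon energy is 2ϵ/(d + 2), so `mean_demon` approaches `temperature` as the dimension d grows.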

Properties of nested sampling

Pros:
1. Nested sampling is a microcanonical approach: the energy $E$ is the control parameter rather than the temperature used in thermal approaches
2. constructs an adaptive “cooling” protocol $\{\epsilon_k\}$
3. progresses at constant thermodynamic speed: $\Delta S \approx 1/N$
4. provides an estimate of the entropy $S$

Cons:
1. Nested sampling requires efficient sampling from

$$p(\theta \mid \epsilon) = \frac{1}{X(\epsilon)}\, \Theta[\epsilon - E(\theta)]\, \pi(\theta) = \int \frac{1}{X(\epsilon)}\, \delta[\epsilon - D - E(\theta)]\, \Theta(D)\, \pi(\theta)\, dD$$

Releasing more demons

• We would like to preserve nested sampling’s adaptive behavior but be more flexible in terms of the ensemble

• Idea: introduce more demons in order to smooth the ensemble

$$p(\theta, D, K \mid \epsilon) = \frac{1}{Y(\epsilon)}\, \delta[\epsilon - D - K - E(\theta)]\, \Theta(D)\, f(K)\, \pi(\theta)$$

where the prior mass of the compound system is

$$Y(\epsilon) = \int \Theta(\epsilon - H)\, (f \star g)(H)\, dH$$

involving the convolution $(f \star g)(H)$

• Nested sampling tracks $Y(\epsilon)$ where $\epsilon$ is an upper bound on the total energy $H = K + E$

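Releasing more demons

• Nested sampling estimates the evidence of the extended system

$$Z_H = \int e^{-H}\, (f \star g)(H)\, dH = Z_K\, Z_E$$

from which we can obtain the evidence of the original system $Z_E$

• Marginal distribution of configurations

$$p(\theta \mid \epsilon) = \int p(\theta, D, K \mid \epsilon)\, dD\, dK = \frac{1}{Y(\epsilon)}\, \pi(\theta)\, F[\epsilon - E(\theta)]$$

where $F(K) = \int_{-\infty}^{K} f(t)\, dt$ is the cdf of the demon’s energy distribution

• Sampling $(\theta, K)$:

$$\theta \sim p(\theta \mid \epsilon) \propto \pi(\theta)\, F[\epsilon - E(\theta)]$$

$$K \sim p(K \mid \theta, \epsilon) \propto f(K)\, \Theta[\epsilon - E(\theta) - K]$$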
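The factorization of the extended evidence follows from writing out the convolution. The short derivation below assumes the notation $Z_K = \int e^{-K} f(K)\, dK$ and $Z_E = \int e^{-E} g(E)\, dE$, which the slides imply but do not spell out, and substitutes $E = H - K$:

```latex
\begin{align*}
Z_H &= \int e^{-H}\, (f \star g)(H)\, dH
     = \int e^{-H} \int f(K)\, g(H - K)\, dK\, dH \\
    &= \int e^{-K} f(K)\, dK \int e^{-E} g(E)\, dE
     = Z_K\, Z_E .
\end{align*}
```

Since $Z_K$ depends only on the chosen demon and can be computed separately, the evidence of the original system is recovered as $Z_E = Z_H / Z_K$.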

Demonic nested sampling of the ten-state Potts model

Demon: $f(K) \propto \Theta(K_{\max} - K)\, K^{d/2 - 1}$ ($d$-dimensional harmonic oscillator where $d$ = dimension of configuration space)

$$p(\theta \mid \epsilon) \propto \Theta[\epsilon - E(\theta)]\, \pi(\theta) \times \begin{cases} [\epsilon - E(\theta)]^{d/2}; & \epsilon - E(\theta) \le K_{\max} \\ K_{\max}^{d/2}; & \epsilon - E(\theta) > K_{\max} \end{cases}$$

[Figure. Panel A: energy bounds $\epsilon_k$ versus iteration $k$ for standard NS and demonic NS. Panel B: plotted against energy $E$. Panel C: relative accuracy of $\log Z$ [%] versus demon capacity $K_{\max}$.]
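For the capped oscillator demon f(K) ∝ Θ(K_max − K) K^(d/2−1), both the configuration weight and the conditional draw of the demon energy have closed forms. The sketch below uses illustrative function names and works with the logarithm of the weight, a choice made here to avoid overflow for large d.

```python
import math
import random

def log_weight(energy, eps, k_max, d):
    """log of the unnormalized weight F[eps - E] multiplying pi(theta)."""
    excess = eps - energy
    if excess <= 0.0:
        return -math.inf                     # outside the energy bound
    return 0.5 * d * math.log(min(excess, k_max))

def sample_demon(energy, eps, k_max, d, rng):
    """Draw K with density proportional to K**(d/2 - 1) on [0, min(eps - E, K_max)]."""
    cap = min(eps - energy, k_max)
    return cap * rng.random() ** (2.0 / d)   # inverse cdf of (K / cap)**(d/2)

rng = random.Random(0)
ks = [sample_demon(1.0, 3.0, 5.0, 4, rng) for _ in range(20000)]
mean_k = sum(ks) / len(ks)                   # theory: cap * d / (d + 2) = 4/3 here
```

Configurations deep inside the bound all receive the same weight K_max^(d/2), while those within K_max of the bound are downweighted smoothly instead of being cut off by a hard step.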

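Nested sampling in phase space

• In continuous configuration spaces, it is convenient to unfold the demon and introduce momenta

$$f(K) = \int \delta[K - K(\xi)]\, d\xi \quad \text{where} \quad K(\xi) = \frac{1}{2} \sum_{i=1}^{d} \xi_i^2 \quad \text{(kinetic energy)}$$

• The marginal distribution in configuration space is

$$p(\theta \mid \epsilon) \propto \Theta[\epsilon - E(\theta)]\, \pi(\theta) \times \begin{cases} [\epsilon - E(\theta)]^{d/2}; & \epsilon - E(\theta) \le K_{\max} \\ K_{\max}^{d/2}; & \epsilon - E(\theta) > K_{\max} \end{cases}$$

• Hamiltonian dynamics for exploration: $(\theta, \xi) \xrightarrow{L} (\theta', \xi')$ where $L$ is an integrator (e.g. the leapfrog)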
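A leapfrog integrator of the kind mentioned above can be sketched as follows. The function name, its signature, and the quadratic test energy E(θ) = ½‖θ‖² are assumptions made for illustration; any differentiable energy with a gradient function would do.

```python
def leapfrog(theta, xi, grad_energy, step, n_steps):
    """Map (theta, xi) to (theta', xi') for H = E(theta) + 0.5 * |xi|**2."""
    theta, xi = list(theta), list(xi)
    grad = grad_energy(theta)
    for _ in range(n_steps):
        xi = [p - 0.5 * step * g for p, g in zip(xi, grad)]     # half kick
        theta = [q + step * p for q, p in zip(theta, xi)]       # full drift
        grad = grad_energy(theta)
        xi = [p - 0.5 * step * g for p, g in zip(xi, grad)]     # half kick
    return theta, xi

def total_energy(theta, xi):
    # Toy Hamiltonian (assumption): E(theta) = 0.5 * |theta|**2 plus kinetic energy
    return 0.5 * sum(q * q for q in theta) + 0.5 * sum(p * p for p in xi)

def toy_grad(theta):
    return list(theta)                       # gradient of 0.5 * |theta|**2

theta0, xi0 = [1.0, 0.0], [0.0, 1.0]
theta1, xi1 = leapfrog(theta0, xi0, toy_grad, step=0.05, n_steps=100)
# Reversibility: flip the momenta and integrate again to return to the start
theta2, xi2 = leapfrog(theta1, [-p for p in xi1], toy_grad, step=0.05, n_steps=100)
```

The integrator conserves the total energy only approximately, which is why a subsequent accept/reject test against the bound ϵ is still needed.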

Microcanonical Hamiltonian Monte Carlo

• 2(d + 1) dimensional phase space: implement demon $D$ as harmonic oscillator with energy $D = (\xi_{d+1}^2 + \theta_{d+1}^2)/2$

Require: ϵ (total energy), configuration θ with E(θ) < ϵ
θ_{d+1} = 0                                  ▷ Initialize demon D
while not converged do
    ξ ∼ N(0, 1)                              ▷ Draw momenta from (d + 1)-dim Gaussian
    ξ ← ξ × √(ϵ − E − D) / ∥ξ∥               ▷ Scale momenta so as to match excess energy
    (θ, ξ) → (θ′, ξ′) via L                  ▷ Run the leapfrog algorithm
    H′ = E(θ′) + K(ξ′)                       ▷ Compute total energy of candidate
    if H′ < ϵ then
        θ ← θ′, E ← E(θ)                     ▷ Accept
    end if
end while
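The loop above can be sketched in Python under one possible reading of the pseudocode; several details are assumptions rather than the author's implementation. The demon's position is stored as the last entry of an extended coordinate vector `q`, its potential energy ½θ²_{d+1} is added to a toy energy E(θ) = ½‖θ‖², and the momentum scale factor is copied literally from the slide. With K(ξ) = ½‖ξ‖², that literal factor sets the kinetic energy to half of the excess energy.

```python
import math
import random

def potential(q):
    """Toy E(theta) = 0.5 * |theta|**2 plus demon term 0.5 * theta_{d+1}**2 (assumed)."""
    return 0.5 * sum(x * x for x in q)

def grad_potential(q):
    return list(q)

def leapfrog(q, xi, step, n_steps):
    q, xi = list(q), list(xi)
    grad = grad_potential(q)
    for _ in range(n_steps):
        xi = [p - 0.5 * step * g for p, g in zip(xi, grad)]
        q = [x + step * p for x, p in zip(q, xi)]
        grad = grad_potential(q)
        xi = [p - 0.5 * step * g for p, g in zip(xi, grad)]
    return q, xi

def mhmc_step(q, eps, rng, step=0.1, n_steps=10):
    """One update of the extended configuration q = (theta, theta_{d+1})."""
    excess = eps - potential(q)
    xi = [rng.gauss(0.0, 1.0) for _ in q]            # (d + 1)-dim Gaussian momenta
    norm = math.sqrt(sum(p * p for p in xi))
    xi = [p * math.sqrt(excess) / norm for p in xi]  # scale factor as on the slide
    q_new, xi_new = leapfrog(q, xi, step, n_steps)
    h_new = potential(q_new) + 0.5 * sum(p * p for p in xi_new)
    return q_new if h_new < eps else q               # accept only below the bound

rng = random.Random(0)
eps = 1.0
start = [0.5, 0.5, 0.0]                              # theta in 2-d, demon at 0
q, trace = list(start), []
for _ in range(200):
    q = mhmc_step(q, eps, rng)
    trace.append(potential(q))
```

Because the kinetic energy is non-negative, every accepted state satisfies `potential(q) < eps`, so the chain never leaves the region allowed by the energy bound.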

Application to GS peptide

[Figure. Panel B: energy $E(\theta_k)$ versus iteration $k$. Panel C: plotted against RMSD.]

• A: Native structure of the GS peptide
• B: Evolution of the energy (goodness-of-fit) during nested sampling
• C: Structure’s accuracy measured by the root-mean-square deviation (RMSD) to the crystal structure

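Other demons

Distribution of system’s energy:

$$p(E \mid \epsilon) = \frac{\Theta(\epsilon - E)\, g(E)\, F(\epsilon - E)}{Y(\epsilon)}$$

Demons with their pdf $f(K)$ and cdf $F(K)$:

• Gauss: $f(K) = \sqrt{\frac{\beta}{2\pi}}\, e^{-\frac{\beta}{2} K^2}$, $\quad F(K) = \frac{1}{2} \left[ 1 + \mathrm{erf}\left( \sqrt{\beta/2}\, K \right) \right]$

• Logarithmic oscillator, $K(\xi_1, \xi_2) = \frac{1}{2} \xi_1^2 + \beta^{-1} \ln |\xi_2|$ $\Rightarrow$ $f(K) = \sqrt{8\pi\beta}\, e^{\beta K}$, $\quad F(K) = \sqrt{8\pi/\beta}\, e^{\beta K}$

• Fermi: $f(K) = \frac{\beta\, e^{-\beta K}}{(1 + e^{-\beta K})^2}$, $\quad F(K) = \frac{1}{1 + e^{-\beta K}}$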
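The pdf and cdf pairs for the Gauss, logarithmic-oscillator, and Fermi demons can be cross-checked numerically, since each f(K) should be the derivative of the corresponding F(K). The sketch below uses illustrative function names and a central finite difference; the value of β and the test grid are arbitrary choices.

```python
import math

def gauss_pdf(k, beta):
    return math.sqrt(beta / (2.0 * math.pi)) * math.exp(-0.5 * beta * k * k)

def gauss_cdf(k, beta):
    return 0.5 * (1.0 + math.erf(math.sqrt(beta / 2.0) * k))

def log_osc_pdf(k, beta):
    return math.sqrt(8.0 * math.pi * beta) * math.exp(beta * k)

def log_osc_cdf(k, beta):
    return math.sqrt(8.0 * math.pi / beta) * math.exp(beta * k)

def fermi_pdf(k, beta):
    e = math.exp(-beta * k)
    return beta * e / (1.0 + e) ** 2

def fermi_cdf(k, beta):
    return 1.0 / (1.0 + math.exp(-beta * k))

def max_derivative_error(pdf, cdf, beta, ks, h=1e-5):
    """Largest gap between the central difference of F and f on the grid ks."""
    return max(abs((cdf(k + h, beta) - cdf(k - h, beta)) / (2.0 * h) - pdf(k, beta))
               for k in ks)

grid = [-2.0 + 0.25 * i for i in range(17)]   # K from -2 to 2
errors = {
    "gauss": max_derivative_error(gauss_pdf, gauss_cdf, 1.5, grid),
    "log_oscillator": max_derivative_error(log_osc_pdf, log_osc_cdf, 1.5, grid),
    "fermi": max_derivative_error(fermi_pdf, fermi_cdf, 1.5, grid),
}
```

Only F enters the marginal weight π(θ) F[ϵ − E(θ)], so a demon is convenient in practice when its cdf is cheap to evaluate.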

Application to SH3 domain

• Structure determination from sparse distance data measured by NMR spectroscopy

• Structure ensemble as accurate and precise as with parallel tempering

Summary

• Nested sampling is a powerful method to study the microcanonical ensemble

• By means of demons we can smooth the microcanonical ensemble, which eases the exploration of configuration space

• All of the desired features of nested sampling are preserved