On bounds for the Fisher-Rao distance between multivariate normal distributions

João E. Strapasson∗, Julianna P. S. Porto†, and Sueli I. R. Costa†

INTRODUCTION

Information geometry is approached here by considering the statistical model of multivariate normal distributions M as a Riemannian manifold with the natural metric provided by the Fisher information matrix. This differential-geometric approach to probability theory, introduced by C. Rao in 1945, has recently been applied to different domains such as statistical inference, mathematical programming, image processing and radar signal processing [4]. Explicit forms for the Fisher-Rao distance associated with this metric and for the geodesics of general distribution models are usually very hard to determine. In the case of general multivariate normal distributions, lower and upper bounds have been derived. We discuss some of these bounds and introduce a new one, analyzing their tightness in specific cases.

PRELIMINARES

For multivariate normal distributions,

$$p(x; \mu, \Sigma) = (2\pi)^{-n/2} \det(\Sigma)^{-1/2}\, e^{-\frac{1}{2}(x-\mu)^t \Sigma^{-1} (x-\mu)},$$

where x = (x1, · · · , xn)t, µ = (µ1, · · · , µn)t is the mean vector and Σ is the covariance matrix, we consider the statistical model M = {pθ ; θ = (µ, Σ) ∈ Θ = Rn × Pn(R)}, where Pn(R) is the space of positive definite symmetric matrices of order n; M is an (n + n(n+1)/2)-dimensional manifold. A natural Riemannian structure [1] is provided by the Fisher information matrix:

$$g_{ij}(\theta) = E_\theta\!\left[ \frac{\partial}{\partial \theta_i} \log p(x; \theta)\; \frac{\partial}{\partial \theta_j} \log p(x; \theta) \right],$$

where Eθ is the expected value with respect to the distribution pθ. The Fisher-Rao distance between two distributions pθ1 and pθ2 is given by the length of the shortest path α connecting θ1 and θ2,

$$\ell(\alpha) = \int_{t_1}^{t_2} \sqrt{\langle \alpha'(t), \alpha'(t) \rangle_G}\; dt,$$

where ⟨x, y⟩G = xt [gij] y is the inner product defined by G = [gij]. The Fisher-Rao infinitesimal arc length can be expressed as

$$ds^2 = d\mu^t\, \Sigma^{-1} d\mu + \frac{1}{2} \mathrm{tr}\!\left[ (\Sigma^{-1} d\Sigma)^2 \right]. \quad (1)$$

An explicit expression for the distance between two general multivariate normal distributions is very hard to obtain. Closed forms are known in the following cases:

• For n = 1, a closed form for the Fisher-Rao distance is known via an association with the classical model of the hyperbolic half-plane.

• The submanifold MD where the covariance matrix is diagonal, Σ = diag(σ1², σ2², · · · , σn²). For the parameters (µ1, σ1, · · · , µn, σn), the Fisher matrix [3] is

$$g(\theta) = \mathrm{diag}\!\left( \frac{1}{\sigma_1^2}, \frac{2}{\sigma_1^2}, \cdots, \frac{1}{\sigma_n^2}, \frac{2}{\sigma_n^2} \right),$$

and the distance is

$$d_D(\theta_1, \theta_2) = \sqrt{2 \sum_{i=1}^n (\log B_i)^2}, \qquad B_i = \frac{\left|\left(\frac{\mu_{1i}}{\sqrt{2}}, \sigma_{1i}\right) - \left(\frac{\mu_{2i}}{\sqrt{2}}, -\sigma_{2i}\right)\right| + \left|\left(\frac{\mu_{1i}}{\sqrt{2}}, \sigma_{1i}\right) - \left(\frac{\mu_{2i}}{\sqrt{2}}, \sigma_{2i}\right)\right|}{\left|\left(\frac{\mu_{1i}}{\sqrt{2}}, \sigma_{1i}\right) - \left(\frac{\mu_{2i}}{\sqrt{2}}, -\sigma_{2i}\right)\right| - \left|\left(\frac{\mu_{1i}}{\sqrt{2}}, \sigma_{1i}\right) - \left(\frac{\mu_{2i}}{\sqrt{2}}, \sigma_{2i}\right)\right|},$$

where |·| denotes the Euclidean norm.

• The submanifold Mµ where µ is constant, with

$$d_\mu(\Sigma_1, \Sigma_2) = \sqrt{\frac{1}{2} \sum_{i=1}^n [\log(\lambda_i)]^2},$$

where λi are the eigenvalues of Σ1⁻¹Σ2.

• The submanifold MΣ where Σ is constant, with

$$d_\Sigma(\mu_1, \mu_2) = \sqrt{(\mu_1 - \mu_2)^t\, \Sigma^{-1} (\mu_1 - \mu_2)}.$$

Mµ is a geodesic submanifold, but MD and MΣ are not.

Figure 2: The mean µ2 = (16, 16) is fixed, λ1 = 1, λ2 = 10, and τ varies from 0 to π/4.
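The closed form on MD reduces to a per-coordinate half-plane computation. A minimal sketch in Python (the function names are ours, not from the paper), implementing the Bi formula above:

```python
import math

def fisher_rao_1d(mu1, sigma1, mu2, sigma2):
    """Fisher-Rao distance between the univariate normals N(mu1, sigma1^2)
    and N(mu2, sigma2^2), via the half-plane expression behind B_i."""
    u = (mu1 - mu2) / math.sqrt(2)
    a = math.hypot(u, sigma1 + sigma2)  # |(mu1/sqrt2, sigma1) - (mu2/sqrt2, -sigma2)|
    b = math.hypot(u, sigma1 - sigma2)  # |(mu1/sqrt2, sigma1) - (mu2/sqrt2,  sigma2)|
    return math.sqrt(2) * math.log((a + b) / (a - b))

def d_D(mu1, sig1, mu2, sig2):
    """Distance on the submanifold M_D of diagonal covariances:
    sqrt(2 * sum_i log(B_i)^2), assembled coordinate by coordinate."""
    total = 0.0
    for m1, s1, m2, s2 in zip(mu1, sig1, mu2, sig2):
        u = (m1 - m2) / math.sqrt(2)
        a = math.hypot(u, s1 + s2)
        b = math.hypot(u, s1 - s2)
        total += math.log((a + b) / (a - b)) ** 2
    return math.sqrt(2 * total)
```

For a pure change of scale, fisher_rao_1d(0, 1, 0, s) returns √2·log(s), the well-known hyperbolic half-plane distance along the σ-axis.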

LOWER AND UPPER BOUNDS

A lower bound [2]:

$$d_F(\theta_1, \theta_2) \geq LB(\theta_1, \theta_2) = \sqrt{\frac{1}{2} \sum_{i=1}^n [\log(\lambda_i)]^2},$$

where λi are the eigenvalues of S1⁻¹S2, with

$$S_i = \begin{pmatrix} \Sigma_i + \mu_i \mu_i^t & \mu_i \\ \mu_i^t & 1 \end{pmatrix}.$$

Upper bounds: upper bounds were derived in [2] by considering triangle inequalities. For θ1 = (µ1, Σ) and θ2 = (µ2, αΣ),

$$d_\alpha(\theta_1, \theta_2) = \sqrt{2\, \mathrm{arccosh}^2\!\left( \frac{\sqrt{\alpha}}{2} + \frac{1}{2\sqrt{\alpha}} + \frac{\delta^t \delta}{4\sqrt{\alpha}} \right) + \frac{n-1}{2} \log^2 \alpha},$$

where δ = Σ^(−1/2)(µ2 − µ1). For general θ1 = (µ1, Σ1) and θ2 = (µ2, Σ2), taking an intermediate point θα yields

$$d_F(\theta_1, \theta_2) \leq d_\alpha(\theta_1, \theta_\alpha) + d_\mu(\theta_\alpha, \theta_2).$$
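The lower bound LB only needs the eigenvalues of S1⁻¹S2. A sketch with NumPy, assuming the formula above (the function name is ours):

```python
import numpy as np

def lower_bound(mu1, Sigma1, mu2, Sigma2):
    """Lower bound LB of [2]: sqrt(0.5 * sum_i log(lambda_i)^2), where the
    lambda_i are the eigenvalues of S1^{-1} S2 for the augmented matrices S_i."""
    def S(mu, Sigma):
        mu = np.asarray(mu, dtype=float).reshape(-1, 1)
        return np.block([[Sigma + mu @ mu.T, mu],
                         [mu.T, np.ones((1, 1))]])
    # Each S_i is symmetric positive definite, so the eigenvalues of
    # S1^{-1} S2 are real and positive.
    lam = np.linalg.eigvals(np.linalg.solve(S(mu1, Sigma1), S(mu2, Sigma2)))
    return float(np.sqrt(0.5 * np.sum(np.log(lam.real) ** 2)))
```

Note that for µ1 = µ2 = 0 the augmented matrices are block diagonal, so LB coincides with dµ(Σ1, Σ2), which is the exact distance there since Mµ is geodesic.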

Upper bound U1: α = ‖Σ1^(−1/2) Σ2 Σ1^(−1/2)‖^(1/n).

Upper bound U2: α = argminα { dα(θ1, θα) + dµ(θα, θ2) }.

We propose a new upper bound, U3, based on special isometries of the manifold M and on the distance in the submanifold MD.

Proposition 1. For θ1 = (µ1, Σ1) and θ2 = (µ2, Σ2),

$$d_F(\theta_1, \theta_2) = d_F\big( (0, I_n), (\mu_3, D_3) \big), \quad (2)$$

where 0 is the zero mean vector, In is the identity matrix, D3 is the diagonal matrix of the eigenvalues of A = Σ1^(−1/2) Σ2 Σ1^(−1/2), Q is the orthogonal matrix whose columns are the corresponding eigenvectors of A (A = Q D3 Qt), and µ3 = Qt Σ1^(−1/2)(µ2 − µ1).

Proposition 2. The Fisher-Rao distance between two multivariate normal distributions θ1 = (µ1, Σ1) and θ2 = (µ2, Σ2) is upper bounded by

$$U3(\theta_1, \theta_2) = \sum_{i=1}^n \sqrt{2 (\log C_i)^2}, \quad (3)$$

where

$$C_i = \frac{\sqrt{|1 - D_{3i}|^2 + |\mu_{3i}|^2} + \sqrt{|D_{3i} + 1|^2 + |\mu_{3i}|^2}}{\sqrt{|D_{3i} + 1|^2 + |\mu_{3i}|^2} - \sqrt{|1 - D_{3i}|^2 + |\mu_{3i}|^2}},$$

µ3i are the coordinates of µ3 and D3i are the diagonal entries of D3, as established in Proposition 1.

Figure 4: The mean µ2 = (0.5, 0.5) and the angle τ = π/4 are fixed, λ1 = 1 and λ2 = 0.5 + 0.1j, where j varies from 0 to 10.
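Propositions 1 and 2 can be sketched directly in NumPy, assuming eq. (3) as written above (the function names are ours):

```python
import numpy as np

def reduce_to_canonical(mu1, Sigma1, mu2, Sigma2):
    """Proposition 1: isometrically map (theta1, theta2) to ((0, I_n), (mu3, D3))."""
    w, V = np.linalg.eigh(np.asarray(Sigma1, dtype=float))
    S1_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T           # Sigma1^{-1/2}
    A = S1_inv_sqrt @ np.asarray(Sigma2, dtype=float) @ S1_inv_sqrt
    d3, Q = np.linalg.eigh(A)                            # A = Q D3 Q^t
    mu3 = Q.T @ S1_inv_sqrt @ (np.asarray(mu2, float) - np.asarray(mu1, float))
    return mu3, d3

def U3(mu1, Sigma1, mu2, Sigma2):
    """Upper bound U3 of Proposition 2: sum_i sqrt(2 * log(C_i)^2)."""
    mu3, d3 = reduce_to_canonical(mu1, Sigma1, mu2, Sigma2)
    p = np.sqrt((1.0 - d3) ** 2 + mu3 ** 2)
    q = np.sqrt((1.0 + d3) ** 2 + mu3 ** 2)
    C = (q + p) / (q - p)                                # C_i >= 1, since q > p
    return float(np.sum(np.sqrt(2.0 * np.log(C) ** 2)))
```

For identical distributions every D3i = 1 and µ3i = 0, so Ci = 1 and U3 vanishes, as a distance bound must.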

Simulations for n = 2

We observe that U3 is noticeably tighter than U1 and U2, particularly when there is a large discrepancy between the eigenvalues and/or a rotation of the eigenvector frames.

REFERENCES

[1] S. Amari and H. Nagaoka, Methods of Information Geometry, Translations of Mathematical Monographs, Vol. 191, Am. Math. Soc., 2000.

[2] M. Calvo and J. M. Oller, Journal of Multivariate Analysis 35(2), 223-242, 1990; Statistics and Decisions 9, 119-138, 1991.

[3] S. I. R. Costa, S. A. Santos, and J. E. Strapasson, Fisher information distance: a geometrical reading, arXiv, (to appear) Discrete Applied Mathematics.

[4] F. Nielsen and F. Barbaresco (Eds.), Geometric Science of Information, First International Conference, GSI 2013, Paris, France, August 2013, Lecture Notes in Computer Science 8085, 2013.

Figure 3: The mean µ2 = (4, 4) and the angle τ = 0 are fixed, λ1 = k, λ2 = 10k, and k varies from 1 to 8.

Figure 1: The mean µ2 = (2, 2) is fixed, λ1 = 1, λ2 = 10, and τ varies from 0 to π/4.

Email address: [email protected] (Sueli I. R. Costa).

∗School of Applied Sciences, University of Campinas, Brazil †Institute of Mathematics, University of Campinas, Brazil

Financial support: