Information Intrinsic Geometric Flows

MaxEnt’06, Paris , July 2006

Information Intrinsic Geometric Flows : Kähler-Ricci & Calabi Flows on Siegel & Hyper-Abelian Metrics of Complex Autoregressive Models Frédéric BARBARESCO, Thales Air Defence, 7/9 rue des Mathurins F-92223 Bagneux, France E-mail : [email protected] , Phone : 33.(0)1.40.84.20.04

1. Preamble

Geometric flow theory is cross-fertilized by diverse elements coming from pure mathematics (geometry, algebra, analysis, PDE) and mathematical physics (calculus of variations, General Relativity, Einstein manifolds, String Theory). Its foundation rests mainly on Riemannian geometry, as explained by M. Berger in a recent panoramic view of this discipline [Berger]; on its extension to complex manifolds, Erich Kähler's geometry [Kähler1], vaunted for its unabated vitality by J.P. Bourguignon [Bourguignon] in [Kähler2]; and on minimal surface theory, recently synthesized by F. Hélein [Hélein]. This paper aims to initiate studies applying intrinsic geometric flows within the framework of information geometry. More specifically, after introducing the information metric deduced from the Fisher matrix for Complex Autoregressive (CAR) models (the Siegel metric, and the Hyper-Abelian metric derived from the entropic Kähler potential), we study the asymptotic behaviour of the PARCOR parameters (reflection coefficients of CAR models) driven by the intrinsic information-geometric Kähler-Ricci and Calabi flows. These flows can be used in different contexts to define a distance between CAR models, interpreted as the length of geodesics on the entropy manifold (e.g. the distance between plane curves parametrized by CAR models).

2. Siegel Metric for Complex Autoregressive Models

Chentsov defined the main axioms of information geometry. In this theory, we consider families of parametric density functions $G_\Theta = \{p(\cdot/\theta) : \theta \in \Theta\}$ with $\Theta = [\theta_1 \cdots \theta_n]^T$, from which we can define a Riemannian manifold structure by means of the Fisher information matrix $[g_{ij}(\theta)]_{ij}$:

$$g_{ij}(\theta) = E_\theta\left[\frac{\partial \ln p(\cdot/\theta)}{\partial\theta_i}\,\frac{\partial \ln p(\cdot/\theta)}{\partial\theta_j^*}\right], \quad \text{with the Riemannian metric} \quad ds^2 = \sum_{i,j=1}^{n} g_{ij}(\theta)\,d\theta_i\,d\theta_j^*$$

This metric can also be introduced naturally through a Taylor expansion of the Kullback divergence:

$$K(\theta,\tilde\theta)\Big|_{\tilde\theta=\theta+d\theta} \cong K(\theta,\theta) + \left[\frac{\partial K(\theta,\tilde\theta)}{\partial\tilde\theta}\right]^{+}_{\tilde\theta=\theta}(\tilde\theta-\theta) + \frac{1}{2}\,(\tilde\theta-\theta)^{+}\left[\frac{\partial^2 K(\theta,\tilde\theta)}{\partial\tilde\theta\,\partial\tilde\theta^*}\right]_{\tilde\theta=\theta}(\tilde\theta-\theta) \cong \frac{1}{2}\sum_{i,j} g_{ij}(\theta)\,d\theta_i\,d\theta_j^*$$

It is easy to show that this Fisher metric is equivalent to the Siegel metric, introduced by Siegel in the 1960s in the framework of symplectic geometry. Indeed, consider a complex multivariate Gaussian law:

$$p(X/R_n,m_n) = (2\pi)^{-n}\,|R_n|^{-1}\,e^{-\mathrm{Tr}[\hat R_n R_n^{-1}]} \quad \text{with} \quad \hat R_n = (X-m_n)(X-m_n)^{+} \quad \text{such that} \quad E[\hat R_n] = R_n$$

It is well known that the Fisher information matrix is then given by:

$$g_{ij}(\theta) = -\mathrm{Tr}\left[\partial_i R_n\,\partial_j R_n^{-1}\right] + \partial_i m_n^{+}\,R_n^{-1}\,\partial_j m_n$$

In the following, we only consider random processes with zero mean, $m_n = E[X] = 0$. Applying the relation $R_n R_n^{-1} = I_n \Rightarrow \partial R_n = -R_n\,\partial R_n^{-1}\,R_n$, the Fisher matrix reduces to:

$$g_{ij}(\theta) = \mathrm{Tr}\left[(R_n\,\partial_i R_n^{-1})(R_n\,\partial_j R_n^{-1})\right]$$

with the associated Riemannian metric:

$$ds^2 = \sum_{i,j} g_{ij}(\theta)\,d\theta_i\,d\theta_j = \mathrm{Tr}\left[R_n\Big(\sum_i \partial_i R_n^{-1}\,d\theta_i\Big)\,R_n\Big(\sum_j \partial_j R_n^{-1}\,d\theta_j\Big)\right] \quad \text{with} \quad dR_n^{-1} = \sum_k \partial_k R_n^{-1}\,d\theta_k$$

We can then observe that this is exactly the Siegel metric $ds^2 = \mathrm{Tr}\left[(R_n\,dR_n^{-1})^2\right]$, introduced by Carl Ludwig Siegel in his book « Symplectic Geometry ». This metric is invariant under the action of the group $(GL_n(\mathbb{C}), \cdot)$: $R_n \to W_n R_n W_n^{+}$, $W_n \in GL_n(\mathbb{C})$, and its geodesics are given by:

$$\dot S(s) = S(s)\,H \ \text{ with } \ S(0) = R_1^{-1}, \qquad S(s) = R_1^{-1/2}\,e^{H s}\,R_1^{-1/2}$$

From this metric, if we take the Frobenius norm $\|X\| = \sqrt{\langle X, X\rangle}$ with $\langle X, Y\rangle = \mathrm{Tr}[X\,Y^{+}]$, the distance between two covariance matrices $R_n^{(1)-1}$ and $R_n^{(2)-1} \in P_{n+1}(\mathbb{C})$ is given by the Jensen distance:

$$d^2\left(R_n^{(1)-1}, R_n^{(2)-1}\right) = \left\|\ln\left(R_n^{(1)1/2}\,R_n^{(2)-1}\,R_n^{(1)1/2}\right)\right\|^2 = \sum_{i=1}^{n} \ln^2 \lambda_i^{(n)} \quad \text{with} \quad \det\left(R_n^{(1)1/2+}\,R_n^{(2)-1}\,R_n^{(1)1/2} - \lambda_i^{(n)} I_n\right) = 0$$
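The distance above can be evaluated numerically. The following is a minimal sketch, assuming NumPy is available and that $R^{1/2}$ is taken as the Hermitian square root; the function names `siegel_distance` and `random_hpd` are illustrative and do not come from the paper. It also checks numerically the invariance under $R \to W R W^{+}$ stated earlier.

```python
import numpy as np

def siegel_distance(R1, R2):
    """sqrt(sum_i ln^2(lambda_i)), lambda_i = eigenvalues of R1^{1/2+} R2^{-1} R1^{1/2}."""
    # Hermitian square root of R1 (assumption: R1 is Hermitian positive definite).
    w, U = np.linalg.eigh(R1)
    R1_half = (U * np.sqrt(w)) @ U.conj().T
    Omega = R1_half.conj().T @ np.linalg.inv(R2) @ R1_half
    Omega = 0.5 * (Omega + Omega.conj().T)  # remove round-off asymmetry
    lam = np.linalg.eigvalsh(Omega)         # real, positive eigenvalues
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

def random_hpd(n, rng):
    """Hypothetical helper: a random Hermitian positive definite matrix."""
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return A @ A.conj().T + n * np.eye(n)

rng = np.random.default_rng(0)
R1, R2 = random_hpd(4, rng), random_hpd(4, rng)
# A random complex matrix is invertible with probability 1.
W = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

d = siegel_distance(R1, R2)
d_congruent = siegel_distance(W @ R1 @ W.conj().T, W @ R2 @ W.conj().T)
print(d, d_congruent)  # the two values should agree up to round-off
```

The invariance holds because the eigenvalues of $R^{(1)1/2+} R^{(2)-1} R^{(1)1/2}$ are those of $R^{(2)-1} R^{(1)}$, which is only changed by a similarity transform under the group action.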

In the case of Complex Autoregressive models, we can exploit the specific block structure of the covariance matrices and prove that:

$$\Omega_n = R_n^{(1)1/2+}\,R_n^{(2)-1}\,R_n^{(1)1/2} = \begin{bmatrix} \beta_{n-1} & \beta_{n-1}\,W_{n-1}^{+} \\ \beta_{n-1}\,W_{n-1} & \Omega_{n-1} + \beta_{n-1}\,W_{n-1}\,W_{n-1}^{+} \end{bmatrix}$$

$$\text{with} \quad W_{n-1} = \sqrt{\alpha_{n-1}^{(1)}}\;R_{n-1}^{(1)1/2+}\left[A_{n-1}^{(2)} - A_{n-1}^{(1)}\right] \quad \text{and} \quad \beta_{n-1} = \frac{\alpha_{n-1}^{(2)}}{\alpha_{n-1}^{(1)}}$$

$$\text{where} \quad \alpha_n^{-1} = \left[1 - |\mu_n|^2\right]\alpha_{n-1}^{-1} \quad \text{and} \quad A_n = \begin{bmatrix} A_{n-1} \\ 0 \end{bmatrix} + \mu_n \begin{bmatrix} A_{n-1}^{(-)} \\ 1 \end{bmatrix}$$

The Jensen metric can then be computed recursively in the CAR model order, using the following equations, which give the interleaving eigenvalues $\Lambda_n = \mathrm{diag}\{\cdots \lambda_i^{(n)} \cdots\}$ of $\Omega_n = R_n^{(1)1/2+}\,R_n^{(2)-1}\,R_n^{(1)1/2}$ at successive orders:

$$F^{(n)}\left(\lambda_k^{(n)}\right) = \lambda_k^{(n)} - \beta_{n-1} + \beta_{n-1}\,\lambda_k^{(n)} \sum_{i=1}^{n-1} \frac{\left|W_{n-1}^{+}\,X_i^{(n-1)}\right|^2}{\lambda_i^{(n-1)} - \lambda_k^{(n)}} = 0$$

$$\text{and} \quad X_k^{(n)} = X_{k,1}^{(n)} \begin{bmatrix} 1 \\ -\lambda_k^{(n)}\,U_{n-1}\left(\Lambda_{n-1} - \lambda_k^{(n)} I_{n-1}\right)^{-1} U_{n-1}^{+}\,W_{n-1} \end{bmatrix}$$

Again using the block structure of the covariance matrices, we can deduce a recursive expression of the metric:

$$ds_n^2 = ds_{n-1}^2 + \left(\frac{d\alpha_{n-1}}{\alpha_{n-1}}\right)^2 + \alpha_{n-1}\,dA_{n-1}^{+}\,R_{n-1}\,dA_{n-1}$$

From this, we define a new hyperbolic distance between CAR models as a lower bound of this metric:

$$ds_n^2 > \sum_{k=0}^{n-1}\left(\frac{d\alpha_k}{\alpha_k}\right)^2 + \sum_{k=1}^{n-1} \frac{|d\mu_k|^2}{\left(1 - |\mu_k|^2\right)^2}$$
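The secular equation for the eigenvalues of $\Omega_n$ can be checked numerically. The sketch below assumes NumPy and uses randomly generated stand-ins for $\Omega_{n-1}$, $\beta_{n-1}$ and $W_{n-1}$ (these values are hypothetical test data, not CAR-model quantities). It assembles $\Omega_n$ from its block structure, evaluates $F^{(n)}$ at the directly computed eigenvalues, and tests the interleaving of successive-order eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5  # order of Omega_n; Omega_{n-1} is (n-1) x (n-1)

# Hypothetical test data: Hermitian positive definite Omega_{n-1},
# a scalar beta_{n-1} > 0 and a complex vector W_{n-1}.
A = rng.standard_normal((n - 1, n - 1)) + 1j * rng.standard_normal((n - 1, n - 1))
Omega_prev = A @ A.conj().T + np.eye(n - 1)
beta = 0.7
W = rng.standard_normal(n - 1) + 1j * rng.standard_normal(n - 1)

# Assemble Omega_n from its block structure.
Omega = np.empty((n, n), dtype=complex)
Omega[0, 0] = beta
Omega[0, 1:] = beta * W.conj()
Omega[1:, 0] = beta * W
Omega[1:, 1:] = Omega_prev + beta * np.outer(W, W.conj())

lam_prev, X_prev = np.linalg.eigh(Omega_prev)  # eigenpairs at order n-1
lam = np.linalg.eigvalsh(Omega)                # eigenvalues at order n (ascending)

def F(x):
    """Secular function F^(n)(x) built from the order n-1 eigenpairs."""
    proj = np.abs(X_prev.conj().T @ W) ** 2    # |W^+ X_i|^2 for each i
    return x - beta + beta * x * np.sum(proj / (lam_prev - x))

residuals = np.array([F(x) for x in lam])
# Interleaving: lam[0] < lam_prev[0] < lam[1] < ... < lam_prev[-1] < lam[-1]
interlaced = bool(np.all(lam[:-1] < lam_prev) and np.all(lam_prev < lam[1:]))
print(np.max(np.abs(residuals)), interlaced)
```

In a recursive implementation one would instead solve $F^{(n)}(\lambda) = 0$ by root bracketing between consecutive $\lambda_i^{(n-1)}$, which is what the interleaving property makes possible.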

3. Erich Kähler Geometry with Information Metric Based on the Entropic Hyper-Abelian Kähler Potential

The natural extension of Riemannian geometry to complex manifolds was introduced in a seminal paper by Erich Kähler in the 1930s. This geometric framework applies readily to the definition of an information metric. Let $M_n$ be a complex manifold of dimension $n$. We can associate with it a Kählerian metric, locally defined by its positive definite Riemannian form:

$$ds^2 = 2\sum_{i,j=1}^{n} g_{i\bar j}\,dz^i\,d\bar z^j \quad \text{with} \quad [g_{i\bar j}]_{i,j} \ \text{a Hermitian positive definite matrix.}$$

The Kähler assumption is that there exists a Kähler potential $\Phi$ such that $g_{i\bar j} = \dfrac{\partial^2\Phi}{\partial z^i\,\partial\bar z^j}$.

The classical tensors of Riemannian geometry extend through the following expressions:

$$\Gamma_{ij}^{k} = \sum_{l=1}^{n} g^{k\bar l}\,\frac{\partial g_{i\bar l}}{\partial z^j}, \qquad \Gamma_{\bar i\bar j}^{\bar k} = \sum_{l=1}^{n} g^{l\bar k}\,\frac{\partial g_{l\bar i}}{\partial\bar z^j} \qquad \text{and} \qquad R_{i\bar j k\bar l} = -\frac{\partial^2 g_{i\bar j}}{\partial z^k\,\partial\bar z^l} + \sum_{p,q=1}^{n} g^{p\bar q}\,\frac{\partial g_{i\bar q}}{\partial z^k}\,\frac{\partial g_{p\bar j}}{\partial\bar z^l}$$

The main relation, given by Erich Kähler, is that the Ricci tensor can be expressed as

$$R_{i\bar j} = -\frac{\partial^2 \log(\det g_{k\bar l})}{\partial z^i\,\partial\bar z^j}$$

with the associated scalar curvature $R = \sum_{k,l=1}^{n} g^{k\bar l}\,R_{k\bar l}$.

One important geometric flow, in physics and mathematics, is the Kähler-Ricci flow, which drives the evolution of the metric by:

$$\frac{\partial g_{i\bar j}}{\partial t} = -R_{i\bar j} + \frac{1}{n}\,R\,g_{i\bar j}$$

This flow converges to a Kähler-Einstein metric $R_{i\bar j} = k_0\,g_{i\bar j}$, which is equivalent to

$$-\frac{\partial^2 \log(\det g_{k\bar l})}{\partial z^i\,\partial\bar z^j} = k_0\,\frac{\partial^2\Phi}{\partial z^i\,\partial\bar z^j}$$

known as the Monge-Ampère equation: $\det(g_{k\bar l}) = |\psi|^2\,e^{-k_0\Phi}$, where $\Phi$ is a Kähler potential and $\psi$ an unspecified holomorphic function. This function can be reduced to unity: if $k_0 \neq 0$, by choosing a new potential $\Phi$; if $k_0 = 0$, by selecting local holomorphic coordinates such that the volume $\det(g_{k\bar l})$ is reduced to 1 (vanishing of the Ricci tensor is the existence condition for such a coordinate system).

In the case of Complex Autoregressive (CAR) models, we choose as Kähler potential $\Phi$, with $g_{i\bar j} = \dfrac{\partial^2\Phi}{\partial z^i\,\partial\bar z^j}$, the entropy of the process expressed in terms of the PARCOR coefficients (reflection coefficients) in the unit Poincaré polydisk $\{z \,/\, |z_k| < 1 \ \forall k = 1, \ldots, n\}$. The Kähler potential is then given by:

$$\Phi = \sum_{k=1}^{n-1} \rho_k \ln\left[1 - |z_k|^2\right] = \ln K_D(z,z), \quad \text{with Bergman kernel} \quad K_D(z,z) = \prod_{k=1}^{n-1}\left(1 - |z_k|^2\right)^{\rho_k}$$

Very surprisingly, this case was the first example of a potential studied by Erich Kähler in his seminal paper. He named it the Hyper-Abelian case, in contrast with the other case he studied, the Hyper-Fuchsian case $\Phi = \rho\,\ln\left(1 - \sum_{k=1}^{n} |z_k|^2\right)$ in the unit hyper-ball $\left\{z \,/\, \sum_{k=1}^{n} |z_k|^2 < 1\right\}$.

This choice of the entropy of the CAR model as Kähler potential can be justified by remarking that the entropy Hessian along one direction in the tangent plane of the parametric manifold is a positive definite form that can be considered as a Kählerian differential metric:

$$g_{ij}^{(H)}(\theta) = -\frac{\partial^2 H(P_\theta)}{\partial\theta_i\,\partial\theta_j} \ \Rightarrow \ ds_H^2 = \sum_{i,j=1}^{n} g_{ij}^{(H)}(\theta)\,d\theta_i\,d\theta_j$$

This is proved by considering the following $\gamma$-entropy:

$$H_\gamma(p) = -\int \gamma[p(x)]\,dx$$

If we differentiate $H_\gamma$ in the direction $f$, then:

$$dH_\gamma(p : f) = \frac{d}{dt}H_\gamma(p + tf)\Big|_{t=0} = -\int \gamma'[p(x)]\,f(x)\,dx$$

and the second derivative in the direction $g$ is:

$$d^2H_\gamma(p : f, g) = -\int \gamma''[p(x)]\,f(x)\,g(x)\,dx$$

The Hessian is then given by:

$$\Delta_f H_\gamma(p) = d^2H_\gamma(p : f, f) = -\int \gamma''[p(x)]\,f(x)^2\,dx$$

Let $F_\Omega = \{p(\cdot/\theta) \in P_\gamma : \theta \in \Omega\}$ be a manifold in the parametric space, and

$$dp = dp(\theta) = \sum_{i=1}^{n} \frac{\partial p(\cdot/\theta)}{\partial\theta_i}\,d\theta_i$$

We can develop the Hessian relation as:

$$\Delta_\gamma H_\gamma(p) = d^2\{H_\gamma(p)\}(\theta) = -\int \gamma''(p)\,[dp]^2\,dx$$

As long as $\gamma$ is convex, we find the final result $ds_\gamma^2 = -\Delta_\gamma H_\gamma(p)$ with:

$$ds_\gamma^2 = \sum_{i,j=1}^{n} g_{ij}^{(\gamma)}(\theta)\,d\theta_i\,d\theta_j \quad \text{with} \quad g_{ij}^{(\gamma)}(\theta) = \int \gamma''(p)\,\frac{\partial p}{\partial\theta_i}\,\frac{\partial p}{\partial\theta_j}\,dx$$

By choosing

$$\gamma_\alpha(x) = \begin{cases} (\alpha - 1)^{-1}\left(x^\alpha - x\right) & \alpha \neq 1 \\ x\,\ln(x) & \alpha = 1 \end{cases}$$

we have:

$$g_{ij}^{(\alpha)}(\theta) = \int p^\alpha\,\frac{\partial \ln p}{\partial\theta_i}\,\frac{\partial \ln p}{\partial\theta_j}\,dx$$

So, in the case of Complex Autoregressive models, with the Wishart density as previously, the entropy is given by:

$$H_n = -\int P(X_n/m_n, R_n)\,\ln[P(X_n/m_n, R_n)]\,dX_n = \ln|R_n| + n\,\ln(\pi e)$$

If we use the block structure of the covariance matrix in the case of CAR models, together with the relation

$$\text{if} \quad G = \begin{bmatrix} a & V^{+} \\ W & B \end{bmatrix} \quad \text{then} \quad |G| = a\,\left|B - a^{-1}\,W\,V^{+}\right|$$

we obtain the entropy expressed in terms of the PARCOR coefficients:


$$H_n = \sum_{k=1}^{n-1} (k - n)\,\ln\left[1 - |\mu_k|^2\right] + n\,\ln\left[\pi e\,\alpha_0^{-1}\right] \quad \text{with} \quad \alpha_0^{-1} = P_0 = \frac{1}{n}\sum_{k=1}^{n} |x_k|^2 \quad \text{and} \quad X_n = [x_1 \cdots x_n]^T$$

The Kähler metric is then given by the Hessian of the entropy, where the entropy is taken as the Kähler potential. Let $\theta^{(n)} = [P_0 \ \mu_1 \cdots \mu_{n-1}]^T = [\theta_1^{(n)} \cdots \theta_n^{(n)}]^T$; we then have:

$$g_{11} = n\,P_0^{-2}, \qquad g_{ij} = \frac{(n - i)\,\delta_{ij}}{\left(1 - |\mu_i|^2\right)^2}$$

From which we deduce the final expression of the Kähler metric for this Hyper-Abelian case:

$$ds_n^2 = n\left(\frac{dP_0}{P_0}\right)^2 + \sum_{i=1}^{n-1} (n - i)\,\frac{|d\mu_i|^2}{\left(1 - |\mu_i|^2\right)^2}$$
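The Hyper-Abelian metric is a weighted product of a logarithmic metric on $P_0$ and Poincaré disk metrics on each $\mu_i$, so a geodesic distance between two CAR models can be written in closed form. The sketch below is an illustration under that assumption and is not a formula given in the text: it uses the standard fact that for the metric $|dz|^2/(1-|z|^2)^2$ the distance between $z$ and $w$ is $\operatorname{atanh}\left|\frac{z-w}{1-z\bar w}\right|$. The function name `car_distance` is hypothetical.

```python
import math

def car_distance(P0_a, mu_a, P0_b, mu_b):
    """Geodesic distance for ds^2 = n (dP0/P0)^2 + sum_i (n-i) |dmu_i|^2 / (1-|mu_i|^2)^2.

    mu_a, mu_b: sequences of n-1 complex reflection coefficients with |mu| < 1.
    Assumption (not from the paper): squared distances add over the factors
    of the product metric.
    """
    n = len(mu_a) + 1
    d2 = n * math.log(P0_b / P0_a) ** 2
    for i, (za, zb) in enumerate(zip(mu_a, mu_b), start=1):
        # Moebius-invariant pseudo-hyperbolic distance in the unit disk.
        delta = abs((za - zb) / (1 - za * zb.conjugate()))
        d2 += (n - i) * math.atanh(delta) ** 2
    return math.sqrt(d2)

# Two order-3 models (n = 3): same power, different reflection coefficients.
model_a = (1.0, [0.5 + 0.1j, -0.2j])
model_b = (1.0, [0.3 - 0.4j, 0.6 + 0.0j])
print(car_distance(*model_a, *model_b))
```

The weights $(n-i)$ mean that low-order reflection coefficients contribute more to the distance than high-order ones.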

4. Information Kähler-Ricci & Calabi Flows for Complex Autoregressive Models

The first intrinsic geometric flow is the Ricci flow. Its historical root can be found in Hilbert's work on General Relativity, where the principle of least action is defined with the tools of the calculus of variations. The Einstein equation was derived by Hilbert from the minimisation of a functional, the "Hilbert action" $S$, defined as the integral of the scalar curvature $R$ over the manifold $M^n$:

$$S(g) = \int_{M^n} R\,\sqrt{\det(g)}\,d^n x = \int_{M^n} R\,d\eta \quad \text{with volume} \quad V(g) = \int_{M^n} d\eta \quad \text{and} \quad R = \sum_{\mu}\sum_{\nu} g^{\mu\nu}\,R_{\mu\nu}$$

the scalar curvature, defined by means of the Ricci tensor $R_{\mu\nu}$ and the metric tensor $g^{-1} = [g^{ij}]$. A fundamental theorem of Hilbert states that for $n \geq 3$, $S(g)$ is minimal with $V(g) = \text{const}$ if $R(g)$ is constant and $g$ is an Einstein metric: $R_{ij} = \frac{1}{n}\,R\,g_{ij}$. The most natural geometric flow converging to an Einstein metric is therefore given by:

$$\frac{\partial g_{ij}}{\partial t} = 2\left[-R_{ij} + \frac{1}{n}\,R\,g_{ij}\right]$$

Unfortunately, this flow exhibits convergence problems in finite time. That is the main reason why R. Hamilton introduced the following Ricci flow:

$$\frac{\partial g_{ij}}{\partial t} = 2\left[-R_{ij} + \frac{1}{n}\,r\,g_{ij}\right] \quad \text{with} \quad r = \left(\int_{M^n} R\,d\eta\right) \Big/ \left(\int_{M^n} d\eta\right)$$

where $R$ has been replaced by $r$, the mean scalar curvature on the manifold, or equivalently, after a change of coordinates:

$$\frac{\partial g_{ij}}{\partial t} = -2\,R_{ij} \quad \forall i, j$$

This flow can be interpreted as a Fourier heat operator acting on the metric $g$, by using the isothermal coordinates introduced by G. Darmois. In such a local isothermal coordinate system $\{x^i\}_{i=1,2}$, the following quantities vanish:

$$F^k = \Delta_g x^k = -\sum_{\lambda,\mu} g^{\lambda\mu}\,\Gamma_{\lambda\mu}^{k} = 0 \ \text{ for } k = 1, 2, \quad \text{with} \quad \Delta_g f = \sum_{\lambda,\mu} g^{\lambda\mu}\left(\partial^2_{\lambda\mu} f - \sum_k \Gamma_{\lambda\mu}^{k}\,\partial_k f\right)$$

the Laplace-Beltrami operator. More specifically, A. Lichnerowicz proved [Lichnerowicz] that the Ricci tensor can be expressed as $R_{ij} = -G_{ij} - L_{ij}$ with:

$$G_{ij} = \frac{1}{2}\sum_{\lambda,\mu} g^{\lambda\mu}\,\partial^2_{\lambda\mu}\,g_{ij} + H_{ij} \quad \text{and} \quad L_{ij} = \frac{1}{2}\left[\sum_\mu g_{i\mu}\,\partial_j F^\mu + \sum_\mu g_{j\mu}\,\partial_i F^\mu\right]$$

Then, in isothermal coordinates, we have:

$$-2\,R_{ij} = \sum_k\sum_l g^{kl}\,\frac{\partial^2 g_{ij}}{\partial x^k\,\partial x^l} + Q_{ij}\left(g^{-1}, \partial g\right)$$

So the Ricci flow can be interpreted as a diffusion equation acting on the metric $g$:

$$\frac{\partial g_{ij}}{\partial t} = \Delta_g\,g_{ij}$$


This geometric flow has been extended to the framework of Kählerian geometry with the Kähler-Ricci flow:

$$\frac{\partial g_{i\bar j}}{\partial t} = -R_{i\bar j} \quad \forall i, j$$

If we use the metric $g$ defined previously, derived from the entropic Kähler potential in the case of Complex Autoregressive models, then we can express the Ricci tensor $[R_{k\bar l}]_{k,l} = -\left[\dfrac{\partial^2 \log(\det g_{i\bar j})}{\partial z_k\,\partial\bar z_l}\right]_{k,l}$ as:

$$R_{kl} = -2\,\delta_{kl}\left(1 - |\mu_k|^2\right)^{-2} \ \text{ for } k = 2, \ldots, n-1 \quad \text{and} \quad R_{11} = -2\,P_0^{-2}$$

In the same way, we can deduce the scalar curvature $R = \sum_{k,l} g^{kl}\,R_{kl}$ of CAR models:

$$R = -2\left[n^{-1} + \sum_{j=1}^{n-1} (n - j)^{-1}\right] = -2\left[\sum_{j=0}^{n-1} (n - j)^{-1}\right]$$

(this curvature diverges when $n$ tends to infinity). We observe that we have an Einstein metric, but defined more generally as:

$$[R_{ij}] = B^{(n)}\,[g_{ij}] \quad \text{with} \quad R = \mathrm{Tr}\left[B^{(n)}\right] \quad \text{where} \quad B^{(n)} = -2\,\mathrm{diag}\left\{\ldots, (n - i)^{-1}, \ldots\right\}$$
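The scalar curvature above is $-2$ times the $n$-th harmonic number, which makes its slow (logarithmic) divergence easy to see. The following sketch evaluates the trace of $B^{(n)}$ exactly with rational arithmetic; the function names are illustrative only.

```python
from fractions import Fraction

def ricci_to_metric_ratios(n):
    """Diagonal of B^(n): -2/(n-i) for i = 0..n-1 (i = 0 is the P0 direction)."""
    return [Fraction(-2, n - i) for i in range(n)]

def scalar_curvature(n):
    """R = Tr[B^(n)] = -2 * sum_{j=0}^{n-1} 1/(n-j)."""
    return sum(ricci_to_metric_ratios(n))

for order in (1, 3, 10, 100):
    print(order, float(scalar_curvature(order)))
```

Because the ratios $R_{ii}/g_{ii}$ differ from one coordinate to the next, the metric is Einstein only in the generalized, coordinate-wise sense stated in the text.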

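If we examine the Kähler-Ricci flow acting on the PARCOR parameters, we have:

$$\frac{\partial \ln\left(1 - |\mu_i|^2\right)}{\partial t} = -\frac{1}{(n - i)} \quad \text{and} \quad \frac{\partial \ln P_0}{\partial t} = \frac{1}{n}$$

From which we obtain the asymptotic behaviour of the PARCOR coefficients:

$$|\mu_i(t)|^2 = 1 - \left(1 - |\mu_i(0)|^2\right)\,e^{-\frac{t}{(n - i)}} \ \xrightarrow[t \to \infty]{} \ |\mu_i(t)|^2 = 1$$

that is, each coefficient converges to the unit circle.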

We can introduce another intrinsic geometric flow, introduced by Calabi in the framework of Kähler geometry [Calabi1]. This flow is related to the notion of "extremal" metrics. These metrics are obtained by a Calabi flow and are deduced from a new functional that depends on the square of the scalar curvature, and no longer on the scalar curvature itself:

$$\Theta(g) = \int_M R^2\,d\eta$$

According to the Schwarz inequality, a Kähler-Einstein metric is also an extremal metric that minimises $\Theta(g)$. The solution is defined as the steady state of the PDE:

$$\frac{\partial\psi}{\partial t} = R_\psi - \bar R$$

with $\psi$ the Kähler potential associated with the Kähler metric $g_{i\bar j} = \dfrac{\partial^2\psi}{\partial z^i\,\partial\bar z^j}$, $R_\psi$ the scalar curvature and $\bar R$ its mean value on the manifold. This can be used to define the shortest path between two parametric models. Indeed, consider the path $\psi(t)$ $(0 \leq t < 1)$ and assume the existence of Kähler potentials $\{\varphi(s,t) : 0 \leq t < 1\}$ driven by the Calabi flow:

$$\frac{\partial\varphi}{\partial s} = R(\varphi) - \bar R \quad \text{and} \quad \varphi(0, t) = \psi(t)$$

The length of the path $L(s)$ is then given by:

$$L(s) = \int_0^1 \left[\int_V \left(\frac{\partial\varphi(s,t)}{\partial t}\right)^2 d\eta_{\varphi(s,t)}\right]^{1/2} dt$$

If we apply the same approach as previously to CAR models, the Calabi flow acts on the entropy $-H(p)$ taken as the Kähler potential:

$$-\sum_{k=1}^{n-1} (k - n)\,\frac{\partial \ln\left[1 - |\mu_k|^2\right]}{\partial t} - n\,\frac{\partial \ln[\pi e\,P_0]}{\partial t} = -2\left[\sum_{k=1}^{n-1} \frac{1}{n - k} + \frac{1}{n}\right]$$

We then deduce the asymptotic behaviour of the PARCOR coefficients subjected to the Calabi flow:

$$\frac{\partial \ln\left(1 - |\mu_k|^2\right)}{\partial t} = -\frac{2}{(n - k)^2} \quad \text{with} \quad \frac{\partial \ln P_0}{\partial t} = \frac{2}{n^2}$$

We observe, as previously, that the Calabi flow drives the PARCOR coefficients to the unit circle.
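Integrating the Calabi rate gives $|\mu_k(t)|^2 = 1 - (1 - |\mu_k(0)|^2)\,e^{-2t/(n-k)^2}$, so the two flows can be compared through their time constants: $n - k$ for the Kähler-Ricci flow and $(n-k)^2/2$ for the Calabi flow. The sketch below makes this comparison; the integrated expression follows from the stated rate, while the function names and sample values are illustrative assumptions.

```python
import math

def efold_time_ricci(n, k):
    """Time constant of 1 - |mu_k|^2 under the Kaehler-Ricci flow."""
    return n - k

def efold_time_calabi(n, k):
    """Time constant of 1 - |mu_k|^2 under the Calabi flow."""
    return (n - k) ** 2 / 2

def parcor_sq_calabi(mu0_sq, k, n, t):
    """|mu_k(t)|^2 obtained by integrating d ln(1-|mu_k|^2)/dt = -2/(n-k)^2."""
    return 1.0 - (1.0 - mu0_sq) * math.exp(-2.0 * t / (n - k) ** 2)

n = 5
for k in range(1, n):
    print(k, efold_time_ricci(n, k), efold_time_calabi(n, k),
          round(parcor_sq_calabi(0.25, k, n, 2.0), 4))
```

Under these expressions the Calabi flow is faster than the Kähler-Ricci flow only for the highest-order coefficient ($n - k = 1$), equal for $n - k = 2$, and slower for lower-order coefficients.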


5. Conclusion

In the final paper, we will illustrate the use of these metrics and intrinsic geometric flows to define a new robust distance between planar shapes parameterised by Complex Autoregressive models. This can be used for planar shape recognition or classification.

6. References

[Alpay] D. Alpay, « Algorithme de Schur, espaces à noyau reproduisant et théorie des systèmes », Panoramas et synthèse, n°6, Société Mathématique de France, 1998
[Bakas] Ioannis Bakas, « The Algebraic Structure of Geometric Flows in Two Dimensions », Institute of Physics, SISSA, October 2005
[Besson] G. Besson, « Une nouvelle approche de l’étude de la topologie des variétés de dimension 3 d’après R. Hamilton et G. Perelman », Séminaire Bourbaki, 57ème année, 2004-2005, n°947, Juin 2005
[Bessieres] L. Bessieres, « Conjecture de Poincaré : la preuve de R. Hamilton et G. Perelman », Gazette des Mathématiciens, Octobre 2005
[Barbaresco1] F. Barbaresco, « Modèles Autoregressifs : du coefficient de reflexion à la géométrie Riemannienne de l’information », Actes du colloque en l’honneur du professeur B. Picinbono, revue Traitement du Signal, numéro spécial vol.15, n°6, pp.553-561, May 1999
[Barbaresco2] F. Barbaresco, « Calcul des variations et analyse spectrale : équations de Fourier et de Burgers pour modèles autorégressifs régularisés », Revue Traitement du Signal, 2001
[Barbaresco3] F. Barbaresco, « Calculus of Variations & Regularized Spectral Estimation », Coll. MAXENT’2000, Gif-sur-Yvette, France, Jul. 2000, published by American Institute of Physics
[Barbaresco4] F. Barbaresco, « Etude et extension des flots de Ricci, Kähler-Ricci et Calabi dans le cadre du traitement de l’image et de la géométrie de l’information », Conf. Gretsi, Louvain la Neuve, Sept. 2005
[Berger] M. Berger, « Panoramic View of Riemannian Geometry », Springer, 2004
[Bourguignon] J.P. Bourguignon, « The Unabated Vitality of Kählerian Geometry », edited in [Kähler2]
[Calabi1] E. Calabi, « Extremal Kähler metrics », Seminar Ann. of Math. Stud., n°102, Princeton University Press, Princeton, N.J., pp.259-290, 1982
[Calabi2] E. Calabi, « Métriques Kählériennes et fibrés holomorphes », Ann. Scient. Ec. Norm. Sup. Paris, 4ème série, t.12, pp.269-294, 1979
[Calvo1] M. Calvo & J. Oller, « A Distance Between Multivariate Normal Distributions in an Embedding into the Siegel Group », Journal of Multi. Analysis, vol.35, n°2, pp.223-242, Nov. 1990
[Calvo2] M. Calvo & J.M. Oller, « A Distance between elliptical distributions based in an embedding into the Siegel Group », Journal of Computational & Applied Mathematics, n°145, pp.319-334, 2002
[Gauduchon] P. Gauduchon, « Calabi’s extremal Kähler metrics : an elementary introduction », Ecole Polytechnique
[Hélein] F. Hélein, Postface à « Analyse et géométrie », Coll. Géométrie au XXème siècle : histoire et horizons, J. Konneiher & al Ed., 2005
[Kähler1] E. Kähler, « Über eine bemerkenswerte Hermitesche Metrik », Abh. Math. Sem. Hamburg Univ., n°9, pp.173-186, 1932
[Kähler2] « Kähler Erich, Mathematical Works », edited by R. Berndt and O. Riemenschneider, Berlin, Walter de Gruyter, ix, 2003
[Lafferty] J. Lafferty & G. Lebanon, « Diffusion Kernels on Statistical Manifolds », CMU-CS-04-101, work supported by NSF, January 2004
[Lichnerowicz] A. Lichnerowicz, « Sur les équations relativistes de la gravitation », Bulletin de la SMF, t.80, pp.237-251, 1952
[Siegel1] Carl Ludwig Siegel, « Symplectic Geometry », Academic Press, New-York, 1964
[Siegel2] Carl Ludwig Siegel, « Topics in Complex Function Theory », Wiley Interscience Ed., 1969
[Weil] A. Weil, « Introduction à l’étude des variétés Kählériennes », Paris, Hermann, Act. Sc. Ind., Vol. 1267, pp.30-43 and pp.83-102, 1958
[-] International Conference « Geometric Flows : Theory & Computation », IPAM, UCLA, USA, February 3-4, 2004