Consistency of statistics in infinite dimensional ... - Loic Devilliers

Consistency of statistics in infinite dimensional quotient spaces
PhD defence of Loïc Devilliers, November 20, 2017
Prepared at Inria Université Côte d’Azur, CMAP École Polytechnique & ENS Paris-Saclay

Jury:
Stéphanie Allassonnière, Professor, Université Paris Descartes, Co-advisor
Marc Arnaudon, Professor, Université de Bordeaux, Reviewer
Charles Bouveyron, Professor, Université Côte d’Azur, President
Stephan Huckemann, Professor, University of Göttingen, Reviewer
Xavier Pennec, Senior Researcher, Université Côte d’Azur, Inria, Advisor
Stefan Sommer, Associate Professor, University of Copenhagen, Reviewer
Alain Trouvé, Professor, ENS Paris-Saclay, Examiner

Computational Anatomy: Heart Template Estimation

$t_0$: template, one heart, modeling the others through a diffeomorphism $\varphi_i$. Diffeomorphisms change the shape but not the topology. [Mansi 2009]

$$(\hat{t}_0, \hat{\varphi}_1, \dots, \hat{\varphi}_n) = \operatorname*{argmin}_{t, \varphi_1, \dots, \varphi_n} \frac{1}{n} \sum_{i=1}^{n} \left( \|t \circ \varphi_i - \mathrm{Patient}_i\|^2 + \mathrm{Regularization}(\varphi_i) \right)$$

Computational Anatomy: Brain Template Estimation

[Guimond 1999, Joshi 2004 etc.], image from [Hamou 2016]

$$(\hat{t}_0, \hat{\varphi}_1, \dots, \hat{\varphi}_n) = \operatorname*{argmin}_{t, \varphi_1, \dots, \varphi_n} \frac{1}{n} \sum_{i=1}^{n} \left( \|t \circ \varphi_i - Y_i\|^2 + \mathrm{Regularization}(\varphi_i) \right)$$

Template estimation is a tool to statistically analyze diseases.

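Template Estimation with Surfaces

[Figure courtesy of Pierre Roussillon]

$$(\hat{t}_0, \hat{\varphi}_1, \dots, \hat{\varphi}_n) = \operatorname*{argmin}_{t, \varphi_1, \dots, \varphi_n} \frac{1}{n} \sum_{i=1}^{n} \left( \|t \circ \varphi_i - S_i\|^2 + \mathrm{Regularization}(\varphi_i) \right)$$

Goal of this work: study the statistical properties of template estimation.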

Example: Periodic (discretized) signals

Simple example to introduce the generative model, in $M = \mathrm{Per}_1(\mathbb{R}, \mathbb{R})$.

[Figure: three plots on [0, 1]. Template: $t_0$. Transformed template by a translation: $t_0 \circ \varphi$. Template and deformed template with added noise: $t_0 \circ \varphi + \varepsilon$.]

Note that for the $L^2$ norm, we have $\|t_0 \circ \varphi\| = \|t_0\|$.

The noise is, for instance, Gaussian noise on each point of the discretization grid. Goal: study the statistical properties of the estimator of $t_0$.

Generative model

A group $G$ acts on an ambient space $M$: for $g \in G$, $m \in M$, $g \cdot m = gm \in M$.

Observable variable:

$$Y = \Phi \cdot t_0 + \sigma\varepsilon \ \text{(forward model)} \quad \text{or} \quad Y = \Phi \cdot (t_0 + \sigma\varepsilon) \ \text{(backward model)}$$

• $\Phi$ a random variable in $G$.
• $t_0$ the template in $M$.
• $\sigma > 0$ the noise level.
• $\varepsilon$ a standardized noise in $M$: $E(\varepsilon) = 0$, $E(\|\varepsilon\|^2) = 1$.
• $\Phi$ and $\varepsilon$ are independent.

Inverse problem: given the observed variable $Y$, how can we estimate the template $t_0$?
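To make the generative model concrete, here is a minimal Python sketch for the periodic-signal example, under assumptions that are not in the slides: the group is the cyclic group of translations on an N-point grid, the random group element is drawn uniformly, the norm is the Euclidean norm on the grid, and the noise (written `eps`) is Gaussian with variance 1/N per point so that it is standardized. All function names are hypothetical.

```python
import math
import random


def shift(signal, k):
    """Cyclic translation by k grid points (an isometric action of Z/NZ)."""
    n = len(signal)
    return [signal[(j - k) % n] for j in range(n)]


def norm(signal):
    """Euclidean norm on the discretization grid (assumed choice of norm)."""
    return math.sqrt(sum(x * x for x in signal))


def standardized_noise(n, rng):
    """Gaussian noise with E(eps) = 0 and E(||eps||^2) = 1 (variance 1/n per point)."""
    scale = 1.0 / math.sqrt(n)
    return [rng.gauss(0.0, scale) for _ in range(n)]


def sample_forward(t0, sigma, rng):
    """Forward model: Y = Phi . t0 + sigma * eps."""
    n = len(t0)
    k = rng.randrange(n)  # assumption: Phi is uniform on the group
    eps = standardized_noise(n, rng)
    return [a + sigma * e for a, e in zip(shift(t0, k), eps)]


def sample_backward(t0, sigma, rng):
    """Backward model: Y = Phi . (t0 + sigma * eps)."""
    n = len(t0)
    k = rng.randrange(n)  # assumption: Phi is uniform on the group
    eps = standardized_noise(n, rng)
    return shift([a + sigma * e for a, e in zip(t0, eps)], k)
```

The two samplers differ only in whether the noise is added before or after the group element is applied.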

Minimization (max-max algorithm)

Estimation by minimizing the variance? The variance at $m \in M$:

$$F(m) = E\left( \inf_{g \in G} \left( \|m - g \cdot Y\|^2 + \mathrm{Regularization}(g) \right) \right)$$

The empirical variance at $m \in M$ for an $n$-sample $Y_1, \dots, Y_n$:

$$F_n(m) = \inf_{g_1, \dots, g_n \in G} \left( \frac{1}{n} \sum_{i=1}^{n} \|m - g_i \cdot Y_i\|^2 \right)$$

Max-max algorithm (also known as Coordinate Descent, GPA, etc.): alternating minimization over these two steps:
• Step 1: $g_i \leftarrow$ registration of $Y_i$ to $m$, for all $i$.
• Step 2: $m \leftarrow \frac{1}{n} \sum_{i=1}^{n} g_i \cdot Y_i$.
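The two alternating steps can be sketched in Python for the translated-signals example. This is an illustration under assumptions, not the implementation used for the experiments: the group is the cyclic group of grid translations, registration is done by exhaustive search, there is no regularization term, and all names are hypothetical.

```python
import random


def shift(signal, k):
    """Cyclic translation by k grid points."""
    n = len(signal)
    return [signal[(j - k) % n] for j in range(n)]


def sq_dist(a, b):
    """Squared Euclidean distance between two discretized signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def register(y, m):
    """Step 1: element of the orbit of y closest to m (exhaustive search)."""
    return min((shift(y, k) for k in range(len(y))), key=lambda z: sq_dist(m, z))


def empirical_variance(m, samples):
    """F_n(m): mean squared distance from m to the orbits of the samples."""
    return sum(sq_dist(m, register(y, m)) for y in samples) / len(samples)


def max_max(samples, start, n_iter=50):
    """Alternate registration (Step 1) and averaging (Step 2)."""
    m = list(start)
    for _ in range(n_iter):
        registered = [register(y, m) for y in samples]                  # Step 1
        new_m = [sum(col) / len(samples) for col in zip(*registered)]   # Step 2
        if new_m == m:  # fixed point reached
            break
        m = new_m
    return m
```

Each iteration cannot increase the empirical variance: Step 2 minimizes the sum of squared distances for the current registrations, and re-registering can only lower it further. That explains convergence to a local minimum, not consistency.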

Example of a failure of max-max algorithm

On the previous example of translated functions: sample of size $10^5$ of discretized functions with 64 points, $\sigma = 10$ [Allassonnière 2007]. Starting point: the template itself.

[Figure: template and current point of the algorithm on [0, 1], shown at the 1st, 2nd, 3rd, 4th, 5th, 10th, 50th and 79th iterations.]

Convergence to a local minimum without approximation.

Template estimation with different sample sizes

Starting point: random point.

[Figure: template and max-max output on [0, 1], for sample sizes 2e+05, 4e+05, 6e+05, 8e+05 and 1e+06.]

Inconsistency of the estimator?

Previous works and contributions

Previous works on consistency:
• [Kent & Mardia 1995], [Le 1998] and others, restricted to simple transformations such as rotation, translation, sometimes scaling:
  • Consistency with scaling (modification of the algorithm: $Y \leftarrow \frac{Y}{\|Y\|}$).
  • Inconsistency without scaling.
• [Huckemann 2012] Template and estimated template lie on different strata, for a general action on a finite-dimensional manifold.
• [Miolane 2017] Consistency Bias $= \frac{\sigma^2}{2} C + o(\sigma)$ as $\sigma \to 0$, on a finite-dimensional manifold, for Gaussian noise.

Goal of this PhD work: proving and quantifying this inconsistency in infinite-dimensional spaces.

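Different hypotheses for the action

[Diagram: nested rectangles, listed here from the largest (most general) to the smallest (most restrictive).]
• General Action + Regularization Term
• General Action
• Invariant Distance: $d_M(g \cdot x, g \cdot y) = d_M(x, y)$
• Isometric Action: $\|g \cdot x\| = \|x\|$

Diagram labels: What we Want for Application; Part I; Part II.

The most restrictive hypothesis = the smallest rectangle.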

Table of contents

Introduction
Part I: Inconsistency for Isometric Action
  a) Interpretation of the Max-Max Algorithm with the Fréchet Mean in Quotient Spaces
  b) Proving the Inconsistency for Isometric Action
  c) Quantification of Consistency Bias for Isometric Action
Part II: Inconsistency for Non isometric Action
Conclusion

Definitions

Definition of Quotient Space
Orbit of $m \in M$ = set of all the points reachable from $m$: $[m] = \{g \cdot m, \ g \in G\}$.
Quotient space = set of all orbits: $Q = M/G = \{[m], \ m \in M\}$.

Definition of Invariant Distance
$d_M(m, m') = d_M(g \cdot m, g \cdot m')$.

Particular case of Invariant Distance: Isometric Action in Hilbert Space
Isometric action: $M$ a Hilbert space, $m \mapsto g \cdot m$ linear, $\|g \cdot m\| = \|m\|$.
Proof: $\|g \cdot m - g \cdot m'\| = \|g \cdot (m - m')\| = \|m - m'\|$.

Classical Proposition: Quotient space = Metric Space
$d_M$ invariant $\implies$ quotient distance: $d_Q([m], [n]) = \inf_{g \in G} d_M(m, g \cdot n)$.
In fact, $d_Q$ = pseudo-distance.
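As an illustration of the quotient distance, the following Python sketch computes it for the cyclic group of translations acting on discretized signals, an assumed finite example where the infimum is a minimum over N shifts. The function names are hypothetical.

```python
import math


def shift(signal, k):
    """Cyclic translation by k grid points."""
    n = len(signal)
    return [signal[(j - k) % n] for j in range(n)]


def dist(a, b):
    """Euclidean distance d_M on the ambient space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def quotient_distance(a, b):
    """d_Q([a], [b]) = min over the group of d_M(a, g . b)."""
    return min(dist(a, shift(b, k)) for k in range(len(b)))
```

Two distinct signals in the same orbit are at quotient distance zero, and because the ambient distance is invariant, the value does not depend on the representatives chosen in each orbit.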

Fréchet Mean in Metric Spaces

Definition of Fréchet mean in metric spaces
Fréchet mean of $Z$, a random variable in a metric space $(\mathcal{X}, d_{\mathcal{X}})$:

$$\mathrm{FM}(Z) = \operatorname*{argmin}_{m \in \mathcal{X}} E\left( d_{\mathcal{X}}^2(m, Z) \right)$$

Empirical Fréchet mean of an $n$-sample $Z_1, \dots, Z_n$:

$$\mathrm{EFM}(Z_1, \dots, Z_n) = \operatorname*{argmin}_{m \in \mathcal{X}} \frac{1}{n} \sum_{i=1}^{n} d_{\mathcal{X}}^2(m, Z_i)$$

Example of Hilbert spaces: for a Hilbert space $(M, \|\cdot\|)$, $\mathrm{FM}(Z) = E(Z)$.
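The Hilbert-space example follows from a bias-variance decomposition. A short derivation, assuming $E\|Z\|^2 < +\infty$ so that every term is finite:

```latex
\begin{align*}
E\|m - Z\|^2
  &= E\|(m - E(Z)) - (Z - E(Z))\|^2 \\
  &= \|m - E(Z)\|^2 - 2\langle m - E(Z),\, E(Z - E(Z)) \rangle + E\|Z - E(Z)\|^2 \\
  &= \|m - E(Z)\|^2 + E\|Z - E(Z)\|^2 .
\end{align*}
```

The second term does not depend on $m$, so the unique minimizer is $m = E(Z)$.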

Consistency of Estimation

$$F_n(m) = \frac{1}{n} \sum_{i=1}^{n} \inf_{g_i \in G} \|m - g_i \cdot Y_i\|^2 = \frac{1}{n} \sum_{i=1}^{n} d_Q^2([m], [Y_i])$$

Minimizing Empirical Variance = Empirical Fréchet Mean (EFM) in $Q$.

Law of large numbers for the sets of (empirical) Fréchet means
$Y, (Y_n)_n$ i.i.d. variables. Thanks to [Ziezold 1977] (if $Q$ is separable):

$$\lim_{n \to +\infty} \mathrm{EFM}([Y_1], \dots, [Y_n]) \subset \mathrm{FM}([Y]) \quad \text{a.s.}$$

$[t_0]$ not a Fréchet mean of $[Y]$ $\implies$ Inconsistency.

Definition of consistency bias
Consistency bias (CB): distance between $[t_0]$ and $\mathrm{FM}([Y])$.

Simple example: the action of rotation

Considering $SO(n)$ acting on $\mathbb{R}^n$ by rotation.

[Figure: two orbits (circles) through $m$ and $Y$ around 0, the quotient space $Q \simeq \mathbb{R}^+$, and the distance between orbits $d_Q([m], [Y])$.]

$F(m) = E((\|Y\| - \|m\|)^2)$, Fréchet mean: $\|m_\star\| = E(\|Y\|)$.

$Y = \Phi \cdot (t_0 + \sigma\varepsilon) \implies \|m_\star\| = E(\|t_0 + \sigma\varepsilon\|) > \|t_0\|$ (in general) $\implies$ inconsistency, and the consistency bias is computed in [Miolane 2017].

Example too simple: the infima are removed, which is not always possible.
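The strict inequality in the rotation example can be checked numerically. The Python sketch below is a Monte Carlo estimate under assumptions not stated in the slides: isotropic Gaussian noise with variance 1/d per coordinate (so that it is standardized), and a hypothetical function name. Since rotations preserve the norm, the random rotation does not need to be simulated.

```python
import math
import random


def mean_norm(t0, sigma, n_samples, rng):
    """Monte Carlo estimate of E(||t0 + sigma * eps||), the norm of the Frechet mean."""
    d = len(t0)
    scale = 1.0 / math.sqrt(d)  # variance 1/d per coordinate, so E(||eps||^2) = 1
    total = 0.0
    for _ in range(n_samples):
        x = [t + sigma * rng.gauss(0.0, scale) for t in t0]
        total += math.sqrt(sum(c * c for c in x))
    return total / n_samples


rng = random.Random(0)
estimate = mean_norm([1.0, 0.0], 1.0, 20000, rng)  # to be compared with ||t0|| = 1
```

With a template of norm 1 in the plane and noise level 1, the estimate lies clearly above 1 and below $\sqrt{2}$, the bound given by Jensen's inequality.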

Why is isometric action simple?

Our first result on consistency holds only for isometric action. An isometric action simplifies the squared quotient distance:

$$d_Q([a], [b])^2 = \inf_{g \in G} \|a - g \cdot b\|^2 = \|a\|^2 + \inf_{g \in G} \left( -2\langle a, g \cdot b \rangle + \|g \cdot b\|^2 \right) = \|a\|^2 + \|b\|^2 + \inf_{g \in G} \left( -2\langle a, g \cdot b \rangle \right)$$

Useful equality for the proof and the quantification of the consistency.
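The equality can be verified numerically on a small isometric action. This Python sketch uses cyclic translations of discretized signals as an assumed example and hypothetical function names. It compares the direct minimization with the expanded form, in which only the inner product depends on the group element.

```python
import random


def shift(signal, k):
    """Cyclic translation by k grid points (isometric for the Euclidean norm)."""
    n = len(signal)
    return [signal[(j - k) % n] for j in range(n)]


def inner(a, b):
    """Euclidean inner product."""
    return sum(x * y for x, y in zip(a, b))


def quotient_sq_direct(a, b):
    """inf over g of ||a - g . b||^2, computed term by term."""
    return min(sum((x - y) ** 2 for x, y in zip(a, shift(b, k))) for k in range(len(b)))


def quotient_sq_expanded(a, b):
    """||a||^2 + ||b||^2 - 2 sup over g of <a, g . b>."""
    best = max(inner(a, shift(b, k)) for k in range(len(b)))
    return inner(a, a) + inner(b, b) - 2.0 * best
```

The expanded form shows that registering one signal to another amounts to maximizing a correlation over the group.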

Part I, b) Proving the Inconsistency for Isometric Action

Inconsistency for isometric action

[Figure: the points $t_0$, $g t_0$, $g' t_0$ of the orbit around 0. Cone of the template, $\mathrm{Cone}(t_0)$ (in gray), and support of $t_0 + \sigma\varepsilon$ (dotted disk).]

Theorem: Inconsistency for isometric action in Hilbert space
Observable variable: $Y = \Phi \cdot (t_0 + \sigma\varepsilon)$. If

$$P(t_0 + \sigma\varepsilon \notin \mathrm{Cone}(t_0)) > 0$$

then $[t_0]$ is not a Fréchet mean of $[Y]$ $\implies$ Inconsistency.

Sketch of the proof (finite group = more visual proof)

For $G$ finite, $R(X)$ is the registration of $X = t_0 + \sigma\varepsilon$ to $t_0$. Gradient of the variance:

$$\nabla F(t_0) = 2\left( E(X) - E(R(X)) \right)$$

[Figure, successive panels: the points $t_0$, $g t_0$, $g' t_0$ of the orbit of the template around 0, $\mathrm{Cone}(t_0)$, and a point $X$, with $E(X) = t_0$.]
• Points in green = orbit of $X$ ($X$, $g_1 X$, $g_2 X$, $g_3 X$).
• $R(X)$: point in the orbit of $X$ in $\mathrm{Cone}(t_0)$.
• If $\tilde{X} \in \mathrm{Cone}(t_0)$, then $R(\tilde{X}) = \tilde{X}$.
• Graphic representation of $Z = E(R(X))$. The part in grid-line = folded points.

$E(X) = t_0$, $\nabla F(t_0) \neq 0$ $\implies$ Inconsistency.
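The gradient formula can be illustrated with a Monte Carlo estimate for a small finite group. The Python sketch below assumes the four rotations of the plane by multiples of 90 degrees, isotropic Gaussian noise with variance 1/2 per coordinate, and hypothetical function names. For an isometric action, registering $X$ to $t_0$ means picking the group element that maximizes the inner product with $t_0$.

```python
import math
import random


def rotate90(x, k):
    """Rotation of the plane by k quarter turns."""
    a, b = x
    for _ in range(k % 4):
        a, b = -b, a
    return (a, b)


def register(x, t0):
    """R(X): point of the orbit of x closest to t0 (largest inner product with t0)."""
    return max((rotate90(x, k) for k in range(4)),
               key=lambda z: z[0] * t0[0] + z[1] * t0[1])


def gradient_at_template(t0, sigma, n_samples, rng):
    """Monte Carlo estimate of grad F(t0) = 2 (E(X) - E(R(X)))."""
    scale = 1.0 / math.sqrt(2.0)  # variance 1/2 per coordinate, so E(||eps||^2) = 1
    sum_x = [0.0, 0.0]
    sum_r = [0.0, 0.0]
    for _ in range(n_samples):
        x = (t0[0] + sigma * rng.gauss(0.0, scale),
             t0[1] + sigma * rng.gauss(0.0, scale))
        r = register(x, t0)
        for i in range(2):
            sum_x[i] += x[i]
            sum_r[i] += r[i]
    return [2.0 * (sum_x[i] - sum_r[i]) / n_samples for i in range(2)]


rng = random.Random(0)
grad = gradient_at_template((1.0, 0.0), 1.0, 20000, rng)
```

The component of the estimated gradient along $t_0$ is strictly negative, because folding the samples into the cone can only increase their inner product with $t_0$. Without noise, the gradient vanishes.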

Sketch of the proof (finite or infinite group)

When the group is not finite, differentiate the variance. Two possible methods to show inconsistency:
• Find $\operatorname{argmin} F$, and see if $t_0 \in \operatorname{argmin} F$: difficult issue.
• Find a point $x$ such that $F(x) < F(t_0)$: we found a point $\lambda t_0$ with $F(\lambda t_0) < F(t_0)$ $\implies$ inconsistent.

Be careful: a priori $[\lambda t_0]$ is not a Fréchet mean of $[Y]$.

How often is this condition with the cone fulfilled?

A group $G$ acts isometrically on a Hilbert space. $[t_0]$ a manifold, $T_{t_0}[t_0]$ the affine tangent space of $[t_0]$ at $t_0$, $T_{t_0}[t_0]^\perp$ the normal space of $[t_0]$ at $t_0$.

Proposition: being inconsistent for smooth orbits.

$$P(\varepsilon \notin T_{t_0}[t_0]^\perp) > 0 \implies \text{inconsistency}$$

[Figure: the orbit $[t_0]$ through $t_0$ and $g \cdot t_0$ around 0, the tangent space $T_{t_0}[t_0]$, the normal space $T_{t_0}[t_0]^\perp$, and a point $y$.]

$y \notin T_{t_0}[t_0]^\perp$, therefore $y$ is closer to $g \cdot t_0$ for some $g \in G$ than to $t_0$ itself. In conclusion, $y$ in the support of $X = t_0 + \sigma\varepsilon$ $\implies$ inconsistency.

Part I, c) Quantification of Consistency Bias for Isometric Action

Consistency bias when the noise level tends to infinity

Definition of consistency bias
Consistency bias (CB): distance between the template $t_0$ and $\operatorname{argmin} F$.

Definition of fixed points
A fixed point $m \in M$: for all $g \in G$, $g \cdot m = m$.

Proposition: consistency bias is asymptotically linear when $\sigma \to +\infty$
$G$ acts isometrically on a Hilbert space. We take $Y = \Phi \cdot t_0 + \sigma\varepsilon$. If the support of the noise $\varepsilon$ is not included in the set of fixed points, then:

$$\mathrm{CB} = \sigma K + o(\sigma) \ \text{as } \sigma \to +\infty, \quad \text{where } K = \sup_{\|v\| = 1} E\left( \sup_{g \in G} \langle v, g \cdot \varepsilon \rangle \right) > 0.$$

Moreover, $\lim_{t_0 \to 0} \mathrm{CB} = \sigma K$.
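As a worked instance of the constant $K$, take the rotation example, $SO(n)$ acting on $\mathbb{R}^n$ with $n \geq 2$ (an assumption made here so that each orbit is a full sphere). For a unit vector $v$, the supremum over the group is reached when $g \cdot \varepsilon$ is aligned with $v$:

```latex
\begin{align*}
\sup_{g \in SO(n)} \langle v, g \cdot \varepsilon \rangle &= \|\varepsilon\|
  \quad \text{for every } v \text{ with } \|v\| = 1, \\
K &= \sup_{\|v\| = 1} E\left( \sup_{g \in SO(n)} \langle v, g \cdot \varepsilon \rangle \right)
   = E(\|\varepsilon\|) \leq \sqrt{E(\|\varepsilon\|^2)} = 1 .
\end{align*}
```

Here the only fixed point is 0, so $K > 0$ as soon as the noise is not almost surely zero.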

Sketch of the proof

$$F(m) = E\left( \inf_{g \in G} \|m - g \cdot Y\|^2 \right) \quad \text{where } Y = \Phi \cdot t_0 + \sigma\varepsilon.$$

• Minimization of $F(\lambda v)$ w.r.t. $\lambda \geq 0$, $\|v\| = 1$. Then, for $m_\star \in \operatorname{argmin} F$:

$$\|m_\star\| = \sup_{\|v\| = 1} E\left( \sup_{g \in G} \langle v, g \cdot Y \rangle \right) = \sup_{\|v\| = 1} E\left( \sup_{g \in G} \left( \langle v, g\Phi t_0 \rangle + \langle v, \sigma g \varepsilon \rangle \right) \right)$$

Difficult (impossible?) to compute.
• Cauchy-Schwarz inequality: $-\|t_0\| + \sigma K \leq \|m_\star\| \leq \|t_0\| + \sigma K$.
• By the triangle inequality: $-2\|t_0\| + \sigma K \leq \|m_\star - t_0\| \leq \sigma K + 2\|t_0\|$.

$K > 0$ (because the support of $\varepsilon$ is not included in the set of fixed points).
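The first step of the sketch, the minimization along a ray, can be written out as follows. It uses only the isometry of the action, and the shorthand $a(v)$ is introduced here for illustration:

```latex
\begin{align*}
F(\lambda v)
  &= E\left( \inf_{g \in G} \|\lambda v - g \cdot Y\|^2 \right)
   = \lambda^2 - 2\lambda\, a(v) + E(\|Y\|^2),
  \qquad a(v) = E\left( \sup_{g \in G} \langle v, g \cdot Y \rangle \right), \\
\operatorname*{argmin}_{\lambda \geq 0} F(\lambda v) &= \max(a(v), 0),
  \qquad
\min_{\lambda \geq 0} F(\lambda v) = E(\|Y\|^2) - \max(a(v), 0)^2 .
\end{align*}
```

Minimizing $F$ over $m = \lambda v$ therefore amounts to maximizing $a(v)$ over unit vectors, which gives the expression of the norm of a minimizer.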

Table of contents

Introduction
Part I: Inconsistency for Isometric Action
Part II: Inconsistency for Non isometric Action
  a) Inconsistency for Invariant Distance
  b) Inconsistency for Non Invariant Distance
Conclusion

Variation of the Isotropy Group Due to the Noise

Definition: Isotropy Group (or Stabilizer)
$\mathrm{Iso}(m) = \{g \in G \ \text{s.t.} \ g \cdot m = m\}$

Example: Reparametrization of functions
$\varphi : [0, 1] \to [0, 1]$ homeomorphism, $f : [0, 1] \to \mathbb{R}$, $(\varphi, f) \mapsto \varphi \cdot f = f \circ \varphi$.

[Figure, two plots on [0, 1]. Left: $t_0$, constant map on $D = [0.2, 0.8]$, with $\mathrm{Iso}(t_0) = \{\varphi \mid \varphi|_{D^c} = \mathrm{Id}\} \supsetneq \{\mathrm{Id}\}$. Right: $t_0$ + noise, with $\mathrm{Iso}(t_0 + \text{noise}) = \{\mathrm{Id}\}$.]
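The same loss of symmetry under noise can be seen on a finite analogue. The Python sketch below replaces reparametrizations by cyclic translations of a discretized signal, which is an assumption made for illustration and not the example of the slide. The function names are hypothetical.

```python
import random


def shift(signal, k):
    """Cyclic translation by k grid points."""
    n = len(signal)
    return [signal[(j - k) % n] for j in range(n)]


def isotropy_group(signal, tol=1e-12):
    """Shifts k such that k . signal = signal, up to the tolerance tol."""
    n = len(signal)
    return [k for k in range(n)
            if all(abs(a - b) <= tol for a, b in zip(shift(signal, k), signal))]


rng = random.Random(0)
t0 = [1.0, 2.0, 1.0, 2.0]                             # invariant under a shift by 2
noisy = [v + 0.1 * rng.gauss(0.0, 1.0) for v in t0]   # the noise breaks the symmetry
```

The template has a non-trivial isotropy group, the shifts by 0 and 2, while the noisy signal is fixed by the identity only.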

Stability Theorem Implies Inconsistency

Stability Theorem in Hilbert spaces
$G$ a compact group acting continuously on $M$ a Hilbert space, $d_M$ is invariant. Observable variable $Y$ in $M$. If $P(\mathrm{Iso}(Y) = \{e_G\}) > 0$ ($e_G$: neutral element in $G$), let

$$m_\star \in \operatorname*{argmin}_{m \in M} F(m) = \operatorname*{argmin}_{m \in M} E\left( \inf_{g \in G} d_M(m, g \cdot Y)^2 \right).$$

If $R(Y)$ is a measurable variable registering $Y$ to $m_\star$, then $\mathrm{Iso}(m_\star) = \{e_G\}$.

This implies inconsistency if $\mathrm{Iso}(t_0) \neq \{e_G\}$.

The Stability Theorem is also true in complete finite-dimensional Riemannian manifolds, with a proof of the existence of the measurable variable $R(Y)$ [Huckemann 2012].

Part II, b) Inconsistency for Non Invariant Distance

Non Invariant Distance

Non invariant distance used in applications: reparametrization by a diffeomorphism $\varphi$ of $f_i : \mathbb{R}^d \to \mathbb{R}$ (images $d = 2$, or signals $d = 1$):

$$\|f_1 \circ \varphi - f_2 \circ \varphi\|^2 \neq \|f_1 - f_2\|^2.$$

$G$ acting on a Hilbert space: a priori, possibility to define a distance in the quotient space.

For $Y = \Phi \cdot t_0 + \sigma\varepsilon$, minimizing $F(m) = E\left( \inf_{g \in G} \|Y - g \cdot m\|^2 \right)$: still possible.
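The lack of invariance under reparametrization is easy to check numerically. This Python sketch makes illustrative choices that are not from the slides: signals on [0, 1], the diffeomorphism $\varphi(x) = (x + x^2)/2$, the functions $f_1(x) = x$ and $f_2 = 0$, and a midpoint-rule approximation of the squared $L^2$ norm.

```python
def l2_sq(f, n=2000):
    """Squared L2 norm of f on [0, 1], approximated by the midpoint rule."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) ** 2 for i in range(n))


def phi(x):
    """An increasing diffeomorphism of [0, 1] (derivative between 0.5 and 1.5)."""
    return 0.5 * (x + x * x)


def f1(x):
    return x


def f2(x):
    return 0.0


before = l2_sq(lambda x: f1(x) - f2(x))            # ||f1 - f2||^2, exactly 1/3
after = l2_sq(lambda x: f1(phi(x)) - f2(phi(x)))   # ||f1 o phi - f2 o phi||^2, exactly 31/120
```

The two values differ (1/3 against 31/120), so the $L^2$ distance is not invariant under this action, even though the action itself is linear.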

How to deal with non isometric action?

[Figure, two panels. Isometric: orbit of the template $t_0$ around 0, in gray the noise of level $\sigma$. General Action with Bounded Orbit: bounded orbit of the template $t_0$ around 0, in gray the noise of level $\sigma$.]

Isometric case: we can find a point $\lambda t_0$ such that $F(\lambda t_0) < F(t_0)$.

General action with bounded orbit: so why not in this case?

Inconsistency for non invariant distance

Inconsistency: a subgroup of $G$ acts isometrically
A group $G$ acting on a Hilbert space, $[t_0]$ is bounded. We note:

$$\theta(G) = \frac{1}{\|t_0\|} E\left( \sup_{g \in G} \langle g \cdot t_0, \varepsilon \rangle \right)$$

If $H$ is a subgroup of $G$, $H$ acts isometrically and $\theta(H) > 0$, then inconsistency for $\sigma > \sigma_c = f([t_0], \theta(G), \theta(H), t_0)$ for a certain positive function $f$.

Example: $G$ = group of diffeomorphisms, $H$ = rotations.

Inconsistency for non invariant distance

Inconsistency for $G$ acting linearly + Regularization
A group $G$ acting linearly on a Hilbert space, $[t_0]$ is bounded. We note:

$$\theta(G) = \frac{1}{\|t_0\|} E\left( \sup_{g \in G} \langle g \cdot t_0, \varepsilon \rangle \right).$$

The template estimation is performed by minimizing

$$F(m) = E\left( \inf_{g \in G} \left( \|g \cdot m - Y\|^2 + \mathrm{Regularization}(g) \right) \right),$$

where Regularization is bounded. If $\theta(G) > 0$, then inconsistency for $\sigma > \sigma_c = \tilde{f}([t_0], \theta(G), t_0)$ for a certain positive function $\tilde{f}$.

Action of reparametrization of functions: $\varphi$ a diffeomorphism, $(\varphi, f) \mapsto f \circ \varphi$ is a linear action. Proof: $(a f_1 + f_2) \circ \varphi = a f_1 \circ \varphi + f_2 \circ \varphi$.

Conclusion

Summary of contributions

• It is proved that template estimation with the Fréchet mean in the quotient space is not consistent for isometric action.
• It is possible to quantify the consistency bias for $\sigma \to +\infty$.
• We proved a stability theorem which implies the inconsistency in Hilbert space for invariant distance.
• The inconsistency can also be proved for non-isometric action, but only for $\sigma$ high enough.

This work has been presented in a workshop (MFCA 2015), published in a conference (IPMI 2017) and in two journal papers (SIIMS 2017 and Entropy 2017).

What are the possible extensions?

• Extending the existence of the measurable variable which registers data to a certain point.
• Proving the inconsistency for non invariant distance for all $\sigma$.
• Providing an asymptotic behaviour of the consistency bias when $\sigma \to 0$.

Thank you for your attention! Any questions?

Example 2: action of diffeomorphisms on functions

[Figure: three plots on [0, 1]. Template: $t_0$. Deformed template: $t_0 \circ \varphi$. Template and deformed template with added noise: $t_0 \circ \varphi + \varepsilon$.]

SRVF: the norm of $\frac{\dot{f}}{\sqrt{|\dot{f}|}}$ is invariant under the action of $\varphi$, commonly used.

Example 3: Consistency and smoothness

Example of translated functions: sample size $10^6$ of discretized functions with 64 points, $\sigma = 10$.

[Figure: template and max-max output on [0, 1].]

Example 4: Local minima

[Figure: plot over the range 0.4 to 0.6 on the horizontal axis and 1 to 1.4 on the vertical axis.]