On the estimation of a parameter with incomplete knowledge on a nuisance parameter

Ali Mohammad-Djafari and Adel Mohammadpour
Laboratoire des Signaux et Systèmes (CNRS-ESE-UPS)
Supélec, Plateau de Moulon, 91192 Gif-sur-Yvette, FRANCE
[email protected]
http://djafari.free.fr


Contents
• Problem statement
• Classical methods
  – Maximum likelihood
  – Bayesian approach
• New approach


Problem statement
• We have a set of data: x_1, \ldots, x_n
• We have assigned a parametric probability law to them:
  pdf f_{X|V,\theta}(x|\nu,\theta) and cdf F_{X|V,\theta}(x|\nu,\theta)
• There are two sets of parameters: \theta and \nu
• We are interested in \theta; \nu is a nuisance parameter
• We may or may not have some prior knowledge on \theta
• We have some prior knowledge on \nu and we want to account for it when inferring on \theta


Case 1: Perfect knowledge of \nu: \nu = \nu_0

• Maximum Likelihood (ML):

  \hat{\theta}_{ML} = \arg\max_\theta \left\{ l_0(\theta) = f_{X|\nu_0,\theta}(x|\nu_0,\theta) \right\}

• Bayesian approach: if we also have a prior f_\Theta(\theta) on the parameter of interest \theta, then we can use the Bayesian approach:

  f_{\Theta|X,\nu_0}(\theta|x,\nu_0) \propto f_{X|\nu_0,\theta}(x|\nu_0,\theta) \, f_\Theta(\theta)

  – Maximum a posteriori (MAP):

    \hat{\theta}_{MAP} = \arg\max_\theta f_{\Theta|X,\nu_0}(\theta|x,\nu_0) = \arg\max_\theta \left\{ l_0(\theta) \, f_\Theta(\theta) \right\}

  – Mean Square Estimate (MSE):

    \hat{\theta}_{MSE} = E\{\Theta\} = \int \theta \, f_{\Theta|X,\nu_0}(\theta|x,\nu_0) \, d\theta = \frac{\int \theta \, l_0(\theta) \, f_\Theta(\theta) \, d\theta}{\int l_0(\theta) \, f_\Theta(\theta) \, d\theta}
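As a concrete illustration of Case 1 (not part of the original slides), consider one Gaussian sample x ~ N(\nu_0, \theta) with known mean \nu_0 and unknown variance \theta, the model used in Example 1 below. A crude grid search over the likelihood recovers the closed-form ML estimate \hat{\theta} = (x - \nu_0)^2; the model choice and grid are illustrative assumptions.

```python
# Illustrative sketch of Case 1: one sample x ~ N(nu0, theta), nu0 known,
# theta (the variance) is the parameter of interest.
import math

def l0(theta, x, nu0):
    """Likelihood l0(theta) = f(x | nu0, theta) for a single Gaussian sample."""
    return math.exp(-(x - nu0) ** 2 / (2.0 * theta)) / math.sqrt(2.0 * math.pi * theta)

x, nu0 = 3.0, 1.0
grid = [0.01 * k for k in range(1, 2001)]          # crude grid over theta > 0
theta_ml = max(grid, key=lambda th: l0(th, x, nu0))
# closed form for this model: theta_ml = (x - nu0)**2 = 4.0
```

With a prior f_\Theta(\theta), the same grid could be reweighted by the prior to get the MAP estimate, and a normalized weighted sum over the grid approximates the MSE estimate.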


Case 2: Incomplete knowledge of \nu via f_V(\nu) or F_V(\nu)

• Marginal Maximum Likelihood (MML):

  \hat{\theta}_{MML} = \arg\max_\theta \left\{ l_1(\theta) = f_{X|\theta}(x|\theta) \right\}

  where

  f_{X|\theta}(x|\theta) = \int f_{X|V,\theta}(x|\nu,\theta) \, f_V(\nu) \, d\nu

• Bayesian approach: if we also have a prior f_\Theta(\theta):

  \hat{\theta}_{MMAP} = \arg\max_\theta f_{\Theta|X}(\theta|x) = \arg\max_\theta \left\{ l_1(\theta) \, f_\Theta(\theta) \right\}

  or

  \hat{\theta}_{MMSE} = E\{\Theta\} = \int \theta \, f_{\Theta|X}(\theta|x) \, d\theta = \frac{\int \theta \, l_1(\theta) \, f_\Theta(\theta) \, d\theta}{\int l_1(\theta) \, f_\Theta(\theta) \, d\theta}
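The marginalization step can be sketched numerically. Under the illustrative assumption x ~ N(\nu, \theta) with \nu ~ N(\nu_0, \theta_0) (the conjugate case worked out in Example 1 later in the slides), the marginal l_1(\theta) has the known closed form N(x; \nu_0, \theta + \theta_0), which lets us check a simple midpoint-rule integration:

```python
# Sketch: l1(theta) = integral over nu of f(x|nu,theta) * f_V(nu).
# Assumed model: x ~ N(nu, theta), nu ~ N(nu0, theta0); the exact marginal
# is then N(x; nu0, theta + theta0).
import math

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def marginal_likelihood(x, theta, nu0, theta0, n=4000, half_width=10.0):
    """Midpoint-rule approximation of the marginalization integral over nu."""
    lo = nu0 - half_width
    dnu = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        nu = lo + (i + 0.5) * dnu
        total += normal_pdf(x, nu, theta) * normal_pdf(nu, nu0, theta0)
    return total * dnu

x, nu0, theta0, theta = 3.0, 1.0, 0.5, 2.0
numeric = marginal_likelihood(x, theta, nu0, theta0)
exact = normal_pdf(x, nu0, theta + theta0)   # known Gaussian-Gaussian result
```

For non-conjugate priors f_V(\nu) the same numerical integral gives l_1(\theta) pointwise, to be maximized over \theta.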


Case 3: Incomplete knowledge of \nu through the knowledge of a finite number of its moments

• Our prior knowledge on \nu is:

  E\{\phi_k(V)\} = \int \phi_k(\nu) \, f_V(\nu) \, d\nu = d_k, \quad k = 1, \ldots, K

  where the \phi_k are known functions. Particular case: \phi_k(\nu) = \nu^k, where the \{d_k\} are then the moments of V.

• Maximum Entropy (ME): maximize

  H = -\int f_V(\nu) \ln f_V(\nu) \, d\nu

  subject to the data constraints E\{\phi_k(V)\} = d_k, to obtain f_V(\nu).

• Then proceed as in the previous case.


• Solution:

  f_V(\nu) = \exp\left[ -\lambda_0 - \sum_{k=1}^{K} \lambda_k \phi_k(\nu) \right]

  The Lagrange parameters \{\lambda_k, k = 0, \ldots, K\} are the solution of:

  \int \phi_k(\nu) \exp\left[ -\lambda_0 - \sum_{j=1}^{K} \lambda_j \phi_j(\nu) \right] d\nu = d_k, \quad k = 0, 1, \ldots, K,

  where we used \phi_0(\nu) = 1 and d_0 = 1 (the k = 0 equation is the normalization constraint).
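In the simplest single-constraint case the Lagrange system can be solved by hand or by a one-dimensional root search. As an illustrative sketch (not from the slides): with \phi_1(\nu) = \nu on (0, \infty) and E\{V\} = d_1, the ME solution \exp[-\lambda_0 - \lambda_1\nu] is the exponential pdf, so \lambda_1 = 1/d_1 and \exp(-\lambda_0) = \lambda_1; here \lambda_1 is recovered by bisection on the moment-constraint equation:

```python
# Sketch: solve the ME Lagrange system for one moment constraint E[V] = d1
# on (0, inf). For f(nu) = lam1 * exp(-lam1 * nu) (lam0 fixed by
# normalization), E[V] = 1/lam1; we find lam1 by bisection on the gap.

def moment_gap(lam1, d1):
    # Constraint residual: E[V] - d1 under the candidate ME pdf.
    return 1.0 / lam1 - d1

def solve_lam1(d1, lo=1e-6, hi=1e6, iters=100):
    """Bisection on the (monotone decreasing) moment gap."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if moment_gap(mid, d1) > 0.0:   # E[V] still too large -> raise lam1
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

d1 = 2.0
lam1 = solve_lam1(d1)    # expected: 1/d1 = 0.5
```

With several constraints the same idea generalizes to a multidimensional root search (or a convex dual minimization) with the integrals evaluated numerically.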


Case 4: Incomplete knowledge of \nu through the knowledge of its median

• Our prior knowledge on \nu is: median(V) = \nu_0, i.e.

  \int_{-\infty}^{\nu_0} f_V(\nu) \, d\nu = \int_{\nu_0}^{+\infty} f_V(\nu) \, d\nu

  or simply F_V(\nu_0) = 1/2.

• Maximum Entropy (ME)?

• The new approach is motivated by Marginal Maximum Likelihood (MML):

  \hat{\theta}_{MML} = \arg\max_\theta \left\{ l_1(\theta) = f_{X|\theta}(x|\theta) \right\}

  where

  f_{X|\theta}(x|\theta) = \int f_{X|V,\theta}(x|\nu,\theta) \, f_V(\nu) \, d\nu

• Main idea: define \tilde{f}_{X|\theta}(x|\theta) as the median of f_{X|V,\theta}(x|\nu,\theta) over f_V(\nu).


New inference tool

• Marginal probability law:

  f_{X|\theta}(x|\theta) = \int f_{X|V,\theta}(x|\nu,\theta) \, f_V(\nu) \, d\nu = E_V\left\{ f_{X|V,\theta}(x|V,\theta) \right\}

  or equivalently

  F_{X|\theta}(x|\theta) = \int F_{X|V,\theta}(x|\nu,\theta) \, f_V(\nu) \, d\nu = E_V\left\{ F_{X|V,\theta}(x|V,\theta) \right\},

  which can be recognized as the mean value of F_{X|V,\theta}(x|V,\theta) over the probability law f_V(\nu).

• The proposed new criterion is then defined as the median value of f_{X|V,\theta}(x|V,\theta) (or F_{X|V,\theta}(x|V,\theta)) over the probability law f_V(\nu):

  \tilde{F}_{X|\theta}(x|\theta) : \quad P\left( F_{X|V,\theta}(x|V,\theta) \le \tilde{F}_{X|\theta}(x|\theta) \right) = 1/2


• In previous works (MaxEnt03, SPSP04), we showed that, under some mild conditions on F_{X|V,\theta}(x|\nu,\theta) (strict monotonicity in \nu), the function \tilde{F}_{X|\theta}(x|\theta) has all the properties of a cdf, and thus \tilde{l}_1(\theta) = \tilde{f}_{X|\theta}(x|\theta) has all the properties of a likelihood function. Thus we can use it in place of l_1(\theta), i.e.:

  \hat{\theta}_{MLM} = \arg\max_\theta \left\{ \tilde{l}_1(\theta) = \tilde{f}_{X|\theta}(x|\theta) \right\}

• If we also have a prior f_\Theta(\theta), we can define the posterior \tilde{f}_{\Theta|X}(\theta|x) and

  \hat{\theta}_{MAPM} = \arg\max_\theta \tilde{f}_{\Theta|X}(\theta|x) = \arg\max_\theta \left\{ \tilde{l}_1(\theta) \, f_\Theta(\theta) \right\}

• Indeed, we showed that the expression of \tilde{F}_{X|\theta}(x|\theta) is given by

  \tilde{F}_{X|\theta}(x|\theta) = L\left( F_V^{-1}(\tfrac{1}{2}) \right), \quad \text{where } L(\nu) = F_{X|V,\theta}(x|\nu,\theta)

• Thus, to obtain the expression of \tilde{F}_{X|\theta}(x|\theta), we only need to know the median value F_V^{-1}(1/2) of the distribution f_V(\nu).
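The closed form \tilde{F}_{X|\theta}(x|\theta) = L(F_V^{-1}(1/2)) rests on the fact that a monotone map sends the median of V to the median of L(V). This can be sanity-checked by Monte Carlo under an illustrative model (x ~ N(\nu, \theta), V ~ N(\nu_0, \theta_0); all constants below are assumptions, not values from the slides):

```python
# Monte Carlo check that the median of F(x|V,theta) over V equals F evaluated
# at the median of V, when F is monotone in nu (the key step behind the
# closed form above). Model assumed for illustration only.
import math
import random
import statistics

def Phi(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def F(x, nu, theta):
    """Conditional cdf of x ~ N(nu, theta); monotone (decreasing) in nu."""
    return Phi((x - nu) / math.sqrt(theta))

random.seed(0)
x, theta, nu0, theta0 = 1.5, 2.0, 0.0, 1.0
draws = [F(x, random.gauss(nu0, math.sqrt(theta0)), theta) for _ in range(200001)]
mc_median = statistics.median(draws)
closed_form = F(x, nu0, theta)      # F_V^{-1}(1/2) = nu0 for this symmetric prior
```

The agreement holds for any prior on V with median \nu_0, which is exactly why only the median of f_V(\nu) is needed.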


Examples

Example 1: f_{X|V,\theta}(x|\nu,\theta) = \mathcal{N}(x; \nu, \theta) (mean \nu, variance \theta)

• Perfect knowledge \nu = \nu_0: f_{X|\nu_0,\theta}(x|\nu_0,\theta) = \mathcal{N}(x; \nu_0, \theta).
  The ML estimate of \theta is \hat{\theta} = (x - \nu_0)^2.

• Knowledge of f_V(\nu) = \mathcal{N}(\nu; \nu_0, \theta_0):

  f_{X|\theta}(x|\theta) = \int f_{X|V,\theta}(x|\nu,\theta) \, f_V(\nu) \, d\nu = \mathcal{N}(x; \nu_0, \theta + \theta_0)

  The MML estimate of \theta is \hat{\theta} = \max\left( (x - \nu_0)^2 - \theta_0, \, 0 \right).

• Moments knowledge case:
  – E\{V\} = \nu_0 and S = \mathbb{R}^+ \longrightarrow ME pdf: f_V(\nu) = \mathcal{E}(\nu; \nu_0)
  – E\{|V|\} = \nu_0 and S = \mathbb{R} \longrightarrow ME pdf: f_V(\nu) = \mathcal{DE}(\nu; \nu_0)
  – E\{V\} = \nu_0 and E\{(V - \nu_0)^2\} = \theta_0 \longrightarrow ME pdf: f_V(\nu) = \mathcal{N}(\nu; \nu_0, \theta_0)

• Knowledge of the median \nu_0 of V: \tilde{f}_{X|\theta}(x|\theta) = \mathcal{N}(x; \nu_0, \theta), and

  \hat{\theta} = \arg\max_\theta \tilde{f}_{X|\theta}(x|\theta) = (x - \nu_0)^2
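The Example-1 MML estimate \hat{\theta} = \max((x - \nu_0)^2 - \theta_0, 0) follows from maximizing \mathcal{N}(x; \nu_0, \theta + \theta_0) over \theta \ge 0, with the maximizer clipped at zero when (x - \nu_0)^2 < \theta_0. A grid-search sketch (constants chosen for illustration) checks both sides of the clip:

```python
# Illustrative check of the Example-1 MML estimate: the marginal is
# N(x; nu0, theta + theta0), maximized over theta >= 0, so the estimate
# clips at zero: theta_hat = max((x - nu0)**2 - theta0, 0).
import math

def marginal(theta, x, nu0, theta0):
    v = theta + theta0
    return math.exp(-(x - nu0) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)

nu0, theta0 = 1.0, 0.5
grid = [0.001 * k for k in range(0, 20001)]        # theta in [0, 20]
estimates = []
for x in (3.0, 1.2):                               # one case on each side of the clip
    theta_hat = max(grid, key=lambda th: marginal(th, x, nu0, theta0))
    estimates.append((theta_hat, max((x - nu0) ** 2 - theta0, 0.0)))
```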


Example 2: f_{X|V,\theta}(x|\nu,\theta) = \mathcal{N}(x; \theta, \nu) (mean \theta, variance \nu)

• Perfect knowledge \nu = \nu_0: \hat{\theta} = \arg\max_\theta f_{X|\nu_0,\theta}(x|\nu_0,\theta).
  The ML estimate of \theta is \hat{\theta} = x.

• Knowledge of f_V(\nu) = \mathcal{IG}(\nu; \alpha/2, \beta/2):

  f_{X|\theta}(x|\theta) = \int \mathcal{N}(x; \theta, \nu) \, \mathcal{IG}(\nu; \alpha/2, \beta/2) \, d\nu = \mathcal{S}(x; \theta, \alpha/\beta, \alpha)

  The MML estimate of \theta is \hat{\theta} = x.

• Moments knowledge case: E\{V\} = \nu_0. V here is the variance, so S = \mathbb{R}^+ \longrightarrow ME pdf: f_V(\nu) = \mathcal{E}(\nu; \nu_0), and

  f_{X|\theta}(x|\theta) = \int \mathcal{N}(x; \theta, \nu) \, \mathcal{E}(\nu; \nu_0) \, d\nu = \mathcal{DE}\left(x; \theta, \sqrt{\nu_0/2}\right)

  (a double-exponential density, symmetric about \theta). The MML estimate of \theta is \hat{\theta} = x.

• Knowledge of the median \nu_0 of V: \tilde{f}_{X|\theta}(x|\theta) = \mathcal{N}(x; \theta, \nu_0), and \hat{\theta} = \arg\max_\theta \tilde{f}_{X|\theta}(x|\theta) = x.

• Note: in all cases \hat{\theta} = x, whatever the knowledge on the scale nuisance parameter \nu.
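The Student-t marginal in Example 2 can be verified numerically. Assuming the shape/scale convention \mathcal{IG}(\nu; a, b) \propto \nu^{-a-1} e^{-b/\nu} (an assumption about the slides' notation) with a = \alpha/2, b = \beta/2, the marginal of x ~ \mathcal{N}(\theta, \nu) is a Student-t with \alpha degrees of freedom, location \theta, and squared scale \beta/\alpha; taking \alpha = \beta = 4 gives a standard t with 4 degrees of freedom:

```python
# Numerical check (illustrative constants) of the Example-2 marginal:
# integral of N(x; theta, nu) * IG(nu; alpha/2, beta/2) over nu
# against the Student-t density with alpha degrees of freedom.
import math

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def inv_gamma_pdf(nu, a, b):
    """Inverse-gamma pdf with shape a and scale b."""
    return b ** a / math.gamma(a) * nu ** (-a - 1.0) * math.exp(-b / nu)

def student_pdf(t, df):
    """Standard Student-t density with df degrees of freedom."""
    c = math.gamma((df + 1.0) / 2.0) / (math.sqrt(df * math.pi) * math.gamma(df / 2.0))
    return c * (1.0 + t * t / df) ** (-(df + 1.0) / 2.0)

theta, x = 0.0, 1.0
alpha = beta = 4.0
a, b = alpha / 2.0, beta / 2.0
n, hi = 200000, 200.0                  # midpoint rule over nu in (0, 200]
dnu = hi / n
numeric = sum(normal_pdf(x, theta, (i + 0.5) * dnu) * inv_gamma_pdf((i + 0.5) * dnu, a, b)
              for i in range(n)) * dnu
exact = student_pdf(x - theta, alpha)
```

Since every candidate marginal here is symmetric and unimodal about \theta, maximizing over \theta always gives \hat{\theta} = x, matching the note above.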


Conclusions
• We considered the problem of estimating one of the two parameters of a probability distribution when the other is a nuisance parameter on which we may have some prior information.
• Complete knowledge \nu = \nu_0: this is the simplest case, and the classical likelihood-based methods apply.
• Incomplete knowledge via f_V(\nu): integrate out the nuisance parameter to obtain a marginal likelihood, and use it.
• Incomplete knowledge via a finite number of its moments: use the ME principle to translate the prior knowledge into a prior pdf f_V(\nu), reducing to the previous case.
• Incomplete knowledge via the median value of the nuisance parameter: a new inference tool is presented which can handle this case.
