Min-max hyperparameter tuning, with application to fault detection
Julien Marzat, Hélène Piet-Lahanier, Eric Walter
18th IFAC World Congress, Milano, Italy, August 28 – September 2, 2011


Outline

Problem formulation

Tuning methodology

Robust tuning

Summary and future work


Problem formulation

Hyperparameters of fault diagnosis methods: observer gains, covariance matrices, thresholds, size of expected change, time horizon, ...

Optimal tuning of hyperparameters:
Search for the best performance of a method on a complex problem
Required to compare fault detection methods
Involves costly simulations of a test case
Determination of hyperparameters via global optimization of performance indices

Simulation of a test case: example


Problem formulation

Tuning viewed as a computer experiment
Kriging as a surrogate approximation of the complex simulation
Efficient Global Optimization (EGO): iterative search for the global optimizer based on the Kriging prediction

Starting point: performance index y(x_c) already computed for an initial sampling of hyperparameter vectors X_n = [x_{c,1}, ..., x_{c,n}], x_c ∈ X_c
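To make the starting point concrete, here is a minimal sketch of how such an initial design could be generated; the Latin hypercube sampler from scipy, the bounds, and the sample size are illustrative assumptions, not taken from the paper.

```python
# Illustrative initial sampling of the hyperparameter space X_c (assumption:
# Latin hypercube design via scipy; bounds and sample size are placeholders).
import numpy as np
from scipy.stats import qmc

d = 5                                        # number of hyperparameters
sampler = qmc.LatinHypercube(d=d, seed=0)
unit_sample = sampler.random(n=20)           # 20 points in the unit hypercube [0, 1]^d
lower, upper = np.zeros(d), np.ones(d)       # placeholder bounds of X_c
Xn = qmc.scale(unit_sample, lower, upper)    # initial design X_n = [x_c,1, ..., x_c,n]
# y(x_c) would then be computed by running the costly simulation at each row of Xn.
```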

Kriging [a]

y(·) is modeled as a Gaussian process
    Y(x_c) = f^T b + Z(x_c)
where f is a parametric prior, b is to be estimated from the available data, and Z(·) is a zero-mean Gaussian process with covariance
    \mathrm{cov}(Z(x_c), Z(x_c + h)) = \sigma^2 R(h)

Chosen correlation function, e.g.,
    R(h) = \exp\left( -\sum_{k=1}^{d} (h_k / \theta_k)^2 \right)

What Kriging provides:
    \hat{Y}(x_c), the best linear unbiased prediction of y(·) at any x_c ∈ X_c
    \hat{\sigma}^2(x_c), the variance of the prediction error

[a] G. Matheron: Principles of geostatistics. Economic Geology, 58(8):1246, 1963.

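As a rough illustration of the surrogate step (not the authors' implementation), the sketch below fits a Gaussian-process model with an anisotropic squared-exponential kernel, mirroring the correlation function R(h) above, and returns both the prediction and its standard deviation; scikit-learn is used as a stand-in for the Kriging code, and the toy cost function and sample sizes are placeholders.

```python
# Minimal Kriging-surrogate sketch (assumption: scikit-learn's GaussianProcessRegressor
# stands in for the Kriging model used in the talk).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def toy_performance_index(xc):
    """Hypothetical stand-in for the costly simulation returning y(xc)."""
    return np.sum((xc - 0.3) ** 2) + 0.1 * np.sin(10 * xc[0])

d = 2                                               # number of hyperparameters (illustrative)
Xn = np.random.default_rng(0).random((10, d))       # initial sampling X_n in [0, 1]^d
yn = np.array([toy_performance_index(x) for x in Xn])

# Anisotropic squared-exponential kernel: one length-scale theta_k per dimension
kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(d))
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(Xn, yn)

# Kriging outputs: best linear unbiased prediction and prediction-error std
x_test = np.array([[0.5, 0.5]])
y_hat, sigma_hat = gp.predict(x_test, return_std=True)
```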

Kriging illustration


Efficient Global Optimization (EGO) algorithm [b]

Objective: find iteratively the minimum of y(·)
1. Compute the value of y(·) for an initial sampling
2. Find the empirical minimum over the available data points, y_min
3. Fit a Kriging predictor on those data points
4. Find a new point of interest at which to evaluate y(·) by maximizing the Expected Improvement,
       EI(x_c) = \hat{\sigma}(x_c) [ u \Phi(u) + \phi(u) ],  with  u = (y_min - \hat{Y}(x_c)) / \hat{\sigma}(x_c)
5. Go to step 2 until a threshold on EI is reached or the maximum number of samples is exhausted

[b] D.R. Jones: A taxonomy of global optimization methods based on response surfaces. Journal of Global Optimization, 21(4):345–383, 2001.
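A compact sketch of this loop, under the same illustrative assumptions as above: a scikit-learn Gaussian process stands in for the Kriging model, the cost function toy_performance_index is hypothetical, and EI is maximized over a random candidate set rather than with a dedicated optimizer.

```python
# Sketch of the EGO loop with Expected Improvement (illustrative, not the authors' code).
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def toy_performance_index(xc):
    """Hypothetical stand-in for the costly fault-detection simulation."""
    return np.sum((xc - 0.3) ** 2) + 0.1 * np.sin(10 * xc[0])

def expected_improvement(gp, X, y_min):
    """EI(x_c) = sigma_hat * (u * Phi(u) + phi(u)), u = (y_min - Y_hat) / sigma_hat."""
    y_hat, s_hat = gp.predict(X, return_std=True)
    s_hat = np.maximum(s_hat, 1e-12)           # avoid division by zero
    u = (y_min - y_hat) / s_hat
    return s_hat * (u * norm.cdf(u) + norm.pdf(u))

rng = np.random.default_rng(0)
d = 2
X = rng.random((10, d))                        # step 1: initial sampling
y = np.array([toy_performance_index(x) for x in X])

for _ in range(30):                            # budget on the number of simulations
    y_min = y.min()                            # step 2: empirical minimum
    kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(d))
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)   # step 3
    candidates = rng.random((2000, d))         # step 4: maximize EI over random candidates
    ei = expected_improvement(gp, candidates, y_min)
    if ei.max() < 1e-6:                        # step 5: threshold on EI
        break
    x_new = candidates[np.argmax(ei)]
    X = np.vstack([X, x_new])
    y = np.append(y, toy_performance_index(x_new))

x_best = X[np.argmin(y)]                       # tuned hyperparameter vector
```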

Application – tuning of a fault-diagnosis scheme

Luenberger observer (3 hyperparameters: poles) + CUSUM test (2 hyperparameters) → 5 hyperparameters to tune
Cost function: y = r_fd + r_nd (r_fd: false-detection rate, r_nd: non-detection rate)

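The cost can be made concrete as follows; this is only an illustration of summing the two empirical rates from hypothetical boolean decision and fault signals, not the paper's simulator or CUSUM implementation.

```python
# Illustrative computation of the cost y = r_fd + r_nd from detection decisions
# (assumption: per-sample boolean signals, with both fault-free and faulty samples present).
import numpy as np

def tuning_cost(decisions, fault_present):
    """decisions[k] is True if the detector flags a fault at sample k;
    fault_present[k] is True if a fault is actually active at sample k."""
    decisions = np.asarray(decisions, dtype=bool)
    fault_present = np.asarray(fault_present, dtype=bool)
    r_fd = np.mean(decisions[~fault_present])    # false-detection rate
    r_nd = np.mean(~decisions[fault_present])    # non-detection rate
    return r_fd + r_nd
```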

A sample of numerical results

Results for 100 runs

                          mean      std.
False-alarm rate          0         0.001
Non-detection rate        0.0455    0.009
Number of simulations     102.21    20


Robust tuning

Need for robust tuning:
Results obtained so far hold for fixed conditions of the simulation
What happens with stronger disturbances, more noise, or a smaller fault?
→ The simulation depends on a set of environmental variables x_e

5 hyperparameters + 2 environmental variables


Robust tuning – proposed solution

Continuous minimax optimization: search for the optimal tuning under worst-case environmental variables
    (\hat{x}_c, \hat{x}_e) = \arg\min_{x_c \in X_c} \max_{x_e \in X_e} y(x_c, x_e)

Sketch of the proposed algorithm
Transform the initial problem into
    \min_{x_c, \tau} \tau   subject to   y(x_c, x_e) < \tau, \forall x_e \in X_e

Iterative relaxation [c]:
1. Draw a new x_e and append it to R_e
2. Find a minimizer \hat{x}_c over all explored x_e ∈ R_e with EGO
3. Find a maximizer \hat{x}_e for \hat{x}_c with EGO
4. Check convergence; repeat if necessary

[c] Derived from K. Shimizu and E. Aiyoshi: Necessary conditions for min-max problems and algorithms by a relaxation procedure. IEEE Transactions on Automatic Control, 1980.
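A minimal sketch of this relaxation loop, assuming scipy's differential_evolution as a stand-in for the inner and outer EGO searches; the cost y(x_c, x_e), the bounds, and the stopping tolerance are toy placeholders, not values from the paper.

```python
# Sketch of the min-max relaxation loop (illustrative; EGO replaced by a generic global optimizer).
import numpy as np
from scipy.optimize import differential_evolution

def y(xc, xe):
    """Hypothetical cost: performance of tuning xc (5-dim) under environment xe (2-dim)."""
    return np.sum((xc[:2] - xe) ** 2) + np.sum(xc[2:] ** 2) - 0.5 * np.sum(xe ** 2)

bounds_c = [(0.0, 1.0)] * 5          # 5 hyperparameters
bounds_e = [(0.0, 1.0)] * 2          # 2 environmental variables

Re = [np.array([0.5, 0.5])]          # initial set of explored environmental points
for _ in range(10):
    # Step 2: minimize the worst cost over the environments explored so far
    res_c = differential_evolution(
        lambda xc: max(y(xc, xe) for xe in Re), bounds_c, seed=0)
    xc_hat, tau = res_c.x, res_c.fun
    # Step 3: find the worst-case environment for the current tuning
    res_e = differential_evolution(
        lambda xe: -y(xc_hat, xe), bounds_e, seed=0)
    xe_hat, worst = res_e.x, -res_e.fun
    # Step 4: stop if the new worst case does not violate the relaxed constraint
    if worst <= tau + 1e-3:
        break
    Re.append(xe_hat)                # Step 1: append the new environmental point
```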

Min-max tuning for aircraft fault diagnosis application

Results for 100 runs

                          mean     std.
False-alarm rate          0        0.002
Non-detection rate        0.125    0.047
Number of simulations     168      26


Estimation of the worst-case


Summary and future work

Summary
Automatic tuning with Kriging and Bayesian optimization
Tuning of complete FDI schemes for dynamical systems: simultaneous adjustment of the hyperparameters of the residual-generation and residual-evaluation strategies
New robust tuning algorithm in the worst-case sense
Few runs of the simulation required
Generic: applicable to many engineering design problems

Future work
Consider higher-dimensional problems (in both the hyperparameter and environmental dimensions)
Handle more complex constraints on the cost function


Illustration of EGO

Iteration 1

Iteration 2

Iteration 3