Min-max hyperparameter tuning, with application to fault detection
Julien Marzat, Hélène Piet-Lahanier, Eric Walter
18th IFAC World Congress, Milano, Italy, August 28 – September 2, 2011
IFAC WC 2011 - J.Marzat - 01/09/2011 - 1/15
Outline
Problem formulation
Tuning methodology
Robust tuning
Summary and future work
Problem formulation
Hyperparameters of fault diagnosis methods
  Observer gains
  Covariance matrices
  Thresholds
  Size of expected change
  Time horizon
  ...
Optimal tuning of hyperparameters
  Search for the best performance of a method on a complex problem
  Required to compare fault detection methods
  Involves costly simulations of a test case
Determination of hyperparameters via global optimization of performance indices
Simulation of a test case : example
Problem formulation
Tuning viewed as a computer experiment
  Kriging as a surrogate approximation of the complex simulation
  Efficient Global Optimization (EGO): iterative search for the global optimizer based on the Kriging prediction
Starting point
  Performance index $y(x_c)$ already computed for an initial sampling of hyperparameter vectors $X_n = [x_{c,1}, \ldots, x_{c,n}]$, $x_c \in \mathbb{X}_c$
Kriging [a]
$y(\cdot)$ is modeled as a Gaussian process
$$Y(x_c) = f^T b + Z(x_c)$$
where
  $f$ is a parametric prior, with $b$ to be estimated from the available data
  $Z(\cdot)$ is a zero-mean Gaussian process with covariance $\mathrm{cov}(Z(x_c), Z(x_c + h)) = \sigma^2 R(h)$
Chosen correlation function, e.g.,
$$R(h) = \exp\left(-\sum_{k=1}^{d} \frac{h_k^2}{\theta_k}\right)$$
What Kriging provides
  $\hat{Y}(x_c)$, the best linear unbiased prediction of $y(\cdot)$ at any $x_c \in \mathbb{X}_c$
  The variance of the prediction error, $\hat{\sigma}^2(x_c)$
[a] G. Matheron: Principles of geostatistics. Economic Geology, 58(8):1246, 1963.
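The Kriging predictor described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the trend is taken as constant ($f = 1$, often called ordinary Kriging), the correlation parameters `theta` are supplied rather than estimated, and a small `nugget` is added for numerical stability. The function names `corr`, `fit_kriging`, and `predict` are hypothetical.

```python
import numpy as np

def corr(A, B, theta):
    """Gaussian correlation exp(-sum_k h_k^2 / theta_k) between the rows of A and B."""
    h = A[:, None, :] - B[None, :, :]            # pairwise differences, shape (na, nb, d)
    return np.exp(-np.sum(h**2 / theta, axis=2))

def fit_kriging(X, y, theta, nugget=1e-10):
    """Fit a Kriging model with constant trend (assumption: f = 1, theta given)."""
    n = X.shape[0]
    R = corr(X, X, theta) + nugget * np.eye(n)   # nugget is an assumption for conditioning
    ones = np.ones(n)
    Ri_1 = np.linalg.solve(R, ones)              # R^-1 1
    b = (ones @ np.linalg.solve(R, y)) / (ones @ Ri_1)   # generalized least-squares estimate of b
    Ri_res = np.linalg.solve(R, y - b)           # R^-1 (y - 1 b)
    sigma2 = ((y - b) @ Ri_res) / n              # maximum-likelihood estimate of sigma^2
    return {"X": X, "R": R, "theta": theta, "b": b,
            "Ri_1": Ri_1, "Ri_res": Ri_res, "sigma2": sigma2}

def predict(model, Xnew):
    """Return the predicted mean and the prediction-error variance at each row of Xnew."""
    r = corr(Xnew, model["X"], model["theta"])   # correlations with the data, shape (m, n)
    mean = model["b"] + r @ model["Ri_res"]
    Ri_r = np.linalg.solve(model["R"], r.T)      # R^-1 r, shape (n, m)
    quad = np.sum(r * Ri_r.T, axis=1)            # r^T R^-1 r
    trend = (1.0 - r @ model["Ri_1"])**2 / np.sum(model["Ri_1"])  # cost of estimating b
    var = model["sigma2"] * (1.0 - quad + trend)
    return mean, np.maximum(var, 0.0)            # clip tiny negative round-off

# Toy data (made up for illustration only)
X = np.array([[0.0], [0.5], [1.0]])
y = np.array([1.0, 0.2, 0.7])
model = fit_kriging(X, y, theta=np.array([0.1]))
mean_train, var_train = predict(model, X)
mean_new, var_new = predict(model, np.array([[0.25]]))
```

At the sampled points the prediction reproduces the data and the variance is close to zero, while it is strictly positive in between; this is the property that EGO exploits.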
Kriging illustration
Efficient Global Optimization (EGO) algorithm [b]
Objective: iteratively find the minimum of $y(\cdot)$
1. Compute the value of $y(\cdot)$ for an initial sampling
2. Find the empirical minimum in the available data points, $y_{\min}$
3. Fit a Kriging predictor on those data points
4. Find a new point of interest at which to evaluate $y(\cdot)$ by maximizing the Expected Improvement, given by
$$\mathrm{EI}(x_c) = \hat{\sigma}(x_c)\,[u\,\Phi(u) + \varphi(u)], \quad \text{where} \quad u = \frac{y_{\min} - \hat{Y}(x_c)}{\hat{\sigma}(x_c)}$$
5. Go to step 2 until EI falls below a threshold or the maximum number of samples is reached
[b] D.R. Jones: A taxonomy of global optimization methods based on response surfaces. Journal of Global Optimization, 21(4):345–383, 2001.
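The Expected Improvement criterion of step 4 is cheap to evaluate once the Kriging mean and standard deviation are available. The sketch below uses only the Python standard library, with $\Phi$ and $\varphi$ taken as the standard normal cumulative distribution and density functions. The function name and the handling of a zero standard deviation (returning the limit $\max(y_{\min} - \hat{Y}, 0)$) are choices made for this illustration.

```python
import math

def expected_improvement(y_hat, sigma_hat, y_min):
    """EI = sigma_hat * (u * Phi(u) + phi(u)), with u = (y_min - y_hat) / sigma_hat."""
    if sigma_hat <= 0.0:
        # Assumed convention: with no prediction uncertainty, use the limit of EI
        return max(y_min - y_hat, 0.0)
    u = (y_min - y_hat) / sigma_hat
    Phi = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))          # standard normal CDF
    phi = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)   # standard normal pdf
    return sigma_hat * (u * Phi + phi)

# Same predicted mean, different uncertainty: the more uncertain point scores higher
ei_low = expected_improvement(y_hat=0.5, sigma_hat=0.1, y_min=0.4)
ei_high = expected_improvement(y_hat=0.5, sigma_hat=0.5, y_min=0.4)
```

EI is never negative, and it grows both when the predicted value is low and when the prediction is uncertain, which is how the criterion balances local search against exploration.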
Application – tuning of a fault-diagnosis scheme
Luenberger observer (3 hyperparameters: poles) + CUSUM (2 hyperparameters) → 5 hyperparameters
Cost function: $y = r_{fd} + r_{nd}$ ($r_{fd}$: false-detection rate, $r_{nd}$: non-detection rate)
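To make the cost function concrete, here is a minimal sketch of a one-sided CUSUM test applied to a residual, followed by the computation of $y = r_{fd} + r_{nd}$. Several points are assumptions, since the slides do not detail them: the two CUSUM hyperparameters are taken to be the expected change size `nu` and the decision `threshold`; the false-detection rate is computed as the fraction of pre-fault samples raising an alarm, and the non-detection rate as the fraction of post-fault samples without an alarm. The residual below is made up, and the function names are hypothetical.

```python
def cusum_alarms(residual, mu0, nu, threshold):
    """One-sided CUSUM for an increase of the residual mean from mu0 to mu0 + nu."""
    g = 0.0
    alarms = []
    for r in residual:
        g = max(0.0, g + r - mu0 - nu / 2.0)   # cumulative sum, reset at zero
        alarms.append(g > threshold)
    return alarms

def detection_cost(alarms, fault_index):
    """Cost y = r_fd + r_nd; requires 0 < fault_index < len(alarms)."""
    before, after = alarms[:fault_index], alarms[fault_index:]
    r_fd = sum(before) / len(before)                  # alarms raised before the fault
    r_nd = sum(not a for a in after) / len(after)     # missed samples after the fault
    return r_fd + r_nd

# Toy residual: zero before the fault, unit step after sample 10
residual = [0.0] * 10 + [1.0] * 10
alarms = cusum_alarms(residual, mu0=0.0, nu=1.0, threshold=1.5)
cost = detection_cost(alarms, fault_index=10)
```

In the tuning problem, a function of this kind, wrapped around the full simulation, plays the role of the performance index that EGO minimizes over the five hyperparameters.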
A sample of numerical results
Results for 100 runs

                         mean     std.
False-alarm rate         0        0.001
Non-detection rate       0.0455   0.009
Number of simulations    102.21   20
Robust tuning
Need for a robust tuning
  Results were obtained for fixed conditions of the simulation
  What happens with stronger disturbances, more noise, or a smaller fault?
  → The simulation depends on a set of environmental variables $x_e$
5 hyperparameters + 2 environmental variables
Robust tuning – proposed solution
Continuous minimax optimization
  Search for the optimal tuning for the worst-case environmental variables
$$(\hat{x}_c, \hat{x}_e) = \arg \min_{x_c \in \mathbb{X}_c} \max_{x_e \in \mathbb{X}_e} y(x_c, x_e)$$
Sketch of proposed algorithm
  Transform the initial problem into
$$\min_{x_c, \tau} \tau \quad \text{subject to} \quad y(x_c, x_e) < \tau, \; \forall x_e$$
  Iterative relaxation [c]:
1. Draw a new $x_e$, append it to $R_e$
2. Find a minimum $\hat{x}_c$ for all explored $x_e \in R_e$ with EGO
3. Find a maximum $\hat{x}_e$ for $\hat{x}_c$ with EGO
4. Check convergence, repeat if necessary
[c] Derived from K. Shimizu and E. Aiyoshi: Necessary conditions for min-max problems and algorithms by a relaxation procedure. IEEE Transactions on Automatic Control, 1980.
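The relaxation loop can be illustrated on a toy problem. In the sketch below, exhaustive search over a grid stands in for EGO in both the minimization and the maximization steps, so it shows only the structure of the iteration, not the Kriging-based search of the proposed algorithm. The test function, the grids, the tolerance `eps`, and the choice of appending the maximizer found in the previous pass to $R_e$ are all assumptions made for this example.

```python
def relaxation_minimax(y, xc_grid, xe_grid, xe_init, eps=1e-9, max_iter=50):
    """Iterative relaxation for min over x_c of max over x_e of y(x_c, x_e)."""
    Re = [xe_init]                                   # explored environmental points
    for _ in range(max_iter):
        # Minimize over x_c the worst cost over the finite set R_e (grid search, not EGO)
        xc = min(xc_grid, key=lambda c: max(y(c, e) for e in Re))
        tau = max(y(xc, e) for e in Re)
        # Maximize over x_e for this x_c (grid search, not EGO)
        xe = max(xe_grid, key=lambda e: y(xc, e))
        worst = y(xc, xe)
        # Convergence: the new worst case does not violate the relaxed constraint
        if worst - tau <= eps:
            break
        Re.append(xe)                                # relax less: add the violating point
    return xc, xe, worst

# Toy cost (illustration only): the worst case is the x_e farthest from x_c
grid = [i / 100 for i in range(101)]
xc_opt, xe_worst, y_worst = relaxation_minimax(
    lambda c, e: (c - e) ** 2, grid, grid, xe_init=0.0)
```

On this example the loop stops after two passes with `xc_opt` at the middle of the interval, the point that equalizes the cost at both extreme values of `x_e`. Only a handful of environmental points are ever stored in `Re`, which is what keeps the number of costly simulations low.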
Min-max tuning for aircraft fault diagnosis application
Results for 100 runs

                         mean     std.
False-alarm rate         0        0.002
Non-detection rate       0.125    0.047
Number of simulations    168      26
Estimation of the worst-case
Summary and future work
Summary
  Automatic tuning with Kriging and Bayesian optimization
  Tuning of complete FDI schemes for dynamical systems: simultaneous adjustment of the hyperparameters of residual-generation and residual-evaluation strategies
  New robust tuning algorithm in the worst-case sense
  Few runs of the simulation required
  Generic: applicable to many engineering design problems
Future work
  Consider higher-dimensional problems (in both dimensions)
  More complex constraints on the cost function
Illustration of EGO
Iteration 1
Iteration 2
Iteration 3