CLOP: Confident Local Optimization for Noisy Black-Box Parameter Tuning

Rémi Coulom

November 2011, Advances in Computer Games 13

Outline: Introduction, The CLOP Algorithm, Experiments, Conclusion

Noisy Black-Box Optimization (2/38)

Problem: optimizing a game-playing program
- Heuristic parameters: evaluation, search, ...
- Observation: game outcomes (win or loss (or draw))
- Objective: maximize probability of winning

Two sub-problems:
1. Estimate the optimal x
2. Choose the next x to sample

[Plot: win-probability surface over parameters (x1, x2) in [-1, 1]^2]
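The noisy black-box setup above can be made concrete as a one-game oracle. A minimal sketch: the quadratic `true_strength` function is a made-up stand-in for the hidden program strength, not a benchmark from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_strength(x):
    # Hidden win probability: a made-up smooth function with its
    # maximum (0.6) at x = 0.2. The optimizer never sees this.
    return 0.6 - 0.3 * (x - 0.2) ** 2

def play_game(x):
    # The only observation available to the tuner: one win (1) or
    # loss (0) at parameter setting x.
    return int(rng.uniform() < true_strength(x))
```

The tuner's job is to locate the x maximizing `true_strength` from these binary outcomes alone.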

Presentation Outline (3/38)

- The CLOP algorithm
- Experiments on artificial data

The CLOP Algorithm: Maximum Estimation, Sampling Policy

CLOP: Method for Optimum Estimation (4/38)

[Plot: win probability P vs. parameter value in [-1, 1]]

Step 1: Data (5/38)

[Plot: 511 win/loss samples (win = 1, loss = 0) vs. parameter value in [-1, 1]]

Step 2: Quadratic Logistic Regression (6/38)

[Plot: the win/loss samples with a fitted quadratic logistic regression curve]
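Step 2 fits a quadratic inside a logistic link to the binary outcomes. A minimal 1D sketch using Newton's method (IRLS); the data-generating quadratic below is my own illustration, not data from the talk.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def quadratic_logistic_regression(x, y, iters=25):
    """Maximum-likelihood fit of P(win | x) = sigmoid(a + b*x + c*x^2)
    by Newton's method (IRLS). Returns (a, b, c)."""
    X = np.stack([np.ones_like(x), x, x * x], axis=1)
    theta = np.zeros(3)
    for _ in range(iters):
        p = sigmoid(X @ theta)
        grad = X.T @ (y - p)                  # gradient of log-likelihood
        hess = X.T @ (X * (p * (1 - p))[:, None])  # negative Hessian
        theta += np.linalg.solve(hess, grad)
    return theta

# Illustration: data from a known quadratic whose maximum is at x = 0.2.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=20000)
p_true = sigmoid(0.42 + 0.8 * x - 2.0 * x * x)   # argmax = -b/(2c) = 0.2
y = (rng.uniform(size=x.shape) < p_true).astype(float)

a, b, c = quadratic_logistic_regression(x, y)
x_hat = -b / (2.0 * c)   # maximum of the fitted quadratic
```

With a concave fitted quadratic (c < 0), the estimated optimum is simply the vertex -b/(2c).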

Step 3: Lower Confidence Bound (7/38)

[Plot: regression curve, its mean, and the lower confidence bound (LCB)]

Step 4: Discard Samples Below LCB = µ − Hσ (8/38)

[Plot: samples whose fitted value falls below the LCB are discarded]
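In the pseudocode at the end of the deck, the "discard" of Step 4 is soft: each sample keeps a weight that decays exponentially as its fitted value drops below the mean. A sketch of that weighting, with names of my own choosing:

```python
import numpy as np

def discard_weights(q_vals, mu, sigma, H):
    """Soft discard of samples from their fitted values q(x):
    samples at or above the mean mu keep weight 1; a sample exactly
    at the LCB (mu - H*sigma) gets weight exp(-1); weights keep
    shrinking exponentially below that."""
    return np.minimum(1.0, np.exp((q_vals - mu) / (H * sigma)))
```

Larger H widens the confidence region, so fewer samples are down-weighted and more data is kept for the next regression.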

Step 5: Re-compute Regression with Remaining Samples (9/38)

[Plot: regression, mean, and LCB refitted on the reweighted samples]

Step 6: Iterate Until Convergence (10/38)

[Plot: regression, mean, and LCB after iterating steps 2-5 to convergence]

Sampling Policy: Density = Weight (slides 11-21/38)

The next sample is drawn with density proportional to the current weight function.

[Plot sequence: win/loss samples, regression, mean, and LCB vs. parameter value, as the number of samples grows 1, 3, 7, 15, 31, 63, 127, 255, 511; the sampling distribution concentrates around the optimum]
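"Density = weight" can be realized by plain rejection sampling, since the weights are capped at 1. A sketch; the actual sampler is not specified on these slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_next(weight, lo=-1.0, hi=1.0):
    """Draw one parameter value with density proportional to weight(x)
    on [lo, hi], by rejection sampling. Assumes 0 <= weight(x) <= 1."""
    while True:
        x = rng.uniform(lo, hi)
        if rng.uniform() < weight(x):
            return x

# Demo with a box-shaped weight: samples land only where weight > 0.
samples = [sample_next(lambda x: 1.0 if abs(x) < 0.5 else 0.0)
           for _ in range(1000)]
```

Rejection sampling keeps the policy correct for any weight shape the regression loop produces, at the cost of more proposals when the high-weight region is narrow.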

Experiments: Smooth 1D Function, Rosenbrock, Angle, Discontinuous Function

Smooth 1D Function (22/38)

[Plot: win probability P vs. parameter in [-1, +1], smooth with a single maximum]

Smooth 1D Function: Results (23/38)

[Log-log plot: regret (10^0 down to 10^-5) vs. number of games (10 to 10^7), for CLOP, UCT, and CEM]
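The experiment plots track regret against the number of games played. My reading of the y-axis (the exact definition is not on the slides): the win-probability gap between the true optimum and the current estimate.

```python
def regret(p, x_star, x_hat):
    """Regret of estimating x_hat when the true optimum is x_star,
    for a win-probability function p."""
    return p(x_star) - p(x_hat)

# e.g. a quadratic with its optimum at 0.2:
p = lambda x: 0.6 - 0.3 * (x - 0.2) ** 2
r = regret(p, 0.2, 0.1)   # small: 0.1 is close to the optimum
```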

Rosenbrock (24/38)

[Plot: 2D Rosenbrock-shaped win-probability function over (x1, x2) in [-1, 1]^2]
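The Rosenbrock benchmark wraps a deterministic test function in the win/loss observation model. A sketch: the exact transformation of Rosenbrock into a win probability is not given on the slide, so the sigmoid link below is my assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def rosenbrock(x1, x2):
    # Classic Rosenbrock valley: minimum 0 at (1, 1).
    return (1.0 - x1) ** 2 + 100.0 * (x2 - x1 ** 2) ** 2

def play_rosenbrock_game(x1, x2):
    # Win probability decays from 1/2 as the Rosenbrock value grows
    # (assumed link), so maximizing the win rate means descending
    # the curved valley -- hard for a purely quadratic local model.
    p_win = 1.0 / (1.0 + np.exp(rosenbrock(x1, x2)))
    return int(rng.uniform() < p_win)
```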

Rosenbrock: Results (25/38)

[Log-log plot: regret (10^0 down to 10^-5) vs. number of games (10 to 10^7), for CLOP, UCT, and CEM]

Angle (26/38)

[Plot: win probability P vs. parameter in [-1, +1], piecewise linear with an angular maximum]

Angle: Results (27/38)

[Log-log plot: regret (10^0 down to 10^-5) vs. number of games (10 to 10^7), for CLOP, UCT, and CEM]

Discontinuous Function (28/38)

[Plot: win probability P vs. parameter in [-1, +1], with a discontinuity]

Discontinuous Function: Results (29/38)

[Log-log plot: regret (10^0 down to 10^-5) vs. number of games (10 to 10^7), for CLOP, UCT, and CEM]

Conclusion (30/38)

Summary of CLOP
- Much faster black-box optimizer than the state of the art in games
- Foolproof: no tricky meta-parameters
- Popular freeware: http://remi.coulom.free.fr/CLOP/

Future Work
- High-dimensional problems: more regularization, sparsity
- Apply to less noisy or noiseless problems (BBOB)
- Apply the CLOP principle to other forms of regression
- Optimization from self-play
- Prove convergence

Extra Slide: Code (31/38)

procedure QuadraticCLOP(H, x1, y1, ..., xN, yN)
    w0 ← λx. 1                       ▷ a function of x that returns 1
    W0 ← N
    k ← 0
    repeat
        w ← λx. min(i=0..k) wi(x)    ▷ weight function
        k ← k + 1
        qk ← QuadraticLogisticRegression(w, x1, y1, ..., xN, yN)
        µk ← LogisticMean(w, x1, y1, ..., xN, yN)
        σk ← ConfidenceDeviation(w, x1, y1, ..., xN, yN)
        wk ← λx. e^((qk(x) − µk) / (H σk))
        Wk ← Σ(i=1..N) min(w(xi), wk(xi))
    until Wk > 0.99 × Wk−1
    xN+1 ← Random(w)                 ▷ next sample, distributed like w
    x̃ ← Σ(i=1..N+1) w(xi) xi / Σ(i=1..N+1) w(xi)   ▷ estimated optimal
end procedure
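The QuadraticCLOP weighting loop above can be transcribed directly. A sketch: the three statistical helpers (QuadraticLogisticRegression, LogisticMean, ConfidenceDeviation) are passed in as callables because their internals are not given on the slide; the final weight here also folds in the last wk computed before the loop stops.

```python
import numpy as np

def clop_iterations(H, xs, ys, fit, mean, deviation, tol=0.99):
    """Iterate the CLOP reweighting until the total weight stabilizes.
    fit(w, xs, ys) -> callable q; mean(w, xs, ys) -> mu;
    deviation(w, xs, ys) -> sigma.  Returns the final weight function."""
    xs = np.asarray(xs, dtype=float)
    w_fns = [lambda x: np.ones_like(np.asarray(x, dtype=float))]  # w0
    W_prev = float(len(xs))                                       # W0 = N
    while True:
        def w(x, fns=tuple(w_fns)):           # w = min over w0..wk
            return np.min([f(x) for f in fns], axis=0)
        wx = w(xs)
        q = fit(wx, xs, ys)
        mu = mean(wx, xs, ys)
        sigma = deviation(wx, xs, ys)
        def wk(x, q=q, mu=mu, sigma=sigma):
            return np.exp((q(x) - mu) / (H * sigma))
        Wk = float(np.sum(np.minimum(wx, wk(xs))))
        w_fns.append(wk)
        if Wk > tol * W_prev:                 # until Wk > 0.99 * Wk-1
            break
        W_prev = Wk

    def w_final(x, fns=tuple(w_fns)):
        return np.min([f(x) for f in fns], axis=0)
    return w_final

def estimate_optimum(w_final, xs):
    """Weighted average of the samples: the estimated optimal x."""
    xs = np.asarray(xs, dtype=float)
    wx = w_final(xs)
    return float(np.sum(wx * xs) / np.sum(wx))
```

Sampling the next point "distributed like w" can then reuse any sampler with density proportional to the returned weight function.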

Asymptotic Rate of Convergence (Intuitively) (32/38)

Sample N points inside a window of width δ around the optimum:
- Bias only: regret = O(δ^4)
- Variance only: regret = O(N^-1 δ^-2)
- Optimal asymptotic bias-variance tradeoff: regret = O(N^-2/3)
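The tradeoff on this slide follows from minimizing the sum of the two regret terms over the window width δ:

```latex
\frac{d}{d\delta}\left(\delta^4 + N^{-1}\delta^{-2}\right)
= 4\delta^3 - 2N^{-1}\delta^{-3} = 0
\;\Rightarrow\;
\delta^* \propto N^{-1/6},
\qquad
\text{regret} = O\!\left((\delta^*)^4\right) = O\!\left(N^{-2/3}\right).
```

The optimal window width N^(-1/6) is consistent with the H = 0.8 N^(1/6) schedule that appears in the meta-parameter experiments on the next slide.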

Effect of Meta-Parameter H (33/38)

TODO: name of axes. [Log-log plot, presumably regret (10^-1 down to 10^-5) vs. number of games (10 to 10^7), one curve per setting: H = 12, 10, 8, 6, 4, 3, 2, 1, 0.8 N^(1/6)]

Conclusion (of many other experiments): H = 3 works well in practice.

Extra Slide: Many Algorithms (34/38)

[Log-log plot: regret (10^0 down to 10^-6) vs. number of games (10 to 10^7), for: Quadratic CLOP (H = 3), Quadratic CLOP (H = 0.8 N^(1/6)), Cubic CLOP (H = 0.8 N^(1/4)), RSPSA, SPSA*, CEM (Chaslot et al.), CEM (Hu & Hu), UCT]

Extra Slide: 1D Problems (35/38)

[Figure panels: (a) Log, (b) Flat, (c) Power, (d) Angle, (e) Step]

Effect of Meta-Parameter H on Power (36/38)

[Log-log plot: regret (10^-1 down to 10^-5) vs. number of games (10 to 10^7), one curve per setting: H = 12, 10, 8, 6, 4, 3, 2, 1]

Extra Slide: 2D Problems (37/38)

[Figure panels: (f) Rosenbrock, (g) Correlated, (h) Log2]

Extra Slide: Performance on Many Problems (38/38)

[Figure panels: (i) Log, (j) Log2, (k) Log5, (l) Flat, (m) Rosenbrock, (n) Rosenbrock2, (o) Rosenbrock5, (p) Power, (q) Correlated, (r) Correlated2, (s) Angle, (t) Step; comparing Quadratic CLOP (H = 3), UCT, and CEM (Hu & Hu)]