Kernel based stochastic gradient - Contributions of Jean-Sébastien Roy

Kernel based stochastic gradient
Application to option pricing

Kengy Barty¹, Pierre Girardeau¹,², Jean-Sébastien Roy¹, Cyrille Strugarek¹,²,³

¹ EDF Recherche & Développement
² École Nationale Supérieure de Techniques Avancées
³ École Nationale des Ponts et Chaussées

17 August 2006

Introduction

Among the numerous methods used to price American-style options, we focus on methods that first discretize time, such as:

- Quantization [Bally et al., 2002];
- (Approximate) Dynamic Programming [Van Roy and Tsitsiklis, 2001];
- The regression method of [Longstaff and Schwartz, 2001, Carriere, 1996].

Besides the time discretization, these methods require some kind of state space discretization, usually through an a priori choice of a functional basis used to represent the value of the option. By choosing an a priori functional basis, these methods give up optimality. My objective is to present an alternative, nonparametric algorithm that solves dynamic programming problems efficiently, without a priori discretization.

Presentation outline

- Stochastic approximation
- Application to pricing
- Numerical techniques used

Bermudan option pricing
Problem description

We price a Bermudan put option whose exercise dates are restricted to equispaced dates t in 0, ..., T. The underlying price X_t follows a discretized [Samuelson, 1965] dynamic:

    ln(X_{t+1} / X_t) = r − σ²/2 + σ η_t

where (η_t) is a Gaussian white noise of unit variance and r is the risk-free interest rate. The strike is s, and the intrinsic option price is g(x) = max(0, s − x) when the price is x. Let α = e^{−r} denote the discount factor, and let x_0 denote the price at t = 0.
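As an illustration, the discretized dynamic above can be simulated directly. The following sketch draws price paths under it; the function and parameter names are ours, not from the talk:

```python
import numpy as np

def simulate_paths(x0, r, sigma, T, n_paths, rng=None):
    """Draw price paths under the discretized Samuelson dynamic
    ln(X_{t+1}/X_t) = r - sigma^2/2 + sigma * eta_t, with eta_t ~ N(0, 1)."""
    rng = np.random.default_rng() if rng is None else rng
    # one Gaussian increment per (path, time step)
    eta = rng.standard_normal((n_paths, T))
    log_increments = (r - 0.5 * sigma**2) + sigma * eta
    log_paths = np.log(x0) + np.cumsum(log_increments, axis=1)
    # prepend the initial price so that column t holds X_t, t = 0..T
    first = np.full((n_paths, 1), np.log(x0))
    return np.exp(np.concatenate([first, log_paths], axis=1))
```

The paths are generated in log space and exponentiated at the end, which keeps prices strictly positive, as the dynamic requires.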

Bermudan option pricing
Objective, Dynamic Programming Equations

Our objective is to evaluate the value of the option:

    max_τ E(α^τ g(X_τ))

where τ is taken among the stopping times adapted to the filtration induced by the price process (X_t). Let Q_t(x) denote the expected gain at t if X_t = x and we do not exercise the option. Since the option must be exercised before T + 1, we have Q_T(x) = 0, and, for all t < T:

    Q_t(x) = α E( max( g(X_{t+1}), Q_{t+1}(X_{t+1}) ) | X_t = x )

In this set of stochastic dynamic programming equations, as opposed to the classical schemes, the maximization occurs inside the expectation, so that the estimation of Q_t can be performed by Monte Carlo simulation.

Resolution of the stochastic DP problem
Approximate Dynamic Programming, Quantization

We cannot solve this set of equations directly by non-approximated dynamic programming, since the expectation can typically only be estimated through Monte Carlo simulation, and Q_t is a function, i.e., an infinite-dimensional object. Approximate dynamic programming [Bellman and Dreyfus, 1959], or the regression method of [Longstaff and Schwartz, 2001], alleviates the infinite dimension by parametrizing the functions Q_t. Let (f_i) be a predefined family of functions of the state x and A = (a_i) be a parameter vector. We search for Q_t among the linear combinations of the (f_i):

    Q_t(x) = Σ_i a_i f_i(x)

This approach is usually very efficient but not optimal, and the error is hard to measure. Quantization [Bally et al., 2002] is a very interesting sub-case, with optimality characteristics for conditional expectation approximation.
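A minimal sketch of this parametric regression step. The basis 1, x, x², the sample size, and the noisy targets are illustrative choices of ours, not the talk's:

```python
import numpy as np

# Hypothetical basis (f_i): the monomials 1, x, x^2. We fit the parameter
# vector A = (a_i) by least squares on simulated pairs (x_k, y_k), in the
# spirit of regression-based approximate dynamic programming.
rng = np.random.default_rng(1)
x = rng.uniform(0.5, 1.5, 500)                                   # sampled states
y = np.maximum(0.0, 1.0 - x) + 0.05 * rng.standard_normal(500)   # noisy targets
F = np.vander(x, 3, increasing=True)          # design matrix: columns 1, x, x^2
a, *_ = np.linalg.lstsq(F, y, rcond=None)     # parameter vector A = (a_i)
Q = lambda z: np.vander(np.atleast_1d(z), 3, increasing=True) @ a
```

Whatever the fitted coefficients, Q is constrained to the span of the chosen basis, which is exactly the loss of optimality discussed above.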

Resolution of the stochastic DP problem
Robbins-Monro algorithm

For a fixed x, we could solve the fixed point problem using the scheme of [Robbins and Monro, 1951], which estimates the expectation

    Q_t(x) = α E( max( g(X_{t+1}), Q_{t+1}(X_{t+1}) ) | X_t = x )

through random samples (y_t^n(x)) of X_{t+1} | X_t = x, and recursively averages the values obtained. Let:

    Δ_t^{n−1}(x, y) = α max( g(y), Q_{t+1}^{n−1}(y) ) − Q_t^{n−1}(x)

For all x and all t we perform:

    Q_t^n(x) = Q_t^{n−1}(x) + ρ_n Δ_t^{n−1}(x, y_t^n(x))

with ρ_n ↓ 0, Σ_n ρ_n = ∞ and Σ_n ρ_n² < ∞.
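The averaging step can be illustrated on a plain expectation: with ρ_n = 1/n the Robbins-Monro recursion reduces to the running sample mean. A toy sketch (our own, not the pricing code itself):

```python
def robbins_monro_mean(samples):
    """Estimate E[Y] by the recursion q_n = q_{n-1} + rho_n * (y_n - q_{n-1})
    with rho_n = 1/n, which satisfies sum rho_n = inf and sum rho_n^2 < inf."""
    q = 0.0
    for n, y in enumerate(samples, start=1):
        rho = 1.0 / n
        q += rho * (y - q)   # same shape as Q_t^n = Q_t^{n-1} + rho_n * Delta
    return q
```

With ρ_n = 1/n the iterate after n samples is exactly their average; slower-decaying stepsizes trade this exactness for faster tracking.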

Resolution of the stochastic DP problem
Temporal differences

Alternatively, the Temporal Difference algorithm of [Sutton, 1988] rewrites the Robbins-Monro algorithm as:

    Q_t^n(·) = Q_t^{n−1}(·) + ρ_n E( Δ_t^{n−1}(X_t, X_{t+1}) δ_{X_t}(·) )

and, instead of updating the function Q_t^n for all states x, randomizes the updated state at each iteration. Let (x_t^n) be random paths of X. The algorithm is:

    Q_t^n(x) = Q_t^{n−1}(x_t^n) + ρ_n Δ_t^{n−1}(x_t^n, x_{t+1}^n)   if x = x_t^n,
    Q_t^n(x) = Q_t^{n−1}(x)                                          otherwise.

This algorithm is still not implementable when the state space is continuous, and not practical when it is discrete with a large cardinality (as with the discretization of a high-dimensional state space).

Approximation of a Dirac

When the state space is continuous, the TD algorithm cannot be implemented, since the updates are pointwise in x_t^n:

    Q_t^n(·) = Q_t^{n−1}(·) + ρ_n Δ_t^{n−1}(x_t^n, x_{t+1}^n) δ_{x_t^n}(·)

We suggest approximating the Dirac δ_{x_t^n}(·) using a kernel (a mollifier) of bandwidth ε_n ↓ 0, using the property:

    f(·) = E( f(X) δ_X(·) ) = lim_{n→∞} E( f(X) (1/ε_n) K_{ε_n}(X, ·) )

Figure: Approximations of a Dirac with Gaussian kernels (1/ε) exp(−(x/ε)²), for ε ∈ {1, 0.5, 0.25}.
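A quick numerical check of the mollifier property, using the Gaussian kernels of the figure; the hat test function and the integration grid are our own illustrative choices:

```python
import numpy as np

def gaussian_kernel(u, eps):
    """Scaled Gaussian kernel (1/eps) * exp(-(u/eps)^2), as plotted in the figure."""
    return np.exp(-(u / eps) ** 2) / eps

def mollify(f, x, eps, grid):
    """Kernel-smoothed value of f at x. The kernel mass is sqrt(pi), hence the
    normalization; as eps -> 0 the result tends to f(x)."""
    w = gaussian_kernel(grid - x, eps)
    dx = grid[1] - grid[0]
    return float(np.sum(f(grid) * w) * dx / np.sqrt(np.pi))

f = lambda u: np.maximum(0.0, 1.0 - np.abs(u))   # hat function, f(0) = 1
grid = np.linspace(-5.0, 5.0, 20001)
approx = [mollify(f, 0.0, eps, grid) for eps in (1.0, 0.5, 0.25)]
# shrinking the bandwidth moves the smoothed value toward f(0) = 1
```

The three values increase toward 1 as ε shrinks, mirroring the sequence of kernels shown in the figure.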

Temporal Differences with kernels

We therefore propose the following temporal difference learning with kernels algorithm:

    Q_t^n(·) = Q_t^{n−1}(·) + ρ_n Δ_t^{n−1}(x_t^n, x_{t+1}^n) (1/ε_n) K_{ε_n}(x_t^n, ·)

Usually K_{ε_n}(x_t^n, ·) = K( (x_t^n − ·)/η_n ), with ε_n = η_n^d and K a d-dimensional kernel.

This algorithm avoids the a priori parametrization of the function Q_t, and we prove its convergence in [Barty et al., 2005]. Moreover, it is easily implementable, requiring at each iteration n only the storage of the coefficient α_t^n := (ρ_n/ε_n) Δ_t^{n−1}(x_t^n, x_{t+1}^n), the points x_t^n, and the shape of K_{ε_n} (usually defined by its bandwidth ε_n), so that:

    Q_t^n(x) = Σ_{i≤n} α_t^i K_{ε_i}(x_t^i, x)
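The kernel-sum representation can be held in a small structure that stores only the coefficients, centers and bandwidths; a sketch with a Gaussian kernel (the class name and API are ours):

```python
import numpy as np

class KernelSum:
    """Q_t stored as sum_{i<=n} alpha_i * K_{eps_i}(x_i, .), keeping only the
    coefficients alpha_i, the centers x_i and the bandwidths eps_i."""

    def __init__(self):
        self.alphas, self.centers, self.bandwidths = [], [], []

    def add(self, alpha, center, eps):
        """Record one update term alpha_n * K_{eps_n}(x_n, .)."""
        self.alphas.append(alpha)
        self.centers.append(center)
        self.bandwidths.append(eps)

    def __call__(self, x):
        """Evaluate the kernel sum at x with a Gaussian kernel."""
        if not self.alphas:
            return 0.0
        a = np.asarray(self.alphas)
        c = np.asarray(self.centers)
        e = np.asarray(self.bandwidths)
        return float(np.sum(a * np.exp(-((x - c) / e) ** 2)))
```

Each TD update appends one term, so evaluating Q_t^n costs O(n); this is the quadratic cost that the Fast Gauss Transform, discussed later, is meant to remove.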

Hypotheses for convergence

Let r_t^n(x) = E( Δ_t^n(X_t, X_{t+1}) | X_t = x ). We assume:

- ∃ b_1 ≥ 0 such that ‖ r_t^{n−1}(·) − E( r_t^{n−1}(X_t) (1/ε_n) K_{ε_n}(X_t, ·) ) ‖ ≤ b_1 η_n ( 1 + ‖ r_t^{n−1}(·) ‖ );
- ∃ b_2 ≥ 0 such that E‖ (1/ε_n) K_{ε_n}(X_t, ·) ‖² ≤ b_2 / ε_n.

The sequences (ρ_n), (ε_n) and (η_n) must be positive and satisfy:

- Σ_n ρ_n = ∞,
- Σ_n ρ_n² / ε_n < ∞,
- and Σ_n b_1 ρ_n η_n < ∞.

These hypotheses are quite similar to those found in other stochastic approximation algorithms with biased estimates, such as [Kiefer and Wolfowitz, 1952]. For example, if S = R^d, suitable sequences are ρ_n = 1/n, ε_n = 1/√n and η_n = n^{−1/(2d)}.

Bermudan option pricing
Effective algorithm

Initialize Q_t^0(·) = 0 for all t. Then, for each iteration n > 0:

1. draw a sample price path (x_t^n),
2. for all t = T − 1, ..., 0, update Q_t^n by:

    Q_t^n(·) = Q_t^{n−1}(·) + ρ_n Δ_t^{n−1}(x_t^n, x_{t+1}^n) (1/ε_n) K_{ε_n}(x_t^n, ·)

with

    Δ_t^{n−1}(x, x′) = α max( g(x′), Q_{t+1}^n(x′) ) − Q_t^{n−1}(x)
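Putting the pieces together, here is a one-dimensional sketch of the effective algorithm, with ρ_n = 1/n and ε_n = 1/√n as in the convergence example. The function name and parameters are illustrative, the kernel-sum evaluation is naive (quadratic in the number of iterations), and no Fast Gauss Transform or averaging is applied:

```python
import numpy as np

def price_bermudan_put(x0, strike, r, sigma, T, n_iter, seed=0):
    """Kernel temporal-difference sketch of the effective algorithm (d = 1):
    Q_t^n = Q_t^{n-1} + rho_n * Delta * (1/eps_n) * K_eps_n(x_t^n, .)
    with Delta(x, x') = alpha * max(g(x'), Q_{t+1}(x')) - Q_t(x)."""
    rng = np.random.default_rng(seed)
    alpha = np.exp(-r)                    # discount factor
    g = lambda x: max(0.0, strike - x)    # intrinsic put value
    Q = [[] for _ in range(T + 1)]        # Q_t as (weight, center, bandwidth) terms; Q_T = 0

    def evaluate(t, x):
        # Q_t(x) = sum_i alpha_i * exp(-((x - c_i)/eps_i)^2)
        return sum(w * np.exp(-((x - c) / h) ** 2) for w, c, h in Q[t])

    for n in range(1, n_iter + 1):
        rho, eps = 1.0 / n, n ** -0.5     # rho_n = 1/n, eps_n = 1/sqrt(n)
        # 1. draw a sample price path x_0..x_T under the discretized dynamic
        incr = (r - 0.5 * sigma**2) + sigma * rng.standard_normal(T)
        x = x0 * np.exp(np.concatenate(([0.0], np.cumsum(incr))))
        # 2. backward updates for t = T-1, ..., 0
        for t in range(T - 1, -1, -1):
            delta = alpha * max(g(x[t + 1]), evaluate(t + 1, x[t + 1])) - evaluate(t, x[t])
            Q[t].append((rho * delta / eps, x[t], eps))
    return max(g(x0), evaluate(0, x0))
```

Note that the backward sweep makes Q_{t+1}^n available (already updated at this iteration) when updating Q_t^n, matching the index in the Δ definition above.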

Bermudan option pricing
100 iterates

Figure: Q^100 and the error Q^100 − Q^* after 100 iterates, as surfaces over the time t and the price x (log scale).

Bermudan option pricing
1000 iterates

Figure: Q^1000 and the error Q^1000 − Q^* after 1000 iterates, as surfaces over the time t and the price x (log scale).

Bermudan option pricing
10000 iterates

Figure: Q^10000 and the error Q^10000 − Q^* after 10000 iterates, as surfaces over the time t and the price x (log scale).

Bermudan option pricing
Convergence speed

Figure: ‖Q^k − Q^*‖₂ versus the number of iterations k (log-log plot, k from 1 to 10000).

Comparisons inside Premia 8
Methodology

This technique has been implemented in Premia 8 (cf. http://www.premia.fr), so that comparisons can be made. Basic comparison methodology:

1. Draw a large number of options (put over minimum in dimension 2; random volatility, maturity, strike and interest rate);
2. Price the options using the different numerical techniques available in Premia;
3. Plot the results...

Comparisons inside Premia 8
Price error

Figure: Price error for each method: BGRS with 50,000 iterations [2.5 s], BGRS with 100,000 iterations [6 s], Barraquand-Martineau [1.5 s], Longstaff-Schwartz [1.5 s], and Lions-Regnier [33 s].

Comparisons inside Premia 8
Error on hedging values

Figure: Error on the hedging values Δ1 (left) and Δ2 (right) for Longstaff-Schwartz [1.5 s], Barraquand-Martineau [1.5 s], Lions-Regnier [33 s], BGRS with 50,000 iterations [2.5 s], and BGRS with 100,000 iterations [6 s].

Accelerating the computation

Recall that at each time step we have to compute:

    Q_t^n(·) = Q_t^{n−1}(·) + ρ_n Δ_t^{n−1}(x_t^n, x_{t+1}^n) (1/ε_n) K_{ε_n}(x_t^n, ·)

with

    Δ_t^{n−1}(x, x′) = α max( g(x′), Q_{t+1}^n(x′) ) − Q_t^{n−1}(x)

The function Q_t^n is represented, at iteration n of the algorithm, by a sum of n kernels. The computational complexity is therefore quadratic in the number of iterations. A large speed-up can be obtained through the Fast Gauss Transform [Greengard and Strain, 1991, Yang et al., 2003], which, with minor adaptations, can be used in our case to compute the sum of the n kernels in almost constant time.

Accelerating the convergence

Three techniques are used to accelerate the convergence:

1. Averaging [Polyak and Juditsky, 1992]: use large stepsizes ρ_n for faster convergence. This results in a chaotic behavior, which is compensated by averaging the last, say, 10% of the iterates to obtain the final result:

    Q̂_t^N(·) = (1/(0.1 N)) Σ_{n=0.9N}^{N} Q_t^n(·)

2. Quasi Monte Carlo sequences: the use of low-discrepancy sequences such as [Sobol, 1967] speeds up the convergence of the conditional expectation estimates.

3. Brownian bridge: helps preserve the low-discrepancy property while drawing price paths.
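The tail-averaging step (technique 1) can be sketched as follows for scalar iterates; the function name and the `tail` parameter are ours:

```python
import numpy as np

def polyak_tail_average(iterates, tail=0.1):
    """Average the last `tail` fraction of the iterates, as in the
    Polyak-Juditsky averaging described above (large steps, then average
    roughly the last 10% to obtain the final estimate)."""
    iterates = np.asarray(iterates, dtype=float)
    start = int(np.floor((1.0 - tail) * len(iterates)))
    return iterates[start:].mean(axis=0)
```

In the pricing algorithm the iterates are functions represented by kernel sums, so the same average is taken coefficient-wise over the stored kernel terms.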

Conclusion

Main advantages:

- Nonparametric method for option pricing that does not require an a priori discretization;
- No restriction on the dynamics of the price process nor on the pay-off;
- Easy to implement;
- Can be applied to a large class of dynamic programming problems besides option pricing;
- Premia 8 provides an efficient implementation in dimensions 1 and 2.

Improvement directions:

- Improve computation times for high dimensions;
- Online tuning of the stepsizes (currently empirical).

Bibliography

Bally, V., Pagès, G., and Printems, J. (2002). First order schemes in the numerical quantization method. Prépublications du laboratoire de probabilités et modèles aléatoires, (735):21–41.

Barty, K., Roy, J.-S., and Strugarek, C. (2005). Temporal difference learning with kernels. Optimization Online. http://www.optimization-online.org/DB_HTML/2005/05/1133.html.

Bellman, R. and Dreyfus, S. (1959). Functional approximations and dynamic programming. Mathematical Tables and Other Aids to Computation, 13:247–251.

Carriere, J. F. (1996). Valuation of the early-exercise price for derivative securities using simulations and splines. Insurance: Mathematics and Economics, 19:19–30.

Greengard, L. and Strain, J. (1991). The fast Gauss transform. SIAM Journal on Scientific and Statistical Computing.

Kiefer, J. and Wolfowitz, J. (1952). Stochastic estimation of the maximum of a regression function. Annals of Mathematical Statistics, 23:462–466.

Longstaff, F. A. and Schwartz, E. S. (2001). Valuing American options by simulation: a simple least-squares approach. Review of Financial Studies, 14(1):113–147.

Polyak, B. T. and Juditsky, A. B. (1992). Acceleration of stochastic approximation by averaging. SIAM Journal on Control and Optimization, 30:838–855.

Robbins, H. and Monro, S. (1951). A stochastic approximation method. Annals of Mathematical Statistics, 22:400–407.

Samuelson, P. A. (1965). Rational theory of warrant pricing. Industrial Management Review, 6:13–31.

Sobol, I. (1967). On the distribution of points in a cube and the approximate evaluation of integrals. Zh. Vychisl. Mat. i Mat. Fiz., 7:784–802.

Sutton, R. S. (1988). Learning to predict by the method of temporal differences. Machine Learning, 3:9–44.

Van Roy, B. and Tsitsiklis, J. N. (2001). Regression methods for pricing complex American-style options. IEEE Transactions on Neural Networks, 12(4):694–703.

Yang, C., Duraiswami, R., Gumerov, N., and Davis, L. (2003). Improved fast Gauss transform and efficient kernel density estimation. IEEE International Conference on Computer Vision, pages 464–471.