Momentum Strategies

University of Paris 7 - Lyxor Asset Management

Master 2 project

Momentum Strategies: From novel Estimation Techniques to Financial Applications

Author: Tung-Lam Dao

Supervisor: Prof. Thierry Roncalli

September 30, 2011

Contents

Acknowledgments
Introduction

1 Trading Strategies with L1 Filtering
  1.1 Introduction
  1.2 Motivations
  1.3 L1 filtering schemes
    1.3.1 Application to trend-stationary process
    1.3.2 Extension to mean-reverting process
    1.3.3 Mixing trend and mean-reverting properties
    1.3.4 How to calibrate the regularization parameters?
  1.4 Application to momentum strategies
    1.4.1 Estimating the optimal filter for a given trading date
    1.4.2 Backtest of a momentum strategy
  1.5 Extension to the multivariate case
  1.6 Conclusion

2 Volatility Estimation for Trading Strategies
  2.1 Introduction
  2.2 Range-based estimators of volatility
    2.2.1 Range based daily data
    2.2.2 Basic estimator
    2.2.3 High-low estimators
    2.2.4 How to eliminate both drift and opening effects?
    2.2.5 Numerical simulations
    2.2.6 Backtest
  2.3 Estimation of realized volatility
    2.3.1 Moving-average estimator
    2.3.2 IGARCH estimator
    2.3.3 Extension to range-based estimators
    2.3.4 Calibration procedure of the estimators of realized volatility
  2.4 High-frequency volatility estimators
    2.4.1 Microstructure effect
    2.4.2 Two time-scale volatility estimator
    2.4.3 Numerical implementation and backtesting
  2.5 Conclusion

3 Support Vector Machine in Finance
  3.1 Introduction
  3.2 Support vector machine at a glance
    3.2.1 Basic ideas of SVM
    3.2.2 ERM and VRM frameworks
  3.3 Numerical implementations
    3.3.1 Dual approach
    3.3.2 Primal approach
    3.3.3 Model selection - Cross validation procedure
  3.4 Extension to SVM multi-classification
    3.4.1 Basic idea of multi-classification
    3.4.2 Implementations of multiclass SVM
  3.5 SVM-regression in finance
    3.5.1 Numerical tests on SVM-regressors
    3.5.2 SVM-Filtering for forecasting the trend of signal
    3.5.3 SVM for multivariate regression
  3.6 SVM-classification in finance
    3.6.1 Test of SVM-classifiers
    3.6.2 SVM for classification
    3.6.3 SVM for score construction and stock selection
  3.7 Conclusion

4 Analysis of Trading Impact in the CTA strategy
  4.1 Introduction
  4.2 Trading impact of CTA for single risky asset
    4.2.1 Exponential moving average
    4.2.2 Trend-following strategy
    4.2.3 Trading impact
  4.3 Trading impact of CTA for multi-assets
    4.3.1 Uncorrelated assets
    4.3.2 Correlated assets
  4.4 Analysis of Trading Impact in a Toy model
    4.4.1 Analytical results
    4.4.2 Numerical results
  4.5 Extension to the stochastic trend model
    4.5.1 Model of trend mean-reverting
    4.5.2 Numerical simulation
  4.6 Analysis of the trading impact versus the market evolution
    4.6.1 Analysis of constant-mixed strategy
    4.6.2 CTA efficiency in function of correlation and Sharpe ratio
    4.6.3 CTA efficiency in function of Sharpe ratio and asset number
    4.6.4 CTA efficiency in function of correlation and asset number
  4.7 Conclusion

Conclusions

A Appendix of chapter 1
  A.1 Computational aspects of L1, L2 filters
    A.1.1 The dual problem
    A.1.2 The interior-point algorithm
    A.1.3 The scaling of smoothing parameter of L1 filter
    A.1.4 Calibration of the L2 filter
    A.1.5 Implementation issues

B Appendix of chapter 2
  B.1 Estimator of volatility
    B.1.1 Estimation with realized return

C Appendix of chapter 3
  C.1 Dual problem of SVM
    C.1.1 Hard-margin SVM classifier
    C.1.2 Soft-margin SVM classifier
    C.1.3 ε-SV regression
  C.2 Newton optimization for the primal problem
    C.2.1 Quadratic loss function
    C.2.2 Soft-margin SVM

D Appendix of chapter 4
  D.1 Trading impact in the univariate case
    D.1.1 Basic quantities for calculations
    D.1.2 Conditional expectation
    D.1.3 Probabilities
  D.2 Trading impact in the multivariate case
    D.2.1 Series expansion of the quadratic form of Gaussian variables
    D.2.2 Proof of Proposition 3.1

Acknowledgments

During six unforgettable months in the R&D team of Lyxor Asset Management, I enjoyed every moment. Beyond the professional experience I gained from everyone in the department, I really appreciated the great atmosphere in the team, which motivated me every day.

I would first like to thank Thierry Roncalli for his supervision during my stay in the team. I could never have imagined learning so many interesting things during my internship without his direction and his confidence. Thierry introduced me to the financial concepts of the asset management world in a very interactive way. I would say that I learnt finance in every single discussion with him. He taught me how to combine learning and practice. On the professional side, Thierry helped me fill the gaps in my financial knowledge by allowing me to work on various interesting topics. He gave me the confidence to present my understanding of this field. In daily life, Thierry shared his own experience and taught me how to adapt to this new world.

I would like to thank Nicolas Gaussel for his warm welcome in the Quantitative Management department, for his confidence and for his encouragement during my stay at Lyxor. I had the chance to work with him on a very interesting topic concerning the CTA strategy, which plays an important role in asset management.

I would like to thank Benjamin Bruder, my nearest neighbor, for his guidance and his supervision throughout my internship. Informally, Benjamin was almost my co-advisor. I must say that I owe him a lot for his patience in every daily discussion, teaching me and working through the many questions that came up in my projects. I am really grateful for his sense of humor, which warmed up the atmosphere.

To all members of the R&D team, I would like to express my gratitude for their help, their advice and everything that they shared with me during my stay. I am really happy to be one of them. Thank you, Jean-Charles, for your friendship, for all our daily discussions and for your support of all the initiatives in my projects. Many thanks to Stephane, who always cheered up the breaks with his intelligent humor. I would say that I learnt from him the most interesting view of the "Binomial world". Thank you, Karl, for explaining your macro-world. Thank you, Pierre, for all your help with data collection and for the passion in all your explanations, such as the story of "Merrill Lynch's investment clock". Thank you, Zelia, for a very stimulating collaboration on my last project and for the great time during our internship.

Among those on the other side of the room, I would like to thank Philippe Balthazard for his comments on my projects and his point of view on financial aspects. Thank you, Hoang-Phong Nguyen, for your help with the database and your support during my stay. There are many other people with whom I had the chance to interact but whom I cannot name here.

Thanks to my parents and my sister, who always believe in me and supported me during my change to a new direction. Finally, I would like to reserve my greatest thanks for my wife and my son, for their love and daily encouragement. They were always behind me during the most difficult moments of this year.


Introduction

During the internship in the Research and Development team of Lyxor Asset Management, we studied novel techniques which are applicable to asset management. We focused on the analysis of some special classes of momentum strategies, such as trend-following strategies and volatility-target (vol-target) strategies. These strategies play a crucial role in quantitative management, as they aim to optimize profit based on exploitable signals of market inefficiency and to limit market risk via an efficient control of the volatility.

The objectives of this report are two-fold. We first studied some novel techniques from statistics and signal processing, such as trend filtering, daily and high-frequency volatility estimators, and support vector machines. We employed these techniques to extract interesting financial signals. These signals are used to implement the momentum strategies which are described in detail in each chapter of this report. The second objective concerns the study of the performance of these strategies based on the general risk-return analysis framework (see B. Bruder and N. Gaussel, 7th White Paper, Lyxor).

This report is organized as follows. In the first chapter, we discuss various implementations of L1 filtering in order to detect some properties of noisy signals. This filter uses an L1 penalty condition in order to obtain a filtered signal composed of a set of straight trends or steps. This penalty condition, which determines the number of breaks, is implemented in a constrained least squares problem and is represented by a regularization parameter λ which is estimated by a cross-validation procedure. Financial time series are usually characterized by a long-term trend (called the global trend) and some short-term trends (called local trends). A combination of these two time scales can form a simple model describing a global trend process with some mean-reverting properties. Explicit applications to momentum strategies are also discussed in detail with appropriate uses of the trend configurations.

We next review in the second chapter various techniques for estimating the volatility. We start by discussing the estimators based on the range of daily monitoring data, then we consider the stochastic volatility model in order to determine the instantaneous volatility. At high trading frequency, stock prices are perturbed by an additional noise, the so-called microstructure noise. This effect comes from the bid-ask spread and the short time scale. Within a short time interval, the trading price does not reflect exactly the equilibrium price determined by supply and demand but bounces between the bid and ask prices. In the second part, we discuss the effect of the microstructure noise on the volatility estimation. It is a very important topic for the large field of high-frequency trading. Examples of backtesting on indices and stocks illustrate the efficiency of the techniques considered.

The third chapter is dedicated to a general machine-learning framework. We review the well-known machine-learning technique called the support vector machine (SVM). This technique can be employed in different contexts such as classification, regression or density estimation according to Vapnik [1998]. Within the scope of this report, we first give an overview of this method and its numerical implementations, then bridge it to financial applications such as trend forecasting, stock selection, sector recognition or score construction.

We finish in Chapter 4 with the performance analysis of the CTA strategy. We first review trend-following strategies within the Kalman filter framework and study the impact of the trend estimation error. We start the discussion with a momentum strategy on a single asset, then generalize the analysis to the multi-asset case. In order to construct the allocation strategy, we employ the observed trend, which is filtered by an exponential moving average. It can be demonstrated that the cumulated return of the strategy can be split into two important parts. The first one is called the "Option Profile" and involves only the current measured trend. This idea is very similar in concept to the straddle profile suggested by Fung and Hsieh (2001). The second part is called the "Trading Impact" and involves an integral of the measured trend over the trading period. We focus on the second quantity by estimating its probability distribution function and the associated gain and loss expectations. We illustrate how the number of assets and their correlations influence the performance of a strategy via a "toy model". This study reveals important results which can be directly tested on CTA funds.


Chapter 1

Trading Strategies with L1 Filtering

In this chapter, we discuss various implementations of L1 filtering in order to detect some properties of noisy signals. This filter uses an L1 penalty condition in order to obtain a filtered signal composed of a set of straight trends or steps. This penalty condition, which determines the number of breaks, is implemented in a constrained least squares problem and is represented by a regularization parameter λ which is estimated by a cross-validation procedure. Financial time series are usually characterized by a long-term trend (called the global trend) and some short-term trends (called local trends). A combination of these two time scales can form a simple model describing a global trend process with some mean-reverting properties. Explicit applications to momentum strategies are also discussed in detail with appropriate uses of the trend configurations.

Keywords: Momentum strategy, L1 filtering, L2 filtering, trend-following, mean-reverting.

1.1 Introduction

Trend detection is a major task of time series analysis from both a mathematical and a financial point of view. The trend of a time series is considered as the component containing the global change, in contrast to the local change due to the noise. Trend filtering is not only a denoising problem; it must also take into account the dynamics of the underlying process. This explains why mathematical approaches to trend extraction have a long history and why the subject still attracts great interest in the scientific community [1]. From an investment perspective, trend filtering is at the core of most momentum strategies developed in the asset management industry and the hedge fund community in order to improve performance and to limit the risk of portfolios.

[1] For a general review, see Alexandrov et al. (2008).

The chapter is organized as follows. In Section 1.2, we discuss the trend-cycle decomposition of time series and review general properties of L1 and L2 filtering. In Section 1.3, we describe the L1 filter with its various extensions and the calibration procedure. In Section 1.4, we apply L1 filters to some momentum strategies and present the results of some backtests with the S&P 500 index. In Section 1.5, we discuss the possible extension to the multivariate case, and we conclude in the last section.

1.2 Motivations

In economics, the trend-cycle decomposition plays an important role in describing a non-stationary time series in terms of permanent and transitory stochastic components. Generally, the permanent component is identified with a trend, whereas the transitory component may be a noise or a stochastic cycle. Moreover, the literature on business cycles has produced a large body of empirical research on this topic (see for example Cleveland and Tiao (1976), Beveridge and Nelson (1981), Harvey (1991) or Hodrick and Prescott (1997)). The last authors introduced a new method to estimate the trend of long-run GDP. This method, widely used by economists, is based on L2 filtering. Recently, Kim et al. (2009) have developed a similar filter by replacing the L2 penalty function by an L1 penalty function.

Let us consider a time series $y_t$ which can be decomposed into a slowly varying trend $x_t$ and a rapidly varying noise process $\varepsilon_t$:

$$y_t = x_t + \varepsilon_t$$

Let us first recall the well-known L2 filter (the so-called Hodrick-Prescott filter). This scheme determines the trend $x_t$ by minimizing the following objective function:

$$\frac{1}{2}\sum_{t=1}^{n}\left(y_t - x_t\right)^2 + \lambda\sum_{t=2}^{n-1}\left(x_{t-1} - 2x_t + x_{t+1}\right)^2$$

with $\lambda > 0$ the regularization parameter which controls the competition between the smoothness of $x_t$ and the residual $y_t - x_t$ (or the noise $\varepsilon_t$). We remark that the second term involves the discrete second derivative of the trend $x_t$, which characterizes the smoothness of the curve. Minimizing this objective function gives a solution which is a trade-off between fidelity to the data and the smoothness of the curve. In finance, this scheme does not give a clear signature of the market tendency. By contrast, if we replace the L2 norm by the L1 norm in the objective function, we can obtain more interesting properties. Therefore, Kim et al. (2009) propose to consider the following objective function:

$$\frac{1}{2}\sum_{t=1}^{n}\left(y_t - x_t\right)^2 + \lambda\sum_{t=2}^{n-1}\left|x_{t-1} - 2x_t + x_{t+1}\right|$$

This problem is closely related to the Lasso regression of Tibshirani (1996) or the L1-regularized least squares problem of Daubechies et al. (2004). Here, taking the L1 norm forces the second derivative of the filtered signal to be zero except at a small number of points. Hence, the filtered signal is composed of a set of straight trends and breaks [2]. The competition between the two terms of the objective function becomes a competition between the number of straight trends (or the number of breaks) and the closeness to the raw data. Therefore, the smoothing parameter $\lambda$ plays an important role in determining the number of breaks. In what follows, we present briefly how the L1 filter works for trend detection and its extension to mean-reverting processes. The calibration procedure for the parameter $\lambda$ will also be discussed in detail.

1.3 L1 filtering schemes

1.3.1 Application to trend-stationary process

The Hodrick-Prescott scheme discussed in the last section can be rewritten in the vector space $\mathbb{R}^n$ with its L2 norm $\|\cdot\|_2$ as:

$$\frac{1}{2}\|y - x\|_2^2 + \lambda\|Dx\|_2^2$$

where $y = (y_1, \ldots, y_n)$, $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$ and the operator $D$ is the $(n-2) \times n$ matrix:

$$D = \begin{pmatrix}
1 & -2 & 1 & & & & \\
 & 1 & -2 & 1 & & & \\
 & & \ddots & \ddots & \ddots & & \\
 & & & 1 & -2 & 1 & \\
 & & & & 1 & -2 & 1
\end{pmatrix} \tag{1.1}$$

Setting the gradient $-(y - x) + 2\lambda D^\top D x$ to zero gives the exact solution of this estimation:

$$x^\star = \left(I + 2\lambda D^\top D\right)^{-1} y$$

The explicit expression of $x^\star$ allows a very simple numerical implementation with sparse matrices. As the L2 filter is a linear filter, the regularization parameter $\lambda$ is calibrated by comparison with the usual moving-average filter. The details of the calibration procedure are given in Appendix A.1.4.

The idea of the L2 filter can be generalized to a larger class of so-called Lp filters by using an Lp penalty condition instead of the L2 penalty. This generalization is already discussed in the work of Daubechies et al. (2004) for the linear inverse problem and in the Lasso regression problem by Tibshirani (1996). If we consider an L1 filter, the objective function becomes:

$$\frac{1}{2}\sum_{t=1}^{n}\left(y_t - x_t\right)^2 + \lambda\sum_{t=2}^{n-1}\left|x_{t-1} - 2x_t + x_{t+1}\right|$$

which is equivalent to the following vector form:

$$\frac{1}{2}\|y - x\|_2^2 + \lambda\|Dx\|_1$$

It has been demonstrated in Kim et al. (2009) that the dual problem of this L1 filter scheme is a quadratic program with bound constraints. The details of this derivation are given in Appendix A.1.1. In order to optimize the numerical computation speed, we follow Kim et al. (2009) by using a "primal-dual interior point" method (see Appendix A.1.2). In the following, we check the efficiency of this technique on various trend-stationary processes.

The first model consists of data simulated by a set of straight trend lines with a white noise perturbation:

$$\begin{cases}
y_t = x_t + \varepsilon_t \\
\varepsilon_t \sim \mathcal{N}\left(0, \sigma^2\right) \\
x_t = x_{t-1} + v_t \\
\Pr\left\{v_t = v_{t-1}\right\} = p \\
\Pr\left\{v_t = b\left(\mathcal{U}_{[0,1]} - \tfrac{1}{2}\right)\right\} = 1 - p
\end{cases} \tag{1.2}$$

We present in Figure 1.1 the comparison between the L1-T and HP filtering schemes [3]. The top-left graph is the real trend $x_t$ whereas the top-right graph presents the noisy signal $y_t$. The bottom graphs show the results of the L1-T and HP filters. Here, we have chosen $\lambda$ = 5 258 for the L1-T filtering and $\lambda$ = 1 217 464 for the HP filtering. This choice of $\lambda$ for L1-T filtering is based on the number of breaks in the trend, which is fixed to 10 in this example [4].

The second model is a random walk generated by the following process:

$$\begin{cases}
y_t = y_{t-1} + v_t + \varepsilon_t \\
\varepsilon_t \sim \mathcal{N}\left(0, \sigma^2\right) \\
\Pr\left\{v_t = v_{t-1}\right\} = p \\
\Pr\left\{v_t = b\left(\mathcal{U}_{[0,1]} - \tfrac{1}{2}\right)\right\} = 1 - p
\end{cases} \tag{1.3}$$

We present in Figure 1.2 the comparison between L1-T filtering and HP filtering on this second model [5].

[2] A break is the position where the trend of the signal changes.
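As an illustration of the first test model and of the closed-form L2 solution, the following minimal Python sketch simulates model (1.2) and applies the HP filter by solving $(I + 2\lambda D^\top D)x = y$. It is only a sketch under stated assumptions: the function names `simulate_trend_model` and `hp_filter` are hypothetical, the matrix is stored densely for readability (the text recommends sparse matrices), and a plain banded Gaussian elimination replaces any library solver.

```python
import random


def simulate_trend_model(n, p, b, sigma, seed=0):
    """Simulate model (1.2): piecewise-linear trend plus white noise."""
    rng = random.Random(seed)
    x, v = 0.0, 0.0
    trend, signal = [], []
    for _ in range(n):
        # With probability 1 - p, redraw the slope as b * (U[0,1] - 1/2);
        # otherwise keep the previous slope.
        if rng.random() >= p:
            v = b * (rng.random() - 0.5)
        x += v
        trend.append(x)
        signal.append(x + rng.gauss(0.0, sigma))
    return trend, signal


def hp_filter(y, lam):
    """Solve (I + 2*lam*D'D) x = y, with D the second-difference operator (1.1)."""
    n = len(y)
    # Assemble A = I + 2*lam*D'D as a sum of outer products of the rows of D
    # (dense storage, an assumption made for clarity only).
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    stencil = (1.0, -2.0, 1.0)
    for r in range(n - 2):
        for i, di in enumerate(stencil):
            for j, dj in enumerate(stencil):
                a[r + i][r + j] += 2.0 * lam * di * dj
    # A is symmetric positive definite with bandwidth 2, so Gaussian
    # elimination without pivoting is safe and only touches the band.
    x = [float(v) for v in y]
    for k in range(n):
        for i in range(k + 1, min(k + 3, n)):
            f = a[i][k] / a[k][k]
            for j in range(k, min(k + 3, n)):
                a[i][j] -= f * a[k][j]
            x[i] -= f * x[k]
    for k in range(n - 1, -1, -1):
        s = sum(a[k][j] * x[j] for j in range(k + 1, min(k + 3, n)))
        x[k] = (x[k] - s) / a[k][k]
    return x


if __name__ == "__main__":
    # Smaller sample than in the text (n = 2000) to keep the dense sketch cheap.
    trend, signal = simulate_trend_model(n=300, p=0.99, b=0.5, sigma=15.0)
    smooth = hp_filter(signal, lam=1000.0)
    print(len(smooth))
```

A straight line is left unchanged by this filter because $Dy = 0$ for any affine signal, which gives a quick sanity check of the implementation.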

[3] We consider n = 2000 observations. The parameters of the simulation are p = 0.99, b = 0.5 and σ = 15.
[4] We discuss how to obtain λ in the next section.
[5] The parameters of the simulation are p = 0.993, b = 5 and σ = 15.

Figure 1.1: L1-T filtering versus HP filtering for the model (1.2). [Four panels: Signal, Noisy signal, L1-T filter, HP filter; horizontal axis t.]

Figure 1.2: L1-T filtering versus HP filtering for the model (1.3). [Four panels: Signal, Noisy signal, L1-T filter, HP filter; horizontal axis t.]

1.3.2 Extension to mean-reverting process

As shown in the last paragraph, the use of the L1 penalty on the second derivative gives the correct description of the signal tendency. Hence, a similar idea can be applied to derivatives of other orders. We present here the extension of this L1 filtering technique to the case of mean-reverting processes. If we now impose the L1 penalty condition on the first derivative, we can expect to get a fitted signal with zero slope, that is, a piecewise constant signal. The cost of this penalty will be proportional to the number of jumps. In this case, we would like to minimize the following objective function:

$$\frac{1}{2}\sum_{t=1}^{n}\left(y_t - x_t\right)^2 + \lambda\sum_{t=2}^{n}\left|x_t - x_{t-1}\right|$$

or in vector form:

$$\frac{1}{2}\|y - x\|_2^2 + \lambda\|Dx\|_1$$

Here the operator $D$ is the $(n-1) \times n$ matrix which is the discrete version of the first-order derivative:

$$D = \begin{pmatrix}
-1 & 1 & & & \\
 & -1 & 1 & & \\
 & & \ddots & \ddots & \\
 & & & -1 & 1
\end{pmatrix} \tag{1.4}$$

We may apply the same minimization algorithm as previously (see Appendix A.1.1). To illustrate this, we consider a model with step trend lines perturbed by a white noise process:

$$\begin{cases}
y_t = x_t + \varepsilon_t \\
\varepsilon_t \sim \mathcal{N}\left(0, \sigma^2\right) \\
\Pr\left\{x_t = x_{t-1}\right\} = p \\
\Pr\left\{x_t = b\left(\mathcal{U}_{[0,1]} - \tfrac{1}{2}\right)\right\} = 1 - p
\end{cases} \tag{1.5}$$

We employ this model for testing the L1-C filtering and the HP filtering adapted to the first derivative [6], which corresponds to the following optimization program:

$$\min \; \frac{1}{2}\sum_{t=1}^{n}\left(y_t - x_t\right)^2 + \lambda\sum_{t=2}^{n}\left(x_t - x_{t-1}\right)^2$$

In Figure 1.3, we report the corresponding results [7]. For the second test, we consider a mean-reverting process (Ornstein-Uhlenbeck process) whose mean value follows a regime-switching process:

$$\begin{cases}
y_t = y_{t-1} + \theta\left(x_t - y_{t-1}\right) + \varepsilon_t \\
\varepsilon_t \sim \mathcal{N}\left(0, \sigma^2\right) \\
\Pr\left\{x_t = x_{t-1}\right\} = p \\
\Pr\left\{x_t = b\left(\mathcal{U}_{[0,1]} - \tfrac{1}{2}\right)\right\} = 1 - p
\end{cases} \tag{1.6}$$

Here, $x_t$ is the process which characterizes the mean value and $\theta$ is inversely proportional to the return time to the mean value. In Figure 1.4, we show how the L1-C filter captures the original signal in comparison with the HP filter [8].

[6] We use the term HP filter in order to keep homogeneous notations. However, we notice that this filter is indeed the FLS filter proposed by Kalaba and Tesfatsion (1989) when the exogenous regressors are only a constant.
[7] The parameters are p = 0.998, b = 50 and σ = 8.
[8] For the simulation of the Ornstein-Uhlenbeck process, we have chosen p = 0.9985, b = 20, θ = 0.1 and σ = 2.

Figure 1.3: L1-C filtering versus HP filtering for the model (1.5). [Four panels: Signal, Noisy signal, L1-C filter, HP filter; horizontal axis t.]

Figure 1.4: L1-C filtering versus HP filtering for the model (1.6). [Four panels: Signal, Noisy signal, L1-C filter, HP filter; horizontal axis t.]
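To make the L1-C and L1-T schemes concrete, here is a minimal Python sketch that solves the dual quadratic program with bound constraints, namely minimizing $\frac{1}{2}\nu^\top D D^\top \nu - y^\top D^\top \nu$ subject to $-\lambda \le \nu_i \le \lambda$, and recovers the filtered signal as $x = y - D^\top \nu$. Note the substitution: the text uses a primal-dual interior-point method, whereas this sketch uses plain projected gradient descent, which is much slower but fits in a few lines. The names `apply_d`, `apply_dt` and `l1_filter`, the iteration count and the step size are assumptions made for illustration.

```python
# Row stencils of the difference operators: (1.4) for order 1, (1.1) for order 2.
STENCILS = {1: (-1.0, 1.0), 2: (1.0, -2.0, 1.0)}


def apply_d(x, stencil):
    """Return D x for the difference operator with the given row stencil."""
    k = len(stencil)
    return [sum(c * x[i + j] for j, c in enumerate(stencil))
            for i in range(len(x) - k + 1)]


def apply_dt(nu, stencil, n):
    """Return D' nu as a vector of length n."""
    out = [0.0] * n
    for i, v in enumerate(nu):
        for j, c in enumerate(stencil):
            out[i + j] += c * v
    return out


def l1_filter(y, lam, order=1, iterations=5000):
    """L1-C (order=1) or L1-T (order=2) filter via projected gradient on the dual."""
    stencil = STENCILS[order]
    # The eigenvalues of DD' are bounded by 4**order, so this step is at most 1/L.
    step = 1.0 / 4.0 ** order
    n = len(y)
    nu = [0.0] * (n - order)
    x = [float(v) for v in y]
    for _ in range(iterations):
        # D x = D y - DD' nu is minus the gradient of the dual objective.
        dx = apply_d(x, stencil)
        # Gradient step followed by projection onto the box [-lam, lam].
        nu = [min(lam, max(-lam, v + step * g)) for v, g in zip(nu, dx)]
        shift = apply_dt(nu, stencil, n)
        x = [yi - si for yi, si in zip(y, shift)]
    return x


if __name__ == "__main__":
    steps = [0.0] * 4 + [10.0] * 4
    print(l1_filter(steps, lam=1.0, order=1))
```

On the noiseless step signal of the demo, the L1-C solution keeps the jump and pulls each of the two levels toward the other by $\lambda/4$, since each level contains four observations. Projected gradient converges slowly for long series or large $\lambda$, so an interior-point solver remains preferable in practice.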

1.3.3 Mixing trend and mean-reverting properties

We now combine the two schemes proposed above. In this case, we define two regularization parameters $\lambda_1$ and $\lambda_2$ corresponding to the two penalty conditions $\sum_{t=2}^{n}\left|x_t - x_{t-1}\right|$ and $\sum_{t=2}^{n-1}\left|x_{t-1} - 2x_t + x_{t+1}\right|$. Our objective function for the primal problem now becomes:

$$\frac{1}{2}\sum_{t=1}^{n}\left(y_t - x_t\right)^2 + \lambda_1\sum_{t=2}^{n}\left|x_t - x_{t-1}\right| + \lambda_2\sum_{t=2}^{n-1}\left|x_{t-1} - 2x_t + x_{t+1}\right|$$

which can again be rewritten in matrix form:

$$\frac{1}{2}\|y - x\|_2^2 + \lambda_1\|D_1 x\|_1 + \lambda_2\|D_2 x\|_1$$

where the operators $D_1$ and $D_2$ are respectively the $(n-1) \times n$ and $(n-2) \times n$ matrices defined in equations (1.4) and (1.1). In Figures 1.5 and 1.6, we test the efficiency of the mixing scheme on the straight trend lines model (1.2) and the random walk model (1.3) [9].

Figure 1.5: L1-TC filtering versus HP filtering for the model (1.2)

[Figure 1.5 panels: Signal, Noisy signal, L1-TC filter, HP filter; horizontal axis t.]

Figure 1.6: L1-TC filtering versus HP filtering for the model (1.3). [Four panels: Signal, Noisy signal, L1-TC filter, HP filter; horizontal axis t.]

[9] For both models, the parameters are p = 0.99, b = 0.5 and σ = 5.

1.3.4 How to calibrate the regularization parameters?

As shown above, the trend obtained from L1 filtering depends on the parameter $\lambda$ of the regularization procedure. For large values of $\lambda$, we obtain the long-term trend of the data, while for small values of $\lambda$ we obtain short-term trends. In this paragraph, we attempt to define a procedure which allows the right choice of the smoothing parameter according to our trend extraction needs.

A preliminary remark

For small values of $\lambda$, we recover the original form of the signal. For large values of $\lambda$, we remark that there exists a maximum value $\lambda_{\max}$ above which the trend signal has the affine form:

$$x_t = \alpha + \beta t$$

where $\alpha$ and $\beta$ are two constants which do not depend on the time $t$. The value of $\lambda_{\max}$ is given by:

$$\lambda_{\max} = \left\|\left(D D^\top\right)^{-1} D y\right\|_\infty$$

We can use this remark to get an idea of the order of magnitude of $\lambda$ which should be used to determine the trend over a certain time period $T$. To illustrate this idea, we take the data over the total period $T$. If we want the global trend on this period, we fix $\lambda = \lambda_{\max}$. This $\lambda$ gives a unique trend for the signal over the whole period. If one needs more detail on the trend over shorter periods, we can divide the signal into $p$ time intervals and then estimate $\lambda$ via the mean value of all the $\lambda_{\max}^{i}$ parameters:

$$\lambda = \frac{1}{p}\sum_{i=1}^{p}\lambda_{\max}^{i}$$

In Figure 1.7, we show the results obtained with p = 2 (λ = 1 500) and p = 6 (λ = 75) on the S&P 500 index.

Figure 1.7: Influence of the smoothing parameter λ. [Single panel, 2007 to 2011; legend: S&P 500, λ = 999, λ = 15.]
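The preliminary remark can be sketched numerically as follows. This minimal Python example computes $\lambda_{\max} = \|(DD^\top)^{-1} D y\|_\infty$ for one window and then averages it over $p$ consecutive sub-windows. The function names `lambda_max` and `average_lambda_max` are hypothetical, the band of $DD^\top$ is stored in a dense array for readability, and dropping the leftover samples when the length is not a multiple of $p$ is an assumption of this sketch, not something specified in the text.

```python
def lambda_max(y, order=2):
    """Return ||(DD')^{-1} D y||_inf for the first- or second-difference D."""
    stencil = {1: (-1.0, 1.0), 2: (1.0, -2.0, 1.0)}[order]
    k = len(stencil)
    m = len(y) - order  # number of rows of D
    if m < 1:
        raise ValueError("the window is too short for this difference order")
    rhs = [sum(coef * y[i + j] for j, coef in enumerate(stencil)) for i in range(m)]
    # DD' is banded: entry (i, j) only depends on |i - j| <= order.
    band = [sum(stencil[j] * stencil[j + d] for j in range(k - d)) for d in range(k)]
    a = [[band[abs(i - j)] if abs(i - j) <= order else 0.0 for j in range(m)]
         for i in range(m)]
    # Banded Gaussian elimination; DD' is positive definite, so no pivoting.
    for c in range(m):
        for i in range(c + 1, min(c + order + 1, m)):
            f = a[i][c] / a[c][c]
            for j in range(c, min(c + order + 1, m)):
                a[i][j] -= f * a[c][j]
            rhs[i] -= f * rhs[c]
    nu = [0.0] * m
    for c in range(m - 1, -1, -1):
        s = sum(a[c][j] * nu[j] for j in range(c + 1, min(c + order + 1, m)))
        nu[c] = (rhs[c] - s) / a[c][c]
    return max(abs(v) for v in nu)


def average_lambda_max(y, p, order=2):
    """Average lambda_max over p consecutive sub-windows of equal length."""
    size = len(y) // p  # leftover samples at the end are ignored (assumption)
    windows = [y[i * size:(i + 1) * size] for i in range(p)]
    return sum(lambda_max(w, order) for w in windows) / p


if __name__ == "__main__":
    steps = [0.0] * 4 + [10.0] * 4
    print(lambda_max(steps, order=1), average_lambda_max(steps, p=2, order=1))
```

With `order=2` the function corresponds to the L1-T scheme and with `order=1` to the L1-C scheme. An affine series gives $\lambda_{\max} = 0$ for the L1-T scheme, since its second differences vanish.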

Moreover, the explicit calculation for a Brownian motion process gives us the scaling law of the smoothing parameter $\lambda_{\max}$. For the trend filtering scheme, $\lambda_{\max}$ scales as $T^{5/2}$, while for the mean-reverting scheme, $\lambda_{\max}$ scales as $T^{3/2}$ (see Figure 1.8). Numerical calculation of these powers for 500 simulations of the model (1.3) gives very good agreement with the analytical result for Brownian motion. Indeed, we obtain empirically that the power for the L1-T filter is 2.51 while the one for the L1-C filter is 1.52.

Cross validation procedure

In this paragraph, we discuss how to employ a cross-validation scheme in order to calibrate the smoothing parameter $\lambda$ of our model. We define two additional parameters which characterize the trend detection mechanism. The first parameter $T_1$ is the width of the data window used to estimate the optimal $\lambda$ with respect to our target strategy. This parameter controls the precision of our calibration. The second parameter $T_2$ is used to estimate the prediction error of the trends obtained in the