Adaptive Portfolio Trading Strategies Using Genetic Algorithms

Chapter 12
Adaptive Portfolio Trading Strategies Using Genetic Algorithms

Arthur Rabatin
Rabatin Investment Technology Ltd
11 Grosvenor Place, 5th Floor, Spectra Capital Ltd
London SW1X 7HH
England, United Kingdom
http://www.rabatin.com
[email protected]

Abstract

This paper describes a Genetic Algorithm based implementation of adaptive trading models, specifically designed for trading of multicurrency trading portfolios. The paper describes the aspects of the decision making process that must be incorporated into the trading model in order to accurately simulate the decisions a human portfolio trader is required to make in this position. The paper describes the different types of learning processes for market timing and risk management and how these can be incorporated into the same GA based learning process. The basic concept of a distributed, object oriented learning process is demonstrated, as well as specific fitness value calculations to increase consistency and predictability of portfolio trading performance, which the reader can implement in her own testing and development procedure. Two different concepts of designing the adaptive process are shown, with their effect demonstrated on the model portfolio described below. The performance of such an adaptive system is demonstrated using a diversified foreign exchange trading portfolio, which yields acceptable levels of risk adjusted return, under realistic assumptions of portfolio constraints and transaction costs.

©1999 by CRC Press LLC

12.1 Introduction

Portfolio Trading deals with trading decisions made within the context of a diversified portfolio. The aim of any such trading operation is to achieve an above average return on the available trading capital. More precisely, the aim is to achieve an above average risk-adjusted return that takes into account the accepted risk parameters of the trading desk’s management, shareholders, or even constraints imposed by financial regulators.

Because the requirements for portfolio trading strategies focus on portfolio risk and portfolio return, employing market forecasting systems alone (through traditional methods or AI methods, such as Neural Networks) is not sufficient to develop an autonomous, intelligent trading model. Within a portfolio, every trading decision requires a decision to be made on:
• market selection
• market timing (buy/sell decision)
• accepted price risk
• portfolio allocation
• portfolio risk exposure

All these aspects are highly relevant for the performance of the portfolio. It is important to note that every trading decision always includes assumptions on price risk, portfolio allocation and the resulting portfolio risk exposure, even if the decision has not been made explicitly.

Risk and allocation decisions have a very significant impact on the performance of the portfolio. Although clearly no risk management strategy can turn a losing trade into a profit (or vice versa), the actual performance is the result of a stream of profits and losses. These profits or losses are a function of both the gain/loss in terms of price and the quantity traded. The trading quantity (position size) for each individual portfolio component is a function of the allocation and risk decisions made for that portfolio and instrument. Because each individual portfolio component affects the value of the entire portfolio, a trading decision in one individual market is always a function of the performance in all other instruments traded within the portfolio. As a consequence, the amount gained or lost on an individual trading decision is influenced simultaneously by risk and allocation decisions, and is not independent of other components of the portfolio.

If a trading strategy does not incorporate the risk and allocation decision in addition to the market timing model, it is not possible to replicate the success of the market forecasting model in real-time performance. Whatever the actual performance would be, the decision on the position size would be unrelated to the market timing decision and would turn the forecasting strategy into an unpredictable stream of profits or losses.

We have designed a development framework that allows the development of intelligent trading models which incorporate all aspects of the trading decision into one complete decision making process. This Adaptive Portfolio Trading (APT) framework is an object oriented framework providing the underlying Genetic Algorithm framework, as well as the accounting and reporting functions necessary for any trading model.

We are employing a parallel, distributed learning process based on a master/slave design. Genetic Algorithms lend themselves very well to an object oriented distributed learning process. Since, within a generation, each member of the population is evaluated independently of the other individuals, a network of processes (i.e. slave processes) can evaluate a generation in parallel. The master process has the responsibility of preparing the individual trading model objects for the slave workstations to process, and of performing the genetic operations after the generation is completed.
12.2 Portfolio Trading Learning Process - Overview

An automatic learning process revolves around a basic function of the form

if <condition> then <action>

Applied to a real-world environment, such as portfolio trading, both components of the function usually develop a very complex shape. To derive the <condition>, a large number of environment data have to be analysed and interpreted, plus data the system generates as a result of its own behaviour, i.e. its own trading and performance history. The <action> also represents a large set of possible decisions that the system takes simultaneously. Most importantly, these are the decision to buy, sell, or adjust an existing position, and the decisions on position size, portfolio allocation and position risk.

For an adaptive model to learn appropriate behaviour, a payoff value must be available that allows the system to interpret its actions as success or failure. In many robotics applications, such a payoff value may be immediately available after an action is taken. For portfolio trading systems no immediate feedback is available. Even though clearly we would wish each decision to be as profitable as possible (or at least to avoid losses), we must expect a real-world trading strategy to result in a stream of both profits and losses. In other words, a certain decision strategy might have resulted in a loss today, but it still was the best decision to take because it had the highest long-term expectation. What we should expect is stable, profitable performance over a number of trades, where we can measure the distribution of profits and losses. This essentially implies evaluating performance over a longer time frame.

Because each action taken by the trading model includes a number of different decisions, and because the payoff is measured after a stream of trading decisions, the performance payoff cannot be directly attributed to a single type of action or decision. This is not specific to machine learning based trading models, as every trading decision - systematic or discretionary - is influenced by multiple factors simultaneously. The challenge of real-time portfolio trading is the nature of the constraints the trader (system) is subject to.
While the individual trading decision (including a decision on price risk, portfolio allocation, portfolio risk) is always made for one single market instrument, the constraints - such as overall risk thresholds or exposure limits - are defined for the entire portfolio. It cannot be predicted, if such global thresholds are exceeded, how each individual portfolio component is affected. More precisely, we do not know how the performance of each individual portfolio component is affected by the constraints placed on the portfolio as a whole. The only way to simulate the effect of such constraints is to implement them already in the learning process. In other words, the trading model object used during the learning process must include the risk management parameters placed upon the system during the real-time execution of this strategy.

12.3 Performance Measurement / Fitness Measurement

Choosing the appropriate measure of portfolio performance is also an important step in selecting a fund, trader or trading system for investment purposes. With hindsight, every investor would like to see the highest possible return on the account that could have been achieved under given market circumstances. In reality, for the purpose of selecting a trader or system for investing, the investor must define a level of risk he/she is prepared to take. This risk expectation defines the parameters within which the investor would accept the trader/system to perform. It also serves as a threshold to define when the trader/system does not perform as expected.

The appropriate notion in this context is that of “risk adjusted return”: any rational investor would expect the highest return given his/her level of accepted risk. Risk is typically defined as variance of returns. This concept has become subject to controversy, because it relies on a defined distribution of portfolio value changes as risk measurement. In reality, the distribution of portfolio value changes does not resemble a normal distribution. As a result, a system relying on estimating risk through variance will always grossly under-estimate the real risk the system is exposed to.

Within a GA based learning process, performance measurement is especially relevant because it also yields the fitness value through which the “survival of the fittest” process is implemented.
The choice of fitness value also determines the success of the trading model during cross-validation. Because the GA process has been shown to be a very effective optimisation tool, the success of the learning process must always be interpreted relative to the evaluation on out-of-sample data periods.

In terms of a portfolio trading model, we are therefore interested in the future performance of the portfolio when applying the parameters and rules the system has learned during the training process. Because such performance can never be perfectly predicted, the investor in a trading strategy seeks consistency of performance. The performance benchmark must therefore measure this consistency.

To create a trading model that adapts without human interference, the performance benchmark must also measure the absolute level of performance, relative to the expected return and the accepted risk. Because the accepted risk is largely a user-defined value (it is a result of each investor’s preference), the trading model must balance return with this risk level. For evaluating a trading system’s performance consistency in an automatic, self-learning process, the fitness must take into account the time structure of performance (consistency of performance) as well as the absolute level of performance.

The target function of the learning process applied in our trading models is based on a user-defined Return Path (RP). This return path is a monthly or quarterly range of expected returns within which the system ideally performs. The fitness of the trading model is measured by the error of tracking this target range, the Return Path Error (RPE). Formally, the RPE is defined as

RPE = \frac{1}{N} \sum_{n=0}^{N-1} e_n^2

where:
N is the number of periods (either calendar quarters or calendar months),
n is the nth calendar period (indexed between 0 and N-1),
e_n is the actual tracking error for the nth period.

e_n is defined as follows:
if (r_n > RP+): e_n = (r_n - RP+) W,
if (r_n < RP-): e_n = (r_n - RP-),

where r_n is the calculated actual percentage return of the portfolio for the nth period, RP+ is the upper limit of the return path target range, RP- is the lower limit of the return path target range and W is a weighting applied to smooth the effect of upside errors of the portfolio (typically 0.3 ≤ W ≤ 1.0; this model uses an error weighting of 0.4). Neither case applies to a return inside the target range, so such a period contributes no error.

The GA process seeks to minimise the RPE, which ideally equals zero, when the system performs completely within the desired return path. The advantage of using RPE as performance benchmark is that it emphasises and measures the consistency of performance, in that it matches the user’s expectation on return with any risk thresholds attached to the portfolio. If the return expected from the model is not compatible with the risk constraints placed upon the portfolio, this discrepancy can be detected already during the learning process. Either the portfolio constraints or the performance expectations will then have to be adjusted.

12.4 APT Object Oriented Distributed Parallel Processing

The underlying GA library, Evolving Programming Library (EPL), is a domain independent object oriented (OO) GA framework implemented in standard C++. The specification for a problem domain is achieved through deriving classes from the base class collection in the GA framework, implementing the desired functionality by overriding the appropriate virtual functions.

The basic class of the EPL framework is CUserData, which - in the application - represents the problem domain. CUserData is implemented as an abstract base class, requiring the derived class to override exactly those member functions which are specific to the problem domain. Because of this clear OO design, the EPL handles all functions relating to the GA process without any domain specific adjustment to the GA code itself. The main virtual public functions of CUserData are:

    void CUserData::Copy ( const CUserData& _That )
    // a virtual copy function. Avoids assignment operator overloading

    CUserData* CUserData::Clone ( void )
    // returns a new object of correct run time type as copy of this object
    // derived function contains return ( new CUserData_derived ( *this) )

    void CUserData::Write (...)
    // writes the contents of the object into a binary stream

    void CUserData::Read (...)
    // reads the contents of this object from a binary stream

    TFitnessvalue CUserData::FitnessFunction ( void )
    // performs the actual fitness calculation and returns fitness measurement
    // the return type TFitnessvalue is currently implemented as a typedef of double

    void CUserData::RegisterVar ( ... )
    // registers the variables for optimisation with the underlying GA engine.
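As an illustration of what a FitnessFunction() override might compute, the Return Path Error of Section 12.3 can be sketched as two free functions. This is a minimal sketch, not the APT code: the names TrackingError and ReturnPathError are hypothetical, and two points are assumptions rather than statements from the chapter: a return inside the target range is given an error of zero, and an empty return series yields an RPE of zero.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Tracking error e_n for one period (hypothetical helper name).
// Assumption: a return inside [rpLower, rpUpper] contributes zero error.
double TrackingError(double r, double rpLower, double rpUpper, double w) {
    if (r > rpUpper) return (r - rpUpper) * w;  // upside miss, damped by W
    if (r < rpLower) return r - rpLower;        // downside miss, full weight
    return 0.0;
}

// Return Path Error: mean of the squared per-period tracking errors.
double ReturnPathError(const std::vector<double>& returns,
                       double rpLower, double rpUpper, double w) {
    if (returns.empty()) return 0.0;  // assumption: no periods, no error
    double sum = 0.0;
    for (double r : returns) {
        const double e = TrackingError(r, rpLower, rpUpper, w);
        sum += e * e;
    }
    return sum / static_cast<double>(returns.size());
}
```

With a target range of 2% to 4% per period and W = 0.5, the returns 1%, 3% and 6% give tracking errors of -1, 0 and 1, so the RPE is 2/3. Because W is below 1, an overshoot is penalised less than an undershoot of the same size.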

The implementation of such copy, cloning and stream functions does not impose a significant overhead on any application, because these functions are normally required in some form for a standard OO architecture. The RegisterVar(...) function contains the registration of each variable prepared for optimisation with the underlying GA engine. Because the EPL framework defines the fitness function as a pure virtual function, the GA process is completely shielded from the implementation of the problem domain. In OO terms, the implementation of the problem domain is entirely encapsulated.

The ability to write to and read from binary streams is the basic requirement for the distributed learning process. By using streams, CUserData objects (i.e. trading model objects) can be exchanged between applications and workstations within a network. Because the fitness function is already declared within the base class, a client process can read the object from a stream (e.g. a file) and execute the fitness function, independent of the application containing the GA engine. This represents the basic design of a distributed parallel process based on a master/slave architecture.

The APT framework extends the EPL library by implementing a base class trading model, CStrategyTmpl (short for class Strategy Template, although not implemented as a template in the C++ sense).
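The stream-based exchange between master and slave processes described in this section can be sketched as follows. This is a hedged illustration only: CUserDataSketch, CToyModel and EvaluateFromStream are hypothetical names, the parameter types of Write and Read (elided as "..." in the chapter) are assumed to be standard C++ streams, Clone is assumed to be const, and RegisterVar is omitted because its signature is not given.

```cpp
#include <cassert>
#include <istream>
#include <memory>
#include <ostream>
#include <sstream>

typedef double TFitnessvalue;  // the chapter states this typedef

// Hypothetical stand-in for the abstract base class described in the text.
class CUserDataSketch {
public:
    virtual ~CUserDataSketch() {}
    virtual CUserDataSketch* Clone() const = 0;
    virtual void Write(std::ostream& out) const = 0;  // assumed parameter type
    virtual void Read(std::istream& in) = 0;          // assumed parameter type
    virtual TFitnessvalue FitnessFunction() = 0;
};

// Toy problem domain: one parameter x, fitness is the squared distance from 3.
class CToyModel : public CUserDataSketch {
public:
    explicit CToyModel(double x = 0.0) : x_(x) {}
    CUserDataSketch* Clone() const override { return new CToyModel(*this); }
    void Write(std::ostream& out) const override {
        out.write(reinterpret_cast<const char*>(&x_), sizeof x_);
    }
    void Read(std::istream& in) override {
        in.read(reinterpret_cast<char*>(&x_), sizeof x_);
    }
    TFitnessvalue FitnessFunction() override {
        return (x_ - 3.0) * (x_ - 3.0);
    }
private:
    double x_;
};

// "Slave" side: restore an object from a stream and evaluate it. Only the
// base class interface is needed, so this code knows nothing about the domain.
TFitnessvalue EvaluateFromStream(std::istream& in, CUserDataSketch& scratch) {
    scratch.Read(in);
    return scratch.FitnessFunction();
}
```

A master process would call Write on each individual of a generation, and a slave would pass the matching input stream and a scratch object to EvaluateFromStream. Writing raw bytes as above is only portable between machines with the same floating point layout, which is a simplification of this sketch.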

The CStrategyTmpl class provides a definition for the CUserData::FitnessFunction() member function, as well as for other member functions to implement the basic framework for the trading model to execute a strategy. This mainly includes the accounting functions as well as the database function. The trading model data access is implemented using a datafeed object that simulates a real-time datafeed by accessing the defined database and returning data to the trading model based on the defined time frames.

The CStrategyTmpl base class provides virtual functions for any specific trading model implementation, which are called by the framework during execution of a trading model. Specifically these are functions called by the framework at specific points in time, such as begin of trading during a specific period, end of trading per period, end-of-day procedures. Currently implemented public virtual event functions:

    virtual void CStrategyTmpl::Before_Portfolio_AllMarkets ( void )
    // called as trading model initialisation

    virtual void CStrategyTmpl::Daily_BeginOfDay_AllMarkets ( void )
    // called at begin of each trading period ( trading day )

    virtual void CStrategyTmpl::Daily_BeginOfDay_EachMarket ( int _MarketIndex )
    // called only for markets which are trading in this period ( today )

    virtual void CStrategyTmpl::Daily_AllMarkets ( void )
    // called for all markets in a particular trading period ( day )

    virtual void CStrategyTmpl::Daily_EachMarket ( int _MarketIndex )
    // called only for markets which are trading in this period ( day )
    // this function contains the main routines for executing the trading
    // model during training and application process

    virtual void CStrategyTmpl::Daily_EndOfDay_EachMarket ( int _MarketIndex )
    // called for active markets at the end of the trading period ( day ) for each
    // market

    virtual void CStrategyTmpl::Daily_EndOfDay_AllMarkets ( void )
    // called at the end of each trading period ( day )
    // the base class contains mainly accounting end-of-day functions, such as
    // portfolio evaluation and reporting functions

    virtual void CStrategyTmpl::After_Portfolio_AllMarkets ( void )
    // called after the trading model is completely evaluated

These functions are called by the framework in connection with the datafeed object. The APT framework controls these functions through a pointer to the trading model object, typed as the base class CStrategyTmpl. Any specific trading model implementation will override these event functions to implement the specific functionality required. This design makes the actual trading model implementation largely independent of the functionality provided by the framework.

In addition to the GA based reporting functions of top and average fitness, and learning process performance, the APT framework also includes a standardised portfolio reporting object using standard portfolio performance measurements. Trading models derived from CStrategyTmpl can be designed independently of the GA engine and can also contain non-optimised parameters or rule sets. This article concentrates on the implementation of a GA based learning process.

The distributed architecture is currently implemented based on file streams, which are locked/unlocked by the current process, which is either a slave or the master process, performing the genetic operators. Although the file based process creates a larger overhead

12.5 Learning Process Implementation

The decision making process of the adaptive trading models simulates the human decision making process by learning behaviour patterns that are matched against patterns the system detects in the environment data. The behaviour of the trading model is therefore a result of events that take place in the environment, which comprises both the market data and its own accounting database.

It is the purpose of the system’s training process to learn to detect what constitutes an event and what is the appropriate behaviour in response to it. Because behaviour in the context of a trading model is always a decision to sell, to buy, or to do nothing (keep the current position unchanged) in a particular financial instrument, it also means that risk for the portfolio may either be increased (by establishing a new net position in a market) or decreased (by reducing an open net position in any market) [1].

[1] Portfolio Risk here refers to the aggregated risk of all positions, not the variance of the portfolio returns itself.

The trading model is designed to retrieve information from both external data (market prices and other data) and from its own trading performance to decide on its trading decisions. These trading decisions (and subsequently decisions on any open positions) are also subject to the risk threshold placed upon the system by the trading manager or system supervisor.

Every trading decision has four components that simultaneously define the portfolio performance: market timing, price risk, portfolio allocation and portfolio risk. Below, the learning process for each of the components, and the integration into a final trading decision, are described.

12.5.1 Market Timing Decision

The process of mapping environment data patterns to a trading behaviour pattern is implemented through two layers of objects: a layer of objects performing calculations on data (Calculation Node Objects) and a layer of objects interpreting the results of calculations as events and mapping these events to possible decisions.

Calculation Node Objects (CNOs) are an array of objects that retrieve data and perform mathematical, statistical and logical operations on these data, returning a set of numerical values that will be interpreted by another layer as events. It is through the GA process that it is decided for each CNO which data are retrieved and which operator is applied. The calculation that is performed within each CNO can be either very simple (such as a logical comparison {e.g. “>” or “=”} or a simple arithmetic operation {e.g. “+”, “/” ...}), or it can involve complete statistical calculations, such as the historical volatility of a retrieved time series. Boolean values (false/true) are represented as numerical values (0 / 1).

CNOs are not limited to retrieving input only from an external database. The output of each CNO can also be selected in the learning process as input for calculations. This allows the trading model to detect complex patterns by connecting a large number of CNOs that each perform very simple calculations. In an array of (n > 1) CNOs, the nth CNO can use the output of (CNO[0] ... CNO[n-1]) as input. The actual number of Calculation Node Objects is a matter of trading model design and is largely dependent on available machine time, as increased complexity also dramatically increases the time required to train these trading models. The advantage of using an array of CNOs is the flexibility that otherwise would be restricted by a fixed length Genetic Algorithm.

Event Node Objects (ENOs) are an array of objects that interpret output values of CNOs by learning to map the CNO output to a logical value of true/false, i.e. whether or not an event has occurred. Because the output of each CNO is a fixed structure of values that is always assigned valid data (independent of the calculation performed within a CNO), ENOs can always interpret the result of a calculation in terms of an event occurring or not. Event Node Objects also learn to map a particular CNO output set to a particular type of decision - to buy, sell or do nothing in a market.

After both layers of objects have been constructed, the trading model possesses a pattern to interpret data it retrieves from the external database in terms of possible trading decisions.
Each Event Node Object that - for the actual data - reports that an event exists also returns a signal for a decision: to buy, to sell or to remain unchanged in the current position (if any). The trading model then chooses the decision that occurs most frequently among the ENO results as the final decision at the moment the calculation is performed.

The chart below illustrates the process described above. N/D indicates a decision result of “nothing done”, i.e. current position unchanged.

[Figure: External Input Data (e.g. Market Prices) → CNO[0], CNO[1], CNO[2], CNO[3], ..., CNO[N] → Interpreting Output of Calculation Node Objects: Is Event (Yes/No)? → ENO[0], ENO[1], ENO[2], ..., ENO[N], each signalling Buy / Sell / N/D → Interpreting Output of Event Node Object Array to retrieve a single decision signal → Final Market Timing Decision: Buy / Sell / Nothing Done (Position Unchanged)]

Figure 12.1 Buy/Sell decision flowchart.
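
The two-layer evaluation of Section 12.5.1 can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the APT implementation: the types CalcNode and EventNode, the three operators, the single-threshold event test and the tie-breaking rule (ties and the absence of any event both yield "nothing done") are all hypothetical choices that the chapter does not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Decision { Buy, Sell, NothingDone };

// Hypothetical CNO: applies one operator to two inputs. Inputs index into a
// pool that holds the external data first, then the outputs of earlier CNOs.
struct CalcNode {
    enum Op { Add, Subtract, Greater } op;
    std::size_t lhs, rhs;
};

// Hypothetical ENO: an event exists when the selected CNO output exceeds a
// threshold; the node then votes for one decision.
struct EventNode {
    std::size_t cno;   // index of the CNO output to interpret
    double threshold;
    Decision vote;
};

// Evaluate the CNOs in array order and return their outputs. A node can only
// reference external data or earlier CNOs; anything else throws via at().
std::vector<double> RunCalcNodes(const std::vector<double>& external,
                                 const std::vector<CalcNode>& nodes) {
    std::vector<double> pool(external);
    for (const CalcNode& n : nodes) {
        const double a = pool.at(n.lhs);
        const double b = pool.at(n.rhs);
        double out = 0.0;
        switch (n.op) {
            case CalcNode::Add:      out = a + b; break;
            case CalcNode::Subtract: out = a - b; break;
            case CalcNode::Greater:  out = (a > b) ? 1.0 : 0.0; break;
        }
        pool.push_back(out);
    }
    const std::ptrdiff_t skip = static_cast<std::ptrdiff_t>(external.size());
    return std::vector<double>(pool.begin() + skip, pool.end());
}

// Choose the decision signalled most often by the ENOs that detect an event.
Decision MajorityDecision(const std::vector<double>& cnoOut,
                          const std::vector<EventNode>& events) {
    int buy = 0, sell = 0, none = 0;
    for (const EventNode& e : events) {
        if (cnoOut.at(e.cno) <= e.threshold) continue;  // no event detected
        if (e.vote == Decision::Buy) ++buy;
        else if (e.vote == Decision::Sell) ++sell;
        else ++none;
    }
    if (buy > sell && buy > none) return Decision::Buy;
    if (sell > buy && sell > none) return Decision::Sell;
    return Decision::NothingDone;  // assumption: ties and no events
}
```

In the GA setting, the fields of each CalcNode and EventNode would be the variables registered for optimisation, so that the learning process selects the data, the operators and the event-to-decision mapping.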

12.5.2 Risk and Portfolio Allocation Decisions

Risk and allocation decisions are learned differently by the trading model, as these decisions are not “event-driven”, but only depend on the existence of a position in a market.

Learning market timing behaviour differs from the other components in that we have little knowledge on how buy and sell decisions should be made. The system can and should therefore have a large degree of freedom to develop these decision making rules by learning to develop appropriate adaptive behaviour patterns. For risk and allocation decisions, we do however have some knowledge about possible rules (e.g. restricting risk exposure, diversifying portfolios), and we especially have certain constraints that we want to enforce on the trader / the system. Consequently, the trading system can have freedom to decide on allocation strategies and risk exposure within a portfolio, but it must do so while observing the global risk and allocation thresholds.

The learning process for risk and allocation decisions is a rule-based learning process that uses a “bottom-up” approach by combining simple rules (“sub-rules”) using mathematical and logical operators, resulting in a more complex decision rule, onto which certain risk thresholds are applied by the trading model. For the GA process, each sub-rule and each operator (mathematical or logical) is recognised as a simple integer value indexing the rule within the risk management object of the trading model. The GA learning process of the trading model creates a rule which is a formula consisting of data parameters (i.e. calculation results of the sub-rules) and the operators. Although the structure of the formula is limited in its flexibility compared to the market timing decision, the trading model can still create a relatively complex structure of rules very different from the simple components it is based on.

Although we will focus here on the risk and allocation decision to be made as part of a new trading decision (resulting from a market timing signal), the same calculations are repeated at the end of each trading period (mostly end of trading day), in order to adjust existing open positions. These adjustments are particularly important because they make sure that the portfolio will always be maintained within the desired risk thresholds. Our testing has further found that the constant application of risk thresholds is one of the most significant contributing factors to creating a consistent and predictable performance pattern, largely independent of the actual market timing strategy used.
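
The integer encoding of sub-rules and operators described above might look as follows. This is a minimal sketch under assumptions: the MarketStats fields, the alternating genome layout (sub-rule, operator, sub-rule, ...), the strict left-to-right evaluation and the function name EvaluateRule are all hypothetical, since the chapter does not give the structure of the formula.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical inputs available to the sub-rules.
struct MarketStats {
    double volatility;    // e.g. standard deviation of price changes
    double tradingRange;  // high minus low over the lookback window
};

typedef std::function<double(const MarketStats&)> SubRule;
typedef std::function<double(double, double)> Operator;

// The genome is a list of integers that index the two tables.
// Layout assumed here: subRule, op, subRule, op, subRule, ...
// evaluated strictly left to right.
double EvaluateRule(const std::vector<int>& genome,
                    const std::vector<SubRule>& subRules,
                    const std::vector<Operator>& operators,
                    const MarketStats& stats) {
    if (genome.size() % 2 == 0)
        throw std::invalid_argument("genome must have odd length");
    // at() rejects indices outside the tables, including negative ones.
    double value = subRules.at(static_cast<std::size_t>(genome[0]))(stats);
    for (std::size_t i = 1; i + 1 < genome.size(); i += 2) {
        const double rhs =
            subRules.at(static_cast<std::size_t>(genome[i + 1]))(stats);
        value = operators.at(static_cast<std::size_t>(genome[i]))(value, rhs);
    }
    return value;
}
```

Because the GA only ever sees the integer list, crossover and mutation can recombine rules without any knowledge of what the sub-rules calculate. The risk thresholds mentioned in the text would be applied to the returned value by the caller.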

The trading model uses three levels of risk calculation:
- expected risk (calculating an estimation of the risk a certain trading position carries)
- accepted risk (a risk threshold which the trading model defines for each trading position)
- portfolio risk threshold (the overall constraint that the system supervisor or trading manager puts on the entire portfolio and that must be observed by the trading model).

The expected risk is used by the trading model to make an initial decision on the size of a trading position, similar to a human trader. Through the learning process described below, the model is trained to improve its risk forecasting ability.

The accepted risk is a threshold that is associated with a given market position, and is calculated by the trading model as part of the learning process. This is not necessarily a “worst-case-scenario” threshold. More likely, the trading model uses risk thresholds to actively manage the size of open positions in a market to create a less volatile equity curve.

Portfolio risk thresholds are defined by the supervisor of the system, the trading manager. The trading manager normally has a clear view of the amount of risk he/she is prepared to accept for a given portfolio. This however is a threshold for the aggregated risk of all positions, and the trading model learns to translate this global constraint into position limits for each individual component of the portfolio.

The risk calculation for a portfolio is performed in three steps: price risk, portfolio allocation and portfolio risk.

12.5.3 Price Risk Calculation

Price risk is associated with the market in which a position is taken and is independent of the size of a trading position. It is however closely linked to the type of market timing strategy developed by the trading model, since the expected variance of prices is a function of the time horizon of an investment [1].

[1] In absolute terms, the variance of prices is dependent on the time horizon, not the variance itself. Assuming a normal distribution of price changes, both a short term trader and a long term trader face the same probability of a 1-σ event in the market. In absolute terms (and in percentage of the portfolio) these will be different values.

The rules which the trading model can learn to use for calculating risk take as input:
• price volatility (in absolute terms over the previous n days)
• price volatility (in terms of standard deviation)
• previous trading range (price high/low over the previous n days)
• maximum / minimum price changes (over the previous n days)
• average/total/minimum/maximum price risk calculations of other components of the portfolio
• price and volatility correlation to other components of the portfolio

The “sub-rules” are combined in a rule structure developed by the GA process that results in a risk estimate for a position in that market. Initially this risk estimate is also the maximum accepted price risk for the position. In trading terms, that price risk is the “Stop-Loss” level at which a position is closed out, because the market forecast implied by the decision of the market timing strategy is accepted to be wrong. After the initial position is established, trading models re-calculate price risk estimates every time the trading model is updated with new prices. However, the trading model does enforce a cut in the position size if the estimated price risk exceeds the accepted risk. The accepted risk is only allowed to increase by a limited margin when applied to an existing open position.

12.5.4 Portfolio Allocation Decision

Portfolio allocation decides the percentage share of the overall portfolio value that is allocated to one portfolio component. Portfolio allocation within asset investment strategies is a defined process which could easily be implemented through any GA (or other) learning algorithm. The decision for a trading portfolio is more complex, because the system is not invested in all markets at the same time. The trading model must learn how to allocate a share of the portfolio to a new position, while also taking into account existing positions and any portfolio limits placed on the system. For any given portfolio component, the allocated share of the portfolio is calculated by the trading model as

$$A_i = \frac{F_i}{\sum_{n=1}^{N} F_n}$$

where
A_i ..... allocation for the ith market
N ..... number of markets available to the portfolio
F_i ..... the function expressing the rule used to calculate the allocation
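The normalisation in the allocation formula can be sketched in a few lines of Python. This is a minimal illustration only: the function name, the market scores and the assumption that every rule output F_i is non-negative are ours and are not taken from the chapter.

```python
from typing import Dict


def normalise_allocations(raw_scores: Dict[str, float]) -> Dict[str, float]:
    """Return A_i = F_i / sum_n F_n for every market in the portfolio."""
    if any(score < 0 for score in raw_scores.values()):
        # Assumption: rule outputs are non-negative scores.
        raise ValueError("rule outputs are assumed to be non-negative")
    total = sum(raw_scores.values())
    if total == 0:
        # No market receives capital when every rule returns zero.
        return {market: 0.0 for market in raw_scores}
    return {market: score / total for market, score in raw_scores.items()}


# Hypothetical rule outputs F_i for four markets.
scores = {"GBP-USD": 2.0, "USD-CHF": 1.0, "AUD-USD": 0.0, "USD-CAD": 1.0}
allocations = normalise_allocations(scores)
```

Because every share is divided by the same total, the allocations always sum to one, and a rule output of zero yields a zero share for that market.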

Each rule developed by the GA learning process is based on a combination of sub-rules which take as input:
• price data and time series statistics of the given market
• time series statistics in relation to the other markets in the portfolio (cross-correlation and relative strength)
• relative value of each market's performance within the portfolio

The result of each calculation is always a re-calculation of the allocation for the entire portfolio. This makes sure that the trading model strictly observes any portfolio constraints, and it also ensures that the trading model can re-balance the entire portfolio after the addition of a new position. Since performance consistency is typically the most important criterion for evaluating trading models, the automatic re-allocation ensures the highest possible level of diversification within the portfolio.

Market Selection is a special case of the portfolio allocation decision, as it is also possible for the system to allocate a share of zero to a market, thus effectively excluding that market from the portfolio. When setting up the adaptive trading model, a number of markets are made available to the system. However, the trading model can at every stage of the calculations decide to reject a certain market, either by not creating a buy or sell decision or by allocating to that market a share of the portfolio that is too small to be traded.

12.5.5 Portfolio Risk Decision

After the trading model estimates the price risk associated with a position and calculates the share of the portfolio allocated to a particular trade, it can calculate the actual quantity to trade, i.e. the size of the trade and of the open position. In learning to perform this calculation, the trading model uses the calculated price risk, the allocated equity and all input data used by previous calculations to create the decision making rule for the trading quantity. The decision on the size of each position has been shown to be among the most relevant decisions determining the consistency of portfolio performance. In general terms, the position size is a function of the available (allocated) equity for a given market and the price risk associated with the market (both of which the trading model learns to calculate or estimate), i.e.

$$U_i = f(A_i, P_i, R)$$

where U_i is the number of trading units (position size) for the ith market within the portfolio, P_i is the calculated price risk for the ith market, A_i is the calculated allocation for the ith market, and R is the portfolio risk accepted by the system, expressed as the percentage of total capital accepted to be at risk. The above function can also be re-written as

$$R_{actual} = f(A_i, P_i, U_i)$$

In other words, the actual percentage of the portfolio at risk (R_actual) is a function of a market's price risk, the allocated share of the portfolio and the number of units bought or sold in the market. For the purpose of the learning process, the R_actual value should be smaller than or equal to the accepted portfolio risk value, R.
The learning process of the trading model therefore first finds a percentage portfolio risk value R that is compatible with the global parameters of the portfolio, and then, using the other two parameters of the function, A_i and P_i, calculates the actual number of trading units (U_i) that would create the desired exposure.
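In the chapter the function f is learned by the GA and its form is not given. Purely as an illustration, the sketch below assumes one simple fixed form: the risk budget of a market is R times its allocation times total capital, and the position size is the largest whole number of units whose price risk fits into that budget. The function names and the numbers are hypothetical.

```python
import math


def position_size(allocation: float, price_risk: float,
                  portfolio_risk: float, capital: float) -> int:
    """Units U_i whose total price risk fits the market's risk budget.

    Assumed form (not from the chapter): budget = R * A_i * capital.
    """
    if price_risk <= 0:
        raise ValueError("price risk per unit must be positive")
    risk_budget = portfolio_risk * allocation * capital
    return math.floor(risk_budget / price_risk)


def actual_risk(units: int, price_risk: float, capital: float) -> float:
    """Share of total capital at risk in one market (its part of R_actual)."""
    return units * price_risk / capital


capital = 1_000_000.0   # hypothetical portfolio value
units = position_size(allocation=0.25, price_risk=0.015,
                      portfolio_risk=0.02, capital=capital)
units_wider_stop = position_size(allocation=0.25, price_risk=0.03,
                                 portfolio_risk=0.02, capital=capital)
```

Under this assumed form, doubling the per-unit price risk roughly halves the number of units, and because the allocations sum to one, the risk summed over all markets cannot exceed R.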

The calculation of the trading size is the link between the market timing behaviour developed by the system and the required risk management strategy. For every market timing decision made by the system, it will calculate the associated market risk (per unit price risk) and the associated portfolio allocation. It is through the variation of the trading size (position size) that the system performs a trade-off between higher (lower) per unit price risk and smaller (larger) size of the trading position. Such an approach to managing the portfolio risk is in our view an essential component of a consistently developed trading strategy. It removes the uncertainty that is normally associated with profitable open positions: whether profits should be kept “running” or positions liquidated in order to protect open profits from further risk. In order to develop consistent performance behaviour, the trading model must have the ability to maintain constant risk exposure during changes in portfolio composition and market events. This process enables the trading model to consistently balance the market price risk and the portfolio risk by changing the actual position size it holds in any market. Increased risk per unit traded (price risk) can be matched by a decrease in the number of units traded and vice versa. As with the calculation of portfolio allocation, portfolio risk management is always performed over the entire portfolio. Therefore, a change in one portfolio component can be matched by shifting (i.e. reducing) exposure in other markets. This allows the trading model to re-balance the portfolio either when a new market position is to be taken or when the daily mark-to-market process is performed.
12.6 Adaptive Process Design

12.6.1 Dynamic Parameter Optimisation

Since the trading system is designed to learn behaviour patterns, the optimisation of fixed numerical parameters (such as the number of days over which historical volatility is calculated) should be avoided. Not only is such a parameter unlikely to be usable across different markets, it is also very dependent on the actual distribution of prices within the training period. It must therefore be expected that the optimum parameter changes dramatically when market conditions change. Using statically optimised parameters leads to inflexible curve-fitting which, in our previous research, has been shown not to hold up in subsequent out-of-sample tests. The concept used by these trading models is that of “dynamic parameters”, i.e. parameters for which the value is not directly optimised, but for which the model learns to develop rules for calculation. This is done within the Calculation Node Objects (CNOs). For any calculations that cannot be developed as rules within the trading model, a hard coded time frame of 200 data periods is used as the starting point for calculations.

Example: The system may learn to use a moving average of market prices as part of an estimate of near-term market direction. For this moving average, a parameter is needed for the number of days over which the calculation is performed. Rather than optimising this parameter directly (within a given range of possible values), the system learns to calculate the value based on another calculation, e.g. market volatility (calculated by a different Calculation Node Object). The system retrieves the current market volatility reading and the min / max values for that calculation (over the time span calculated by the other CNO). The last reading is then expressed as a ratio within the historical min/max values. If min = 8 and max = 20, a current volatility value of 15 would result in a ratio value of 0.58. The general formula is: Ratio = (Current - Min) / (Max - Min). This ratio is the current value expressed as a percentage of the range. This percentage value is then applied to the min/max range of possible moving average values used by the system. Assuming we define possible moving average parameters between (10;100) periods, the actual moving average parameter used in this calculation would be (100 - 10) × 0.58 ≈ 52, i.e. a 52-period moving average.
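The worked example can be reproduced with a short Python sketch. The function names are ours; `scaled_span` follows the arithmetic exactly as printed in the example, and the variant that adds the lower bound of the range is an assumption of ours, not something stated in the text.

```python
def range_ratio(current: float, lo: float, hi: float) -> float:
    """Ratio = (Current - Min) / (Max - Min)."""
    if hi <= lo:
        raise ValueError("max must be greater than min")
    return (current - lo) / (hi - lo)


def scaled_span(ratio: float, lo: float, hi: float) -> float:
    """Ratio applied to the width of the parameter range, as in the example."""
    return (hi - lo) * ratio


ratio = range_ratio(current=15, lo=8, hi=20)           # 7 / 12
as_printed = scaled_span(round(ratio, 2), lo=10, hi=100)

# Assumption (not in the text): adding the lower bound keeps the result
# inside the (10;100) range even when the ratio is close to zero.
with_lower_bound = 10 + as_printed
```

With the numbers from the example, the ratio rounds to 0.58 and the printed arithmetic gives about 52 periods; the variant with the lower bound added would give about 62.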
Note that this resulting value changes every time the underlying calculation data (here: the volatility reading) changes. The benefit of dynamic parameter optimisation is that the learning process focuses on developing calculation rules rather than optimising parameters. Such rules can be applied to all markets, since they adjust automatically to new data by re-calculating the actual parameters used for the calculation of events.

12.6.2 Cross-Validation

Cross-validation is an important tool in the evaluation of machine learning processes, dividing the available time series into periods for training and periods for testing, during which the performance of the trained parameters is validated. An adaptive process uses a continuous training and application process to learn behaviour and apply this behaviour to new data. In the Foreign Exchange portfolio demonstrated below, the following setup of training/evaluation periods is used:

| Period | Training (Learning) From | Training (Learning) To | Testing (Application) From | Testing (Application) To |
|---|---|---|---|---|
| Period A | 09-05-1990 | 04-06-1993 | 07-06-1993 | 08-11-1994 |
| Period B | 09-05-1990 | 08-11-1994 | 09-11-1994 | 12-04-1996 |
| Period C | 05-09-1991 | 12-04-1996 | 15-04-1996 | 17-09-1997 |

Table 12.1 Setup of Training and Evaluation Periods for the Foreign Exchange portfolio.

The choice of 3 adaptation periods is largely arbitrary and depends mostly on the available machine time for performing the testing. A higher frequency of re-training and adaptation would create a more continuous adaptive behaviour. We have opted for 3 periods to keep the training of the portfolio within the limits of our available computer resources. Each test period uses a defined training period for learning. The learning algorithm starts with a random initialisation (i.e. a state of no knowledge and random behaviour). An initial training period of 800 trading days (9 May 1990 to 4 June 1993) is assigned to the first testing period. During this period the system learns to develop basic rules and already eliminates a large number of consistently unsuccessful behaviour patterns. 800 trading days as an initial period represent about 1/3 of the database available. After that, a maximum of 1200 trading days is allowed for the training period, to create similar training environments for each testing phase.
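The period layout described above (a growing training window that is capped at 1200 trading days) can be sketched with day indices instead of calendar dates. The function name and the test boundaries at days 1170, 1540 and 1910 are hypothetical; only the 800-day initial period and the 1200-day cap come from the text.

```python
from typing import List, Tuple


def walk_forward_windows(test_starts: List[int], n_days: int,
                         max_train: int = 1200) -> List[Tuple[range, range]]:
    """Return (training days, testing days) pairs as index ranges.

    Each training window ends where its testing window begins and reaches
    back at most max_train trading days.
    """
    windows = []
    for k, start in enumerate(test_starts):
        end = test_starts[k + 1] if k + 1 < len(test_starts) else n_days
        train_from = max(0, start - max_train)
        windows.append((range(train_from, start), range(start, end)))
    return windows


# Hypothetical day indices: the first test period starts after 800 days.
windows = walk_forward_windows([800, 1170, 1540], n_days=1910)
for train, test in windows:
    print(f"train {train.start}-{train.stop - 1}, test {test.start}-{test.stop - 1}")
```

With these illustrative boundaries the first two training windows start at day 0, while the third is trimmed to its most recent 1200 days, mirroring the later training start date of Period C in Table 12.1.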

12.6.3 Top Fitness / Closest Fitness Behaviour Patterns

After the training process, the trading model selects one single rule system to be applied to new data. Typically this is the rule set which resulted in the optimum theoretical performance during the training phase. At the beginning of the application phase of each new period, the trading model will adapt its behaviour according to the rules it has learned during the training process. Since every learning process is an optimisation process, the system always carries the risk of over-optimisation during the learning process. Using over-optimised behaviour patterns on new data is very likely to result in undesirable, negative performance. We have therefore developed a concept of not using the optimised rule set (the “top-fitness” rule set) for the actual trading period, but of selecting a rule set which is likely to be more robust in its real-time performance than a highly optimised behaviour: the “closest-fitness” rule set.

To calculate the closest-fitness rule set, the trading model uses an internal threshold to find a range of behaviour rules which resulted in acceptable performance during the training period, including the best strategy, i.e. the top-fitness rule set. Within this group of rule sets, the system then tries to find a smaller group with similar performance results. If such a group is found, the system selects the best rule set of this group to adapt to for the new period. This rule set is referred to as the closest-fitness rule set. It may be the case that the selected closest-fitness rule set and the top-fitness rule set are identical, but more often this is not the case. Because the closest-fitness rule set is not as highly optimised as the top-fitness behaviour, we have found a very significant increase in the consistency of performance when comparing the training results with the application periods.
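The text does not specify how the internal threshold or the group of similar rule sets is computed, so the following Python sketch is only one possible reading. The acceptance fraction, the similarity tolerance, the minimum group size, the fallback to the top-fitness rule set and the fitness values are all assumptions made for illustration; fitness values are assumed to be positive.

```python
from typing import List, Sequence


def closest_fitness_index(fitness: Sequence[float], accept: float = 0.5,
                          similarity: float = 0.05, min_group: int = 3) -> int:
    """Pick the best member of the largest group of similar, acceptable rule sets."""
    top = max(range(len(fitness)), key=lambda i: fitness[i])
    floor = accept * fitness[top]                  # acceptance threshold
    ranked = sorted((i for i in range(len(fitness)) if fitness[i] >= floor),
                    key=lambda i: fitness[i], reverse=True)
    best_group: List[int] = []
    for a in range(len(ranked)):
        leader = fitness[ranked[a]]
        # Rule sets ranked at or below the leader and within the tolerance.
        group = [i for i in ranked[a:] if leader - fitness[i] <= similarity * leader]
        if len(group) > len(best_group):
            best_group = group
    if len(best_group) >= min_group:
        return best_group[0]                       # best rule set of the group
    return top                                     # assumed fallback: top fitness


# Hypothetical training fitness values for six candidate rule sets.
training_fitness = [245.7, 180.0, 176.0, 175.0, 172.0, 90.0]
chosen = closest_fitness_index(training_fitness)
```

In this illustration the isolated top-fitness rule set (index 0) is passed over in favour of index 1, the best member of a cluster of four rule sets with similar training fitness.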
12.6.4 Continuous Strategy Evaluation

During a real-time application of the trading model, at the end of each actual period (when the system prepares to adopt new behaviour patterns), the trading model has already generated a stream of trading decisions which have resulted in a profit or loss, and the system may also have open positions in any of the markets of the portfolio. Although the division of the database into several training/application periods is necessary to create an adaptive learning process, it does not correctly reflect how the system would be applied in a real-time environment. To replicate real-time behaviour, the trading model has the ability to dynamically adopt new behaviour while keeping all existing open positions and existing accounting values. This creates a continuous performance measurement and makes it possible to measure the effect that switches in the behaviour patterns would have on existing market positions.

12.6.5 GA Parameters

These main parameters for the Genetic Algorithm are currently used uniformly across all learning processes:

| Parameter | Value |
|---|---|
| Crossover Rate | 0.6 |
| Mutation Rate | 0.001 |
| Equal Parents Probability | 0.5 |
| Population Size | 100 |
| Elitist Selection | TRUE |
| Selection Method | Tournament Selection |

Table 12.2 Main parameter set.

The termination condition of each process is defined by a time stop. Because a higher number of generations is essential for an improved learning process, we use only a time stop, so that the system calculates as many generations as possible within acceptable computer resource requirements. A configuration as shown here requires one week of training on a distributed process using 4 workstations in order to achieve any level of acceptable results. Although the results of the trading models are profitable and relatively stable, the available hardware configuration has not yet generated a learning process of what we believe is sufficient depth. We have seen that a more demanding fitness target (such as the return path error) does indeed require far greater computational resources to be made available to the learning process.
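To show how the parameters of Table 12.2 fit together, the sketch below runs a generic bit-string GA with tournament selection, elitism and a time stop. It is not the chapter's distributed library: the bit-string encoding, the tournament size of 2, one-point crossover and the toy fitness function (the number of ones) are assumptions, and the "Equal Parents Probability" setting is left out because its meaning is not defined in the text.

```python
import random
import time
from typing import Callable, List

Genome = List[int]


def tournament(pop: List[Genome], fits: List[float],
               rng: random.Random, k: int = 2) -> Genome:
    """Return the fittest of k randomly drawn individuals."""
    contenders = rng.sample(range(len(pop)), k)
    return pop[max(contenders, key=lambda i: fits[i])]


def evolve(fitness: Callable[[Genome], float], genome_len: int,
           time_limit_s: float, pop_size: int = 100,
           crossover_rate: float = 0.6, mutation_rate: float = 0.001,
           seed: int = 0) -> Genome:
    """Run generations until the time stop and return the best genome."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)]
    deadline = time.monotonic() + time_limit_s
    while True:
        fits = [fitness(g) for g in pop]
        elite = pop[max(range(pop_size), key=lambda i: fits[i])]
        if time.monotonic() >= deadline:           # time stop
            return elite
        nxt = [elite[:]]                           # elitist selection
        while len(nxt) < pop_size:
            a = tournament(pop, fits, rng)
            b = tournament(pop, fits, rng)
            child = a[:]
            if rng.random() < crossover_rate:      # one-point crossover
                cut = rng.randrange(1, genome_len)
                child = a[:cut] + b[cut:]
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]             # bit-flip mutation
            nxt.append(child)
        pop = nxt


best = evolve(fitness=sum, genome_len=20, time_limit_s=0.2)
```

Because the best individual is copied unchanged into each new generation, the best fitness in the population never decreases, so a longer time stop can only improve the result.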

12.7 Foreign Exchange Trading Portfolio Simulation

We have concentrated on foreign exchange markets to create a portfolio of liquid (i.e. easily tradable) currency pairs with only low correlation. Foreign Exchange (FX) represents the world's largest continuous financial market, with no restriction on the buying or selling of currencies in most OECD economies. FX markets take a special role in portfolio trading and investing. FX rates are not asset prices (such as the prices of stocks, bonds or commodities), but represent, as a ratio, the relative purchasing power between the monetary bases of two economies. As a consequence, FX markets do not lend themselves very well to traditional asset allocation techniques. FX markets are a very good example of defining a trading strategy in terms of risk taking and risk aversion, which allows the system to fully exploit its risk management abilities during the development of the trading model.

12.7.1 Portfolio Specification

GBP-USD
GBP-DEM
USD-CHF
DEM-JPY
USD-CAD
DEM-CHF
AUD-USD

Table 12.3 Currency Pairs Available for Trading.

Database History: Daily Open/High/Low/Last Data, 09-05-1990 through 17-09-1997
Data Source: Bridge Information Systems

| Parameter | Specification |
|---|---|
| Portfolio Base Currency | US Dollar. Profits/Losses are converted to US$ at prevailing exchange rates as they are realised. |
| Individual Market Constraints | No internal restrictions on individual position size. Each market could be allocated any position size between zero and 100% of the available trading capital. |
| Global Portfolio Constraints | Maximum portfolio exposure must not exceed 3 times current portfolio value (including open positions evaluated at current market prices). |
| Transaction Costs | Each transaction is assumed to carry 0.1% of the price as transaction costs [1]. Swap costs/gains have not been included. |
| Return Path Specification | Quarterly Return Path of 3%-15%. |
| Drawdown Limit | A drawdown of 30% is considered a total loss on the portfolio. In other words, if at any time the trading model loses 30% from the last equity peak, trading for this system is to be stopped. |

Table 12.4 Trading Parameter/Risk Parameter Inputs.
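Two of the constraints in Table 12.4 lend themselves to a short Python sketch. The function names are ours, and measuring exposure as the sum of the absolute position values is an assumption, since the table does not say how long and short positions are aggregated.

```python
from typing import Sequence


def exposure_ok(position_values: Sequence[float], portfolio_value: float,
                max_leverage: float = 3.0) -> bool:
    """True if gross exposure stays within 3 times the portfolio value.

    Assumption: exposure is the sum of absolute position values.
    """
    gross = sum(abs(value) for value in position_values)
    return gross <= max_leverage * portfolio_value


def drawdown_stop(equity_curve: Sequence[float], limit: float = 0.30) -> bool:
    """True if equity ever falls by `limit` or more from its last peak."""
    peak = equity_curve[0]
    for value in equity_curve:
        peak = max(peak, value)
        if (peak - value) / peak >= limit:
            return True
    return False


# Hypothetical positions (in US$) and equity curves.
within_limit = exposure_ok([1_500_000, -1_000_000], portfolio_value=1_000_000)
stopped = drawdown_stop([100.0, 120.0, 90.0, 80.0])
```

Here the positions sum to 2.5 times the portfolio value and pass the exposure check, while the equity curve falls by a third from its peak of 120 and triggers the stop.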

| Measurement | Definition |
|---|---|
| Fitness Value | Reciprocal value of the Quarterly Return Path Error (see above) |
| Yield | Annualised compound yield of return |
| YieldDD | Ratio of Yield to the Maximum Drawdown that ever occurred in the trading model (measured on a daily basis) |
| P/M | Percentage of Months Profitable |
| P/Q | Percentage of Calendar Quarters Profitable (more important here because the fitness value is based on quarterly return path optimisation) |
| No. Trades | Number of Trades during the relevant period (includes transactions which resulted in partial close-out of an existing position due to risk management adjustment) |

Table 12.5 Portfolio Performance Measurements

[1] Typical Forex transaction costs may be around _ of this value; however, this assumption also allows for slippage, that is, an actual execution price worse than the desired trading price (e.g. due to volatile market conditions).
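Three of the measurements in Table 12.5 can be sketched in Python as follows. The function names and the sample equity curve are hypothetical, and the exact conventions of the original system (for example day counts, or how the drawdown is expressed) are not stated in the text.

```python
from typing import Sequence


def annualised_yield(start_equity: float, end_equity: float, years: float) -> float:
    """Annualised compound yield of return."""
    return (end_equity / start_equity) ** (1.0 / years) - 1.0


def max_drawdown(equity: Sequence[float]) -> float:
    """Largest fractional decline from a previous equity peak."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def pct_profitable(period_returns: Sequence[float]) -> float:
    """Percentage of periods (months or quarters) with a positive return."""
    return 100.0 * sum(r > 0 for r in period_returns) / len(period_returns)


# Hypothetical equity curve covering two years.
equity = [100.0, 110.0, 99.0, 121.0]
yield_pa = annualised_yield(equity[0], equity[-1], years=2.0)
yield_dd = yield_pa / max_drawdown(equity)          # YieldDD
profitable_quarters = pct_profitable([0.02, -0.01, 0.03, 0.01])
```

In this illustration the portfolio compounds at about 10% per year with a maximum drawdown of 10%, which gives a YieldDD of about 1.0.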

12.7.2 Performance Result - Overview

| | Period | Begin | End | Fitness | Yield | YieldDD | P/M | P/Q | No. Trades |
|---|---|---|---|---|---|---|---|---|---|
| Training | A | 09-05-1990 | 04-06-1993 | 245.74 | 14.19% | 2.95 | 72.97% | 100.00% | 81 |
| Application | A | 07-06-1993 | 08-11-1994 | 39.83 | 3.26% | 0.59 | 70.59% | 66.67% | 29 |
| Training | B | 09-05-1990 | 08-11-1994 | 112.08 | 12.62% | 1.73 | 72.22% | 88.89% | 56 |
| Application | B | 09-11-1994 | 12-04-1996 | 37.17 | 5.01% | 0.92 | 52.94% | 50.00% | 25 |
| Training | C | 05-09-1991 | 12-04-1996 | 60.89 | 7.96% | 1.00 | 67.27% | 84.21% | 106 |
| Application | C | 15-04-1996 | 17-09-1997 | 22.17 | -4.26% | - | 41.18% | 40.00% | 36 |
| Application | All | 07-06-1993 | 17-09-1997 | 31.28 | 0.51% | 0.04 | 52.94% | 47.06% | 94 |

Table 12.6 Results of Selection of Top-Fitness Rule Set for Trading during Application Period

| | Period | Begin | End | Fitness | Yield | YieldDD | P/M | P/Q | No. Trades |
|---|---|---|---|---|---|---|---|---|---|
| Training | A | 09-05-1990 | 04-06-1993 | 210.97 | 14.52% | 2.95 | 70.27% | 100.00% | 91 |
| Application | A | 07-06-1993 | 08-11-1994 | 40.25 | 3.64% | 0.64 | 70.59% | 66.67% | 31 |
| Training | B | 09-05-1990 | 08-11-1994 | 112.07 | 12.64% | 1.74 | 72.22% | 88.89% | 56 |
| Application | B | 09-11-1994 | 12-04-1996 | 38.66 | 5.69% | 1.20 | 52.94% | 50.00% | 22 |
| Training | C | 05-09-1991 | 12-04-1996 | 58.25 | 7.54% | 0.99 | 63.64% | 73.68% | 86 |
| Application | C | 15-04-1996 | 17-09-1997 | 43.16 | 6.66% | 1.00 | 52.94% | 60.00% | 18 |
| Application | All | 07-06-1993 | 17-09-1997 | 42.12 | 5.11% | 0.76 | 58.82% | 64.71% | 86 |

Table 12.7 Results of Selection of Closest-Fitness Rule Set

It can be seen from the above tables that although both methods of selecting the trading model during training have yielded positive results, choosing the closest-fitness rule set has resulted in a more consistent and, in absolute terms, more profitable strategy during all hold-out periods. It is interesting to note that during the training periods, the top-fitness and closest-fitness rule sets have shown similar Yields but clearly different fitness values. This confirms in our view the importance of an appropriate type of fitness target for the learning process, as we believe the return path target is. Choosing the closest-fitness rule set has also yielded another desired result: it increases the predictability of portfolio performance itself, which is shown in more detail in the following tables.

12.7.3 Performance Result - Consistency across Training / Application Periods

We measure the consistency across training / application sets by calculating the ratio of the performance measurement of the hold-out period to the measurement of the training period. A ratio of 1.00 would mean exactly the same performance; a ratio of > 1.00 means better performance. Although a higher real-time return compared to the training set would be desirable for practical reasons, it is not desirable for measuring the predictability of performance. Generally, however, it must be assumed that real-time performance (or performance in hold-out tests) is considerably less profitable than the training results suggest.

| Measurement | | Period A | Period B | Period C | Average |
|---|---|---|---|---|---|
| Fitness | Training | 245.74 | 112.08 | 60.89 | 139.57 |
| Fitness | Application | 39.83 | 37.17 | 22.17 | 33.06 |
| Fitness | Ratio | 0.16 | 0.33 | 0.36 | 0.29 |
| Yield | Training | 14.19% | 12.62% | 7.96% | 11.59% |
| Yield | Application | 3.26% | 5.01% | -4.26% | 1.34% |
| Yield | Ratio | 0.23 | 0.40 | #N/A | 0.31 |
| Profitable Months | Training | 72.97% | 72.22% | 67.27% | 70.82% |
| Profitable Months | Application | 70.59% | 52.94% | 41.18% | 54.90% |
| Profitable Months | Ratio | 0.97 | 0.73 | 0.61 | 0.77 |
| Profitable Quarters | Training | 100.00% | 88.89% | 84.21% | 91.03% |
| Profitable Quarters | Application | 66.67% | 50.00% | 40.00% | 52.22% |
| Profitable Quarters | Ratio | 0.67 | 0.56 | 0.48 | 0.57 |

Table 12.8 Top Fitness Rule Set - Training/Application Result Ratio

| Measurement | | Period A | Period B | Period C | Average |
|---|---|---|---|---|---|
| Fitness | Training | 210.97 | 112.07 | 58.25 | 127.10 |
| Fitness | Application | 40.25 | 38.66 | 43.16 | 40.69 |
| Fitness | Ratio | 0.19 | 0.34 | 0.74 | 0.43 |
| Yield | Training | 14.52% | 12.64% | 7.54% | 11.57% |
| Yield | Application | 3.64% | 5.69% | 6.66% | 5.33% |
| Yield | Ratio | 0.25 | 0.45 | 0.88 | 0.53 |
| Profitable Months | Training | 70.27% | 72.22% | 63.64% | 68.71% |
| Profitable Months | Application | 70.59% | 52.94% | 52.94% | 58.82% |
| Profitable Months | Ratio | 1.00 | 0.73 | 0.83 | 0.86 |
| Profitable Quarters | Training | 100.00% | 88.89% | 73.68% | 87.52% |
| Profitable Quarters | Application | 66.67% | 50.00% | 60.00% | 58.89% |
| Profitable Quarters | Ratio | 0.67 | 0.56 | 0.81 | 0.68 |

Table 12.9 Closest Fitness Rule Set - Training/Application Result Ratio
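The ratios in Tables 12.8 and 12.9 can be recomputed from the training and application values with a few lines of Python. The data below are copied from the two tables; the function names, and the choice to treat a negative application value as not available (mirroring the #N/A entry), are our own reading of how the tables were built.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple

# measurement -> (training values, application values) for periods A, B, C
Rows = Dict[str, Tuple[List[float], List[float]]]

TOP_FITNESS: Rows = {
    "Fitness": ([245.74, 112.08, 60.89], [39.83, 37.17, 22.17]),
    "Yield": ([14.19, 12.62, 7.96], [3.26, 5.01, -4.26]),
    "Profitable Months": ([72.97, 72.22, 67.27], [70.59, 52.94, 41.18]),
    "Profitable Quarters": ([100.00, 88.89, 84.21], [66.67, 50.00, 40.00]),
}
CLOSEST_FITNESS: Rows = {
    "Fitness": ([210.97, 112.07, 58.25], [40.25, 38.66, 43.16]),
    "Yield": ([14.52, 12.64, 7.54], [3.64, 5.69, 6.66]),
    "Profitable Months": ([70.27, 72.22, 63.64], [70.59, 52.94, 52.94]),
    "Profitable Quarters": ([100.00, 88.89, 73.68], [66.67, 50.00, 60.00]),
}


def ratio(train: float, app: float) -> Optional[float]:
    """Application / training; None when the ratio is not meaningful."""
    return app / train if app >= 0 and train > 0 else None


def average_ratio(rows: Rows) -> float:
    """Mean over measurements of the mean available per-period ratio."""
    per_measure = []
    for train, app in rows.values():
        ratios = [r for r in map(ratio, train, app) if r is not None]
        per_measure.append(mean(ratios))
    return mean(per_measure)


top_avg = average_ratio(TOP_FITNESS)
closest_avg = average_ratio(CLOSEST_FITNESS)
print(f"top-fitness: {top_avg:.2f}, closest-fitness: {closest_avg:.2f}")
```

Run on the table values, this gives average ratios of about 0.48 for the top-fitness rule set and about 0.62 for the closest-fitness rule set.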

Comparing the performance ratios of training set to hold-out set shows that the closest-fitness parameter set increases the consistency and predictability of performance significantly. Calculating an average of all training/hold-out ratios, the average ratio for the top-fitness parameter set is 0.48, whereas the average ratio for the closest-fitness set is 0.62.

12.8 Results of Other Portfolio Types

We have tested the same trading model configuration on other types of portfolios, including various individual currency pairs (i.e. a portfolio with just one component) and equity index portfolios. It has emerged that the diversity of the portfolio is an important contributing factor for the consistency of performance. Although a pure equity index portfolio yielded better returns (in absolute terms) over the same period, the consistency of a number of performance measurements has been lower than in the FX model described here. Portfolios containing fewer markets (or just one single market) have generally performed less well than diversified portfolios, even when correlation among individual portfolio components is relatively high (as it is with major western stock markets, such as New York, London, Paris and Frankfurt). Although the positive effect of diversification on portfolio performance is known in portfolio theory, this would only partially explain the observed results: although the system may have a number of markets available for allocation of trades, the system does not keep positions in all low-correlated markets at the same time. The actual effect of diversification is therefore much smaller than the theoretical one. It rather seems that the GA based model can learn better, i.e. more flexible, behaviour rules if a larger amount of different data is available within the same learning period. Providing the system with more data of different types of distribution reduces the risk of over-optimisation on a specific type of price distribution. After having faced a more complex environment during the learning process, the trading model seems to be more capable of dealing with the new environment during hold-out periods.

12.9 Results of Different Fitness Targets and Variation of GA Parameters

12.9.1 Different Fitness Target Calculations

Performance comparisons of fund managers or trading advisors typically use a number of different benchmarks to analyse the structure of risk and return. The most commonly used are the Sharpe Ratio, which relates returns in excess of the risk-free rate to the variability of returns, and various types of yield / drawdown calculations. We have not been able to develop any acceptable performance during hold-out periods when using these portfolio measurements as fitness targets. Although the training process itself typically yields very high values for these benchmarks (proving that the GA process is an effective search process), these results have generally not translated into appropriate hold-out period behaviour. It seems that the type of behaviour the trading model develops is very different for various types of fitness targets, although with hindsight any trading model delivering performance within a defined return path will always have high ratings on other performance measurements. Consequently, we see the concept of optimising a trading model towards a minimum Return Path Error (RPE) as the most appropriate fitness target definition for the learning process.

12.9.2 Variation of GA Parameters

Over a larger number of generations, we have not observed significant changes in the result of the learning process when adjusting GA parameters, except for the use of elitist selection, which seems to contribute significantly to the speed of the learning process. A slight advantage in the learning process has emerged using the tournament selection method, with roulette selection performing comparatively inefficiently. The requirement for computer resources increases substantially when the number of portfolio components is increased. This is most likely due to the increased complexity of the solution space. It has, however (as described before), a positive effect on the result of the portfolio simulation.

The fully scalable, distributed GA library on which the APT framework is based provides us with a development environment in which the performance of the learning process can be increased by adding hardware resources.

Conclusion

It has been shown that Genetic Algorithms lend themselves very well to complex decision making simulations, due to the parallel nature of the algorithm. GAs also allow the integration of very different types of learning processes into one optimisation process. The foreign exchange trading portfolio described here has yielded an acceptable level of risk/return and, compared to other portfolio tests, shows a clear direction for further development. By implementing all the important requirements the human trader faces in the process of making trading decisions, the GA based learning process proves to be more stable and more efficient, thus leading to lower and more predictable risk for the portfolio management industry.