Comparison of Parameter Estimation Methods in Cyclical ... - L. Ferrara

Comparison of Parameter Estimation Methods in Cyclical Long Memory Time Series by * Laurent FERRARA and Dominique GUEGAN**

Abstract. In this paper, we study cyclical time series with long-range dependence. A useful tool for analysing such series is the generalized long memory process introduced in the statistical literature by Gray et al. (1989). Because of the singularity in its spectral density, this process can model persistence as well as periodic cyclical behaviour. However, parameter estimation is not straightforward. We discuss the estimation of the location of the spectral density singularity and we present in detail two types of estimation methods for the long memory parameter: a semiparametric one, based on the log-periodogram, and a pseudo-maximum likelihood one, based on the Whittle likelihood. We also give the asymptotic properties of each estimate. Lastly, we compare these two types of estimation methods when modelling the error correction term of a fractional cointegration analysis carried out on the Nikkei spot index data.

* Laurent Ferrara is Statistician and Researcher at the Centre d’Observation Economique (COE), (email: [email protected]). CCIP-COE, 27 Avenue de Friedland, 75008 Paris. ** Dominique Guégan is Professor of Statistics at University of Reims, UPESA 6056 (e-mail: [email protected]). Université de Reims - UFR Sciences, Moulin de la Housse, BP1039, 51100, Reims, France.


1. INTRODUCTION

When dealing with financial time series, we are often confronted with the phenomenon of long memory, or long-range dependence. For instance, evidence of long memory has been found in time series of inflation rates (Baillie, Chung and Tieslau (1996)), stock market prices (Willinger et al. (1999)) and exchange rates (Ferrara and Guégan (2000a)). Moreover, many researchers have reported empirical evidence of persistence in the conditional variance of diverse time series of asset returns. This slow decay of the sample autocorrelation function (ACF) of the squared (or absolute) returns has been pointed out by, for instance, Granger and Ding (1996) and Baillie, Bollerslev and Mikkelsen (1996).

However, some time series exhibit a periodic cyclical pattern as well as long-range dependence. Instead of removing this cyclical pattern before the long memory analysis, it seems more efficient to take both components into account in the model, and the generalized long memory process appears to be a useful parametric model for this purpose. In this paper, we recall some properties of this kind of process, we consider specifically the parameter estimation issue, and we give an application to the Nikkei spot index data.

A covariance stationary process (Xt)t∈Z is said to be long memory if the infinite sum ∑|γ(k)| diverges, where γ denotes the autocovariance function of the process. When working with long memory processes, it is often useful to move to the spectral domain. We recall that the spectral density fX of a covariance stationary process is the Fourier transform of the autocovariance function, i.e., for λ ∈ [0,2π[:

$$ f_X(\lambda) = \frac{1}{2\pi} \sum_{k \in \mathbb{Z}} \gamma(k)\, e^{-i\lambda k}. \tag{1.1} $$

In the remainder of this paper, we consider the following very useful definition of the long memory property, stated in the spectral domain: a stationary process is said to be long memory if its spectral density fX becomes unbounded at some frequency λG ∈ [0, π]. A standard semiparametric model for the local behaviour of the spectral density in the neighbourhood of the frequency λG is:

$$ f_X(\lambda) \sim C\, |\lambda - \lambda_G|^{-2d}, \tag{1.2} $$

where C is a finite positive constant and where d lies in the open interval ]-1/2,1/2[. The parameter d controls the memory component in the neighbourhood of λG. Indeed, if d = 0, the process is said to have short memory; if d > 0, it is said to have long memory; and if d < 0, it is said to be antipersistent. Moreover, the conditions d < 1/2 and d > -1/2 ensure respectively the stationarity and the invertibility of the process. Note that another interesting semiparametric model for fX, which allows for asymmetry, has recently been proposed by Arteche and Robinson (2000).


Different authors have proposed parametric models for the spectral density satisfying equation (1.2). If λG = 0 or π, the most popular parametric model is the autoregressive fractionally integrated moving-average (ARFIMA) process, introduced by Granger and Joyeux (1980) and Hosking (1981). If λG ∈ ]0, π[, the most general parametric model is the k-factor GARMA process, introduced by Gray et al. (1989, 1994), and discussed in the papers of Robinson (1994b), Giraitis and Leipus (1995), Chung (1996a, 1996b), Yajima (1996), Hosoya (1997), Woodward et al. (1998) and Ferrara and Guégan (2000a, 2000b). In the statistical literature, this process is also known as the generalized long memory process, because it includes as special cases the ARFIMA process and the diverse seasonal long memory processes studied by Porter-Hudak (1990), Ray (1993), Hassler (1994) or Sutcliffe (1994). Let the process (Xt)t∈Z be defined by the following equation:

$$ \prod_{i=1}^{k} (I - 2\nu_i B + B^2)^{d_i} (X_t - \mu) = \varepsilon_t, \tag{1.3} $$

where k is a finite integer, where |νi| ≤ 1 for i=1,…,k, where di is a fractional number for i=1,…,k, where µ is the mean of the process, where B is the backshift operator defined on (Xt)t∈Z by BXt = Xt-1 and $B^b X_t = X_{t-b}$ for b>0, where I is the identity operator defined on (Xt)t∈Z by IXt = Xt, and where (εt)t∈Z is a covariance stationary process with finite variance $\sigma_\varepsilon^2$. Under the assumption of invertibility, a solution of equation (1.3) is given by:

$$ X_t = \mu + \sum_{j \ge 0} \psi_j(d,\nu)\, \varepsilon_{t-j}, \tag{1.4} $$

where:

$$ \psi_j(d,\nu) = \sum_{\substack{0 \le l_1,\dots,l_k \le j \\ l_1+\dots+l_k = j}} C_{l_1}(d_1,\nu_1) \cdots C_{l_k}(d_k,\nu_k), \tag{1.5} $$

where (Cj(d, ν))j∈Z are the Gegenbauer polynomials defined by the following recursion formula, for j ≥ 2:

$$ C_j(d,\nu) = 2\nu\left(\frac{d-1}{j}+1\right) C_{j-1}(d,\nu) - \left(\frac{2(d-1)}{j}+1\right) C_{j-2}(d,\nu), \tag{1.6} $$

where C0(d, ν) = 1 and C1(d, ν) = 2dν. See, for instance, Rainville (1960) for further theoretical aspects of Gegenbauer polynomials. If di is a fractional number, for i=1,…,k, the process (Xt)t∈Z defined by equation (1.4) is called a k-factor Gegenbauer process if (εt)t∈Z is a white noise process, or a k-factor GARMA process if (εt)t∈Z is a covariance stationary short memory ARMA-type process. Note that it is common to define a k-factor GARMA process by equation (1.3); in that case, however, we mean the solution given in equation (1.4).

In the remainder of the paper, we assume µ=0, and a k-factor GARMA process, defined in equation (1.4), will be denoted GARMAk(p,d,ν,q) (or GGk(d,ν) if p = q = 0), where d is the k-vector (d1,…,dk)t, ν is the k-vector (ν1,…,νk)t, and p and q are respectively the orders of the AR and MA parts of the process. A description of the main statistical properties of a generalized long memory process can be found in the papers of Gray et al. (1989), Robinson (1994b), Giraitis and Leipus (1995), Chung (1996a, 1996b), Woodward et al. (1998) and Ferrara and Guégan (2000a, 2000b). In particular, if, for i=1,…,k, either 0 < di < 1/2 with |νi| < 1, or 0 < di < 1/4 with |νi| = 1, then the k-factor Gegenbauer process (Xt)t∈Z defined by equation (1.4) is a stationary, causal and invertible long memory process. Moreover, recall that the spectral density of a k-factor GARMA process is given by the following equation:

$$ f_X(\lambda) = f_\varepsilon(\lambda) \prod_{i=1}^{k} \left| 4 \sin\!\left(\frac{\lambda+\lambda_i}{2}\right) \sin\!\left(\frac{\lambda-\lambda_i}{2}\right) \right|^{-2d_i}, \tag{1.7} $$

where 0 ≤ λ ≤ π, where fε is the spectral density of the covariance stationary process (εt)t∈Z and where, for i=1,…,k, the λi = cos-1(νi) are called the Gegenbauer frequencies (G-frequencies). Thus, the spectral density of a k-factor GARMA process exhibits k peaks on the interval [0,π] (see, for instance, Ferrara (2000) for simulations). Note also that the forecasting performance of generalized long memory processes is discussed in Ferrara and Guégan (2000a, 2000b).

In this paper, we focus our attention on parameter estimation in k-factor Gegenbauer processes. Estimating the autoregressive and moving-average parts of a k-factor GARMA process does not seem to raise any specific issue. In the second section, we present the semiparametric estimation method based on the log-periodogram. We derive the analytic expression of the log-periodogram estimate of the long memory parameter, in the case of a general process with a single singularity on the interval [0,π], and we give its limiting distribution. Then, we describe how to use this estimate in order to estimate the long memory parameter of a GARMAk(p,d,ν,q) process. In the third section, we present some results on the pseudo-maximum likelihood estimation of the parameters of a GARMAk(p,d,ν,q) process. In particular, we consider the Whittle estimation method, often used in the case of an ARFIMA process, and we show how to use this method in practice. The last section contains an application of generalized long memory processes to a fractional cointegration analysis carried out on the Nikkei spot index data.
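To make the recursion (1.6) and the spectral density (1.7) concrete, the following minimal Python sketch covers the one-factor case k = 1. The function names, the white-noise assumption fε(λ) = σ²/(2π), and the truncation of the infinite sum in (1.4) at `n_terms` are illustrative assumptions, not part of the paper.

```python
import numpy as np

def gegenbauer_coeffs(d, nu, n):
    """C_0(d, nu), ..., C_{n-1}(d, nu) from recursion (1.6); requires n >= 1."""
    c = np.empty(n)
    c[0] = 1.0
    if n > 1:
        c[1] = 2.0 * d * nu
    for j in range(2, n):
        c[j] = (2.0 * nu * ((d - 1.0) / j + 1.0) * c[j - 1]
                - (2.0 * (d - 1.0) / j + 1.0) * c[j - 2])
    return c

def gegenbauer_spectral_density(lam, d, nu, sigma2=1.0):
    """Equation (1.7) for k = 1, assuming white-noise innovations,
    i.e. f_eps(lambda) = sigma2 / (2 * pi)."""
    lam = np.asarray(lam, dtype=float)
    lam_g = np.arccos(nu)  # Gegenbauer frequency
    kernel = np.abs(4.0 * np.sin((lam + lam_g) / 2.0)
                    * np.sin((lam - lam_g) / 2.0))
    return sigma2 / (2.0 * np.pi) * kernel ** (-2.0 * d)

def simulate_gegenbauer(T, d, nu, n_terms=2000, seed=0):
    """Approximate draw from (1.4) with mu = 0 and Gaussian white noise,
    truncating the infinite sum at n_terms (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    psi = gegenbauer_coeffs(d, nu, n_terms)  # for k = 1, psi_j = C_j(d, nu)
    eps = rng.standard_normal(T + n_terms - 1)
    # 'valid' keeps only the T values for which all n_terms lags are available
    return np.convolve(eps, psi, mode="valid")
```

For d = 1 the recursion reduces to that of the Chebyshev polynomials of the second kind, which gives a quick sanity check on the coefficients. The truncation length matters in practice because the ψj decay slowly when d > 0.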


2. SEMIPARAMETRIC (SP) ESTIMATION

In this section, we consider semiparametric (SP hereafter) estimation of the long memory parameter in generalized long memory processes. In particular, we focus on the log-periodogram method introduced by Geweke and Porter-Hudak (1983) and improved by Robinson (1995), and we adapt it to the case of generalized long memory processes.

Let (Xt)t∈Z be a second order covariance stationary process, with spectral density having a singularity somewhere on the interval [0,π]. We assume that the location of this singularity is known and equal to the frequency λG, called the Gegenbauer frequency (G-frequency). We discuss at the end of this section the case of an unknown frequency. Thus, the spectral density fX of the process satisfies equation (1.2) as λ tends to λG. By evaluating equation (1.2) at the Fourier frequencies λj = 2π(j-1)/T, for j = 1,...,T, and taking logarithms, we get the following relationship in the neighbourhood of the G-frequency, taken among the Fourier frequencies, λjG = 2πjG / T:

$$ \log I_T(\lambda_j) \sim K - 2d \log|\lambda_j - \lambda_{j_G}| + \log\!\left( \frac{I_T(\lambda_j)}{f_X(\lambda_j)} \right), \tag{2.1} $$

where K is a finite constant and where IT(λj) is the periodogram evaluated at the Fourier frequency λj, defined by:

$$ I_T(\lambda_j) = \frac{1}{2\pi T} \left| \sum_{t=1}^{T} e^{i\lambda_j t} X_t \right|^2. \tag{2.2} $$

Now, we consider a generalization of the Robinson (1995) estimate, obtained from the asymptotic regression equation (2.1). This generalized Robinson estimate is explicitly given by the following equality, for T > m > l ≥ 0:

$$ \hat d_R(l,m) = \frac{\sum_{j=j_G+l+1}^{j_G+m} (Y_j - \bar Y) \log I_T(\lambda_j)}{\sum_{j=j_G+l+1}^{j_G+m} (Y_j - \bar Y)^2}, \tag{2.3} $$

where, for j ∈ [jG+l+1, jG+m], Yj = -2 log|λj-λjG| and $\bar Y$ is the empirical mean of the Yj, where l is the number of frequencies trimmed from the regression equation (2.1), and where m is the bandwidth. We now turn to the asymptotic properties of $\hat d_R(l,m)$. We first state some assumptions on the process (Xt)t∈Z.

Assumption 2.1 As λ → λG (λ → λG+ if λG ∈ [0,π/2], or λ → λG- if λG ∈ [π/2,π]), there exist a strictly positive real C and a real d in the interval ]-1/2,1/2[ such that: $f_X(\lambda) = C\,|\lambda-\lambda_G|^{-2d} + O(|\lambda-\lambda_G|^{2-2d})$.


Assumption 2.2 In the neighbourhood of λG, the spectral density fX is differentiable with respect to λ and, as λ → λG, $|d f_X(\lambda)/d\lambda| = O(|\lambda-\lambda_G|^{-1-2d})$.

Assumption 2.3 The process is Gaussian.

Assumption 2.4 The bandwidth m and the trimming number l are such that, as T → ∞: m → ∞, l → ∞, $m^{5/4}/T \to 0$, $\log^2(T)/m \to 0$, l/m → 0 and $m^{1/2}\log(m)/l \to 0$.

Assumptions 2.1 to 2.4 were made by Robinson (1995) to get the limiting distribution of his estimate in the case of a spectral density with a singularity located at zero. Assumptions 2.1 and 2.2 are technical assumptions concerning the smooth behaviour of the spectral density in the neighbourhood of the singularity λG on the interval [0,π]. Assumption 2.3 is the strongest assumption and implies that we have to work with Gaussian processes. In practice, this assumption is sometimes doubtful, especially for financial time series, which are often leptokurtic. Assumption 2.4 is a technical assumption which means that m tends to infinity with T, but more slowly, and that l tends to infinity with m, but more slowly. In order to adapt Robinson's estimate, we add the following fifth assumption:

Assumption 2.5 Let j be the integer related to the Fourier frequency λj, such that, as T tends to infinity: j → ∞ and j / T → 0.

This assumption means that the integer j tends to infinity with T, but more slowly, and thus allows us to reach the G-frequency. It represents the main change in comparison with the classical assumptions of Geweke and Porter-Hudak (1983) or Robinson (1995), for which the integer j is fixed. Thus, we get the following proposition:


Proposition 2.1 Let (Xt)t∈Z be a covariance stationary process whose spectral density satisfies expression (1.2). Under assumptions 2.1 to 2.5, as T → ∞, we have:

$$ 2\sqrt{m}\,\big(\hat d_R(l,m) - d\big) \to N\!\left(0, \frac{\pi^2}{6}\right). \tag{2.4} $$

Proof: The proof of Proposition 2.1 consists in showing that the result of Robinson (1995, Theorem 3) applies. Without loss of generality, we assume that the known G-frequency lies in the interval [0,π/2]. We are dealing with the Fourier frequencies λj = 2π(j-1)/T, for j=1,...,T. As T tends to infinity, according to Assumption 2.5, j also tends to infinity, but more slowly. Therefore, there always exist m and l such that m > l ≥ 0 and j ∈ VjG(l,m) = [jG+l+1, jG+m]. For every Fourier frequency λj with j in VjG(l,m), |λj - λjG| → 0+ as T → ∞. Thus, the assumptions of Robinson (1995, p.1055) are satisfied, and therefore the asymptotic result (Theorem 3 of Robinson (1995)) applies. •
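The regression behind (2.3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: frequencies are indexed from zero (λj = 2πj/T), the G-frequency is the Fourier frequency of index `j_g`, only frequencies to the right of it are used, and the function names are hypothetical. The helper `asymptotic_se` simply reads the standard error off the limiting law (2.4).

```python
import numpy as np

def periodogram(x):
    """I_T of (2.2) at the Fourier frequencies 2*pi*j/T, j = 0, ..., T-1."""
    x = np.asarray(x, dtype=float)
    T = x.size
    return np.abs(np.fft.fft(x)) ** 2 / (2.0 * np.pi * T)

def robinson_estimate(I, j_g, l, m):
    """Generalized log-periodogram estimate (2.3).

    I   : periodogram ordinates, I[j] at frequency 2*pi*j/T
    j_g : index of the G-frequency among the Fourier frequencies
    l   : trimming number, m : bandwidth (requires j_g + m < len(I))
    """
    T = I.size
    j = np.arange(j_g + l + 1, j_g + m + 1)
    lam = 2.0 * np.pi * j / T
    lam_g = 2.0 * np.pi * j_g / T
    y = -2.0 * np.log(np.abs(lam - lam_g))   # regressor Y_j
    yc = y - y.mean()
    return float(np.sum(yc * np.log(I[j])) / np.sum(yc ** 2))

def asymptotic_se(m):
    """Standard error implied by (2.4): sqrt(pi^2 / 6) / (2 * sqrt(m))."""
    return np.pi / np.sqrt(24.0 * m)
```

A typical call would be `robinson_estimate(periodogram(x), j_g, 0, int(len(x) ** 0.6))`. When the ordinates follow the power law (1.2) exactly, the regression returns d exactly, which is a convenient check of the indexing.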

Now, we consider the parametric case of a stationary invertible long memory k-factor Gegenbauer process, denoted (Xt)t∈Z, for which the k G-frequencies (λ1,…,λk)t are known. We are interested in the estimation of the vector parameter (d1,…,dk)t. From the spectral density of a GGk(d,ν) process (equation (1.7)), we clearly see that, as λ → λi, for i=1,…,k, the spectral density fX(λ) becomes unbounded (when 0 < di < 1/2). Moreover, it can be shown (see for instance Giraitis and Leipus (1995)) that the spectral density fX(λ) of a GGk(d,ν) process can be approximated as follows, as λ → λi, for i=1,…,k:

$$ f_X(\lambda) \sim C\, |\lambda - \lambda_i|^{-2d_i}. \tag{2.5} $$

Moreover, under the technical assumptions 2.4 and 2.5, the spectral density of a Gaussian GGk(d,ν) process satisfies all the other assumptions (assumptions 2.1, 2.2 and 2.3). Therefore, by using the local approximation of the spectral density (equation (2.5)) at each G-frequency, we get an estimate of each long memory parameter di, for i=1,…,k, and by applying Proposition 2.1, we get the limiting distribution of each estimate. Although this estimation method provides the limiting distribution of each long memory parameter estimate, we have to point out that the asymptotic joint distribution is still unknown under general conditions. However, assuming the independence of the long memory parameter estimates, one could obtain a joint Gaussian limiting distribution.

As noted previously, this semiparametric method of estimation requires knowledge of the G-frequency. However, in practice, the G-frequency parameter has to be estimated, except in the case of the seasonal long memory processes for which the G-frequencies


correspond to the seasonal harmonics. Generally, the estimated G-frequency is taken as the argument that maximizes the raw periodogram. Yajima (1996) proves, under smoothness conditions on the spectral density, that the argument maximizing the periodogram converges in probability towards the true G-frequency. Moreover, it can be shown (see Ferrara (2000)), under the assumption that the G-frequency estimates are independent, that the frequencies corresponding to the k local maxima of the periodogram converge in probability towards the G-frequencies of a k-factor GARMA process.

From a practical point of view, one of the most difficult problems of this semiparametric method is the choice of the trimming number l and of the bandwidth m. As in many other semiparametric statistical methods, the choice of the interval on which the local parametric approximation applies is not very clear. Some theoretical efforts have been made recently on this topic in the case of a G-frequency equal to zero, see for instance Hurvich et al. (1998) and Hurvich and Deo (1999). An empirical choice, often used in practice, is l = 0 and $m = [T^{\alpha}]$, with α ∈ [0.5, 0.8].

Now, we consider the estimation of the scale parameter and short memory parameters of a k-factor GARMA process. Usually, this estimation can be carried out after the SP estimation of the long memory parameter. Indeed, if we denote by (Yt)t∈Z the prefiltered series such that:

$$ Y_t = \prod_{i=1}^{k} (I - 2\nu_i B + B^2)^{\hat d_{R,i}} X_t, \tag{2.6} $$

where (Xt)t∈Z is a k-factor GARMA process defined by equation (1.4), then we have:

$$ \phi(B) Y_t = \theta(B) \varepsilon_t, \tag{2.7} $$

where $\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \dots - \phi_p B^p$, $\theta(B) = 1 + \theta_1 B + \theta_2 B^2 + \dots + \theta_q B^q$, and the roots of φ and θ lie outside the unit circle. Therefore, the (p+q) short memory parameters and the scale parameter can be estimated by using the classical methods of the Box and Jenkins (1976) methodology. Note that, in order to compute the prefiltered series (Yt)t∈Z, this estimation method involves a truncation of the infinite sum implied by equation (2.6). However, it is important to highlight that, in this case, no theoretical results on the estimates are available. In the case of ARFIMA processes, many authors have shown that SP estimation methods of the long memory parameter are not robust to the presence of AR and MA parts. Thus, this parameter estimation method has to be used very carefully.

It is well known in statistical spectral theory (see for instance Priestley (1981)) that the periodogram IT, used in the regression equation (2.1), is an inconsistent estimate of the spectral density of a covariance stationary process, although it is asymptotically unbiased. Two classical statistical techniques are available to try to improve the performance of the periodogram as a spectral density estimate, namely tapering and


smoothing. We briefly introduce both techniques, and we refer to Priestley (1981) for further details.

Tapering improves the precision of the periodogram through a data transformation prior to the analysis. The sample X1,...,XT is replaced by the tapered sample h1X1,...,hTXT, where (ht)t=1,...,T is a suitable sequence of constants. A sequence often used in practice is the cosine bell, given for t = 1,...,T by:

$$ h_t = 0.5\,\{1 - \cos(2\pi t/T)\}. \tag{2.8} $$

Sometimes a trapezoidal sequence is used, which is equal to one for the central part of the series and decreases linearly towards zero over the first and last t0 values of the series. The practitioner chooses the value of t0, but usually t0 = 0.1 × T.

The smoothed periodogram, denoted fS, is given by the following expression:

$$ f_S(\lambda) = \frac{1}{2\pi} \sum_{h=-r}^{r} \omega(h/r)\, \hat\gamma_T(h)\, e^{-ih\lambda}, \tag{2.9} $$

where $\hat\gamma_T(h)$ is the empirical autocovariance function, where the positive real parameter r is a function of T such that, as T → ∞, r → ∞ and r/T → 0, and where ω(.) is the lag window, an even piecewise continuous function satisfying the conditions: ω(0) = 1; |ω(x)| ≤ 1, for all x; and ω(x) = 0, for |x| > 1. Note that if ω ≡ 1 and r = T, then, with the normalization of equation (2.2), we get fS(λ) = IT(λ) for every Fourier frequency λ = λj ≠ 0. Several types of lag window are available in the statistical literature (Bartlett, Parzen, Blackman-Tukey, Daniell, ...). However, in practice, two special types of lag window are often used:

1. The rectangular or truncated window, defined by: ω(x) = 1 if |x| ≤ 1, and 0 otherwise.
2. The Bartlett or triangular window, defined by: ω(x) = 1-|x| if |x| ≤ 1, and 0 otherwise.

These two types of lag window are used in the last section in order to compute the generalized Robinson estimate given by equation (2.3).
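The two tapers and the lag-window estimate (2.9) can be sketched as follows. The function names are hypothetical, and the exact shape of the linear ramp in the trapezoidal taper is an assumption, since the text only specifies that it decreases linearly towards zero over t0 points.

```python
import numpy as np

def cosine_bell(T):
    """Cosine bell taper (2.8), for t = 1, ..., T."""
    t = np.arange(1, T + 1)
    return 0.5 * (1.0 - np.cos(2.0 * np.pi * t / T))

def trapezoidal_taper(T, frac=0.1):
    """Equal to one in the centre, linear ramps over the first and last
    t0 = frac * T points (ramp values are an assumption of this sketch)."""
    t0 = int(frac * T)
    h = np.ones(T)
    ramp = np.arange(1, t0 + 1) / (t0 + 1.0)
    h[:t0] = ramp
    h[T - t0:] = ramp[::-1]
    return h

def sample_autocov(x, max_lag):
    """Empirical autocovariances gamma_hat(0..max_lag), divisor T."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    T = x.size
    return np.array([np.dot(x[:T - h], x[h:]) / T for h in range(max_lag + 1)])

def smoothed_periodogram(x, lam, r, window="bartlett"):
    """Lag-window estimate (2.9) at the frequencies lam, with 1 <= r < len(x)."""
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    gamma = sample_autocov(x, r)
    h = np.arange(1, r + 1)
    if window == "bartlett":
        w = 1.0 - h / r
    elif window == "rectangular":
        w = np.ones(r)
    else:
        raise ValueError("unknown lag window")
    # gamma_hat and the window are even, so the two-sided sum folds into cosines
    folded = np.cos(np.outer(lam, h)) @ (w * gamma[1:])
    return (gamma[0] + 2.0 * folded) / (2.0 * np.pi)
```

A tapered periodogram is then obtained by applying the periodogram formula (2.2) to `cosine_bell(len(x)) * x`. With the rectangular window and r = T - 1, the smoothed estimate coincides with the periodogram of the centred series at nonzero Fourier frequencies, which is a useful consistency check.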


3. PSEUDO MAXIMUM LIKELIHOOD (PML) ESTIMATION

We are now interested in parameter estimation of a GGk(d,ν) process defined by equation (1.4), with mean µ equal to zero, by using the pseudo-maximum likelihood (PML hereafter) estimation method developed by Whittle (1951). If the process is not centered, the mean can be estimated by the empirical mean. Indeed, Chung (1996b) proved that the convergence rate of the empirical mean of a GARMA process, with |ν| < 1, is the usual rate $O(T^{-1/2})$, whereas the convergence rate of the empirical mean of an ARFIMA process is $O(T^{-1/2+d})$ (see for instance Adenstedt (1974)). In the remainder of this paper, we assume that the number k of factors is known and, for the sake of simplicity, we assume that k=1.

Let X1, …, XT be an observed finite sequence generated by a Gaussian linear causal stationary invertible GG1(d,ν) process (Xt)t∈Z. The Whittle estimate of $\xi=(d,\nu,\sigma_\varepsilon^2)^t$ is obtained by minimizing the following approximation of the log-likelihood function:

$$ L_W(X,\xi) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left\{ \log f_X(\lambda,\xi) + \frac{I_T(\lambda)}{f_X(\lambda,\xi)} \right\} d\lambda, \tag{3.1} $$

where fX(λ,ξ) is the spectral density of the process (Xt)t∈Z and IT(λ) is the periodogram defined by equation (2.2). To get the Whittle estimate $\hat\xi_T = (\hat d_T, \hat\nu_T, \hat\sigma^2_{\varepsilon,T})^t$, we have to minimize LW(X,ξ). After a reparametrization (see for instance Beran (1994)), we proceed in two stages:

1) We obtain $\hat\theta_T = (\hat d_T, \hat\nu_T)^t$ by minimizing, with respect to θ, the function:

$$ U_T(\theta) = T \int_{-\pi}^{\pi} \frac{I_T(\lambda)}{f_X(\lambda,\theta)}\, d\lambda, \tag{3.2} $$

where:

$$ f_X(\lambda,\theta) = \prod_{i=1}^{k} \left| 4 \sin\!\left(\frac{\lambda-\lambda_i}{2}\right) \sin\!\left(\frac{\lambda+\lambda_i}{2}\right) \right|^{-2d_i}. \tag{3.3} $$

2) Then, we get $\hat\sigma^2_{\varepsilon,T}$ such that:

$$ \hat\sigma^2_{\varepsilon,T} = \frac{U_T(\hat\theta_T)}{T}. \tag{3.4} $$
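The link between (3.1) and the two-stage procedure can be made explicit with a short derivation. It is only a sketch, under two assumptions that are not spelled out in the text: the full spectral density factorizes as $f_X(\lambda,\xi) = (\sigma_\varepsilon^2/2\pi)\, f_X(\lambda,\theta)$ with $f_X(\lambda,\theta)$ the kernel of (3.3), and the kernel is normalized so that $\int_{-\pi}^{\pi} \log f_X(\lambda,\theta)\, d\lambda = 0$.

$$
\begin{aligned}
L_W(X,\xi) &= \log\frac{\sigma_\varepsilon^2}{2\pi}
 + \frac{1}{2\pi}\int_{-\pi}^{\pi} \log f_X(\lambda,\theta)\, d\lambda
 + \frac{1}{\sigma_\varepsilon^2}\int_{-\pi}^{\pi} \frac{I_T(\lambda)}{f_X(\lambda,\theta)}\, d\lambda \\
 &= \log\frac{\sigma_\varepsilon^2}{2\pi} + \frac{U_T(\theta)}{T\,\sigma_\varepsilon^2}, \\
\frac{\partial L_W}{\partial \sigma_\varepsilon^2} = 0
 \quad &\Longrightarrow \quad
 \hat\sigma_\varepsilon^2(\theta) = \frac{U_T(\theta)}{T}, \\
L_W\big(X, \theta, \hat\sigma_\varepsilon^2(\theta)\big) &= \log\frac{U_T(\theta)}{2\pi T} + 1 .
\end{aligned}
$$

Under these assumptions the concentrated criterion is an increasing function of $U_T(\theta)$, so minimizing $U_T$ over θ and then setting $\hat\sigma^2_{\varepsilon,T} = U_T(\hat\theta_T)/T$ is equivalent to minimizing $L_W$ over ξ.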

We now specify the range of the parameters:

Assumption 3.1 Let θ and $\hat\sigma^2_{\varepsilon,T}$ be such that $0 < \hat\sigma^2_{\varepsilon,T} < \infty$ and θ ∈ Θ = D × (]-1, cos(λG) - ε[ ∪ ]cos(λG) + ε, 1[), for some ε > 0, where λG is the G-frequency and where D = [δ, 1/2-δ], with 0 < δ < 1/4.


We denote by $\theta_0=(d_0,\nu_0)^t$ and $\sigma^2_{\varepsilon,0}$ the true values of θ and $\sigma^2_\varepsilon$ respectively. First, we show the strong consistency of $\hat\xi_T = (\hat d_T, \hat\nu_T, \hat\sigma^2_{\varepsilon,T})^t$.

Proposition 3.1 If (Xt)t∈Z is a Gegenbauer process defined by equation (1.4) with k = 1, and if θ = (d,ν)t, then, under Assumption 3.1:

$$ \lim_{T\to\infty} \hat\theta_T = \theta_0 \ \text{a.s.} \quad \text{and} \quad \lim_{T\to\infty} \hat\sigma^2_{\varepsilon,T} = \sigma^2_{\varepsilon,0} \ \text{a.s.} $$

Proof: The result is obtained in the same way as Theorem 1 in Hannan (1973). Giraitis and Leipus (1995) have already given this result in the case of a k-factor Gegenbauer process. •

We now give the limiting distribution of $\hat d_T$ and $\hat\sigma^2_{\varepsilon,T}$.

Proposition 3.2 If (Xt)t∈Z is a Gegenbauer process defined by equation (1.4) with k = 1, if θ = (d,ν)t, and if 0 < d < 1/4, then, under Assumption 3.1, as T → ∞:

$$ T^{1/2}(\hat d_T - d_0) \to N(0, \Lambda_d), \qquad T^{1/2}(\hat\sigma^2_{\varepsilon,T} - \sigma^2_{\varepsilon,0}) \to N(0, 2\sigma^4_{\varepsilon,0}). $$

Proof: To prove Proposition 3.2, we extend the proof of Yajima (1985), obtained in the case of an ARFIMA process. We refer to Ferrara (2000) for the details of the proof. •

It is worth noting that the convergence rate of this pseudo-maximum likelihood estimate ($T^{1/2}$) is faster than the convergence rate of the semiparametric estimate presented in the previous section ($m^{1/2}$, with m/T → 0). In many statistical problems in which a singularity is observed, it seems that the parameter locating the discontinuity has a faster rate of convergence than the other parameters. In the case of the Gegenbauer process, this phenomenon has been highlighted by Chung (1996a, 1996b), who notes that the convergence rate of the conditional sum of squares (CSS) estimate of the G-frequency is $O(T^{-1})$. Thus, we make the following conjecture:


Conjecture 3.1 If (Xt)t∈Z is a Gegenbauer process defined by equation (1.4) with k = 1, and if θ = (d,ν)t, then, under Assumption 3.1: $(\hat\nu_T - \nu_0) = O(T^{-1})$.

The limiting distribution of the Whittle estimate $\hat\nu_T$ is a difficult issue and, to our knowledge, it has not yet been established. This is mainly due to the unboundedness of the spectral density, which therefore does not fulfil some regularity assumptions often used in classical Central Limit Theorems.

From a practical point of view, we get the Whittle estimate by using a discrete approximation of UT, given in equation (3.2). The integral is replaced by a sum over the Fourier frequencies λj, for j = 1,...,n*, where n* is the integer part of T/2 + 1. Thus, like the continuous Whittle estimate, the discrete Whittle estimate is computed in two steps:

1) We obtain $\hat\theta_T = (\hat d_T, \hat\nu_T)^t$ by minimizing, with respect to θ, the function:

$$ Q_T(\theta) = \sum_{j=1}^{n^*} \frac{I_T(\lambda_j)}{f_X(\lambda_j,\theta)}, \tag{3.5} $$

where:

$$ f_X(\lambda_j,\theta) = \prod_{i_G=1}^{k} \left| 4 \sin\!\left(\frac{\lambda_j-\lambda_{i_G}}{2}\right) \sin\!\left(\frac{\lambda_j+\lambda_{i_G}}{2}\right) \right|^{-2d_{i_G}}, \tag{3.6} $$

where the $\lambda_{i_G}$ are the G-frequencies among the Fourier frequencies.

2) Then, we get $\hat\sigma^2_{\varepsilon,T}$ such that:

$$ \hat\sigma^2_{\varepsilon,T} = \frac{4\pi}{T}\, Q_T(\hat\theta_T). \tag{3.7} $$
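The discrete criterion (3.5) to (3.7) is straightforward to code for k = 1. In the sketch below, the minimization is done by an exhaustive search over user-supplied grids purely for brevity; this is a swapped-in stand-in for whatever numerical optimizer is actually used, and the function names and zero-based frequency indexing are assumptions. Working with 1/fX rather than fX avoids a division by zero at the G-frequency when d > 0.

```python
import numpy as np

def inverse_kernel(lam, d, nu):
    """1 / f_X(lam, theta) for the one-factor kernel (3.6);
    it equals 0 at the G-frequency when d > 0."""
    lam_g = np.arccos(nu)
    return np.abs(4.0 * np.sin((lam - lam_g) / 2.0)
                  * np.sin((lam + lam_g) / 2.0)) ** (2.0 * d)

def whittle_objective(d, nu, I_half, lam_half):
    """Q_T(theta) of (3.5)."""
    return float(np.sum(I_half * inverse_kernel(lam_half, d, nu)))

def discrete_whittle(x, d_grid, nu_grid):
    """Return (d_hat, nu_hat, sigma2_hat) minimizing Q_T over the grids."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                       # centre with the empirical mean
    T = x.size
    n_star = T // 2 + 1                    # integer part of T/2 + 1
    I_half = (np.abs(np.fft.fft(x)) ** 2 / (2.0 * np.pi * T))[:n_star]
    lam_half = 2.0 * np.pi * np.arange(n_star) / T
    q_best, d_hat, nu_hat = min(
        (whittle_objective(d, nu, I_half, lam_half), d, nu)
        for d in d_grid for nu in nu_grid
    )
    sigma2_hat = 4.0 * np.pi * q_best / T  # equation (3.7)
    return d_hat, nu_hat, sigma2_hat
```

For instance, `discrete_whittle(x, np.linspace(0.01, 0.49, 49), np.cos(np.linspace(0.05, 0.15, 21)))` scans d and a small band of G-frequencies. With d = 0 the criterion reduces to the sum of periodogram ordinates, and (3.7) then returns approximately the sample variance.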

In practice, parameter estimation in statistical long memory models whose spectral density has singularities is done in two steps (Gray et al. (1989), Chung (1996a, 1996b) and Woodward et al. (1998)). The first step consists of a grid-search procedure to estimate the frequencies at which the spectral density is unbounded (parameter ν), and in the second step, the memory parameter d is estimated by using a classical parametric method. In the case of a Gegenbauer process, Yajima (1996) proposes to estimate first the frequency at which the spectral density is unbounded by maximizing the periodogram. This method can be generalized to the case of a spectral density with several singularities (see Ferrara (2000)). Other authors consider a simultaneous global estimation of all the parameters (Giraitis and Leipus (1995), Ferrara (2000)). The two-step method of


Yajima (1996) and the simultaneous method have the great advantage of avoiding a grid-search procedure, which is very time consuming. Moreover, Monte Carlo simulations have shown that the two-step procedure of Chung (1996a) and the simultaneous procedure provide nearly identical results (see Ferrara (2000)).

4. APPLICATION

In this section, we provide an application of the long memory Gegenbauer process, which points out the interest of such a model for analysing time series with long-range dependence. This application is motivated by the persistent sinusoidal decay observed in the ACF of the error correction term in cointegration analyses carried out on financial data. We refer especially to the paper of Barkoulas et al. (1997), dealing with interest rates of industrialized countries, and also to the paper of Lien and Tse (1999), dealing with the Nikkei spot index data. In both applications, fractional cointegration is considered by using a long memory ARFIMA process to model the long-range dependence of the error correction term, without taking into account the persistent cyclical behaviour. We propose here to reconsider the analysis of the error correction term by using a Gegenbauer process.

Our data set stems from the one used in the paper of Lien and Tse (1999) and consists of T=1064 daily observations of the spot index and futures prices of the Nikkei Stock Average 225 (NSA), covering the period from May 1992 through August 1996. Daily closing values of the spot index and the settlement price of the futures contracts are used. The regular futures contracts mature in March, June, September and December.

Figure 1: Error correction term (Zt)t from May 1992 through August 1996.


The contracts expire on the third Wednesday of the contract month. For further details on the futures prices series, we refer to the paper of Lien and Tse (1999). Let (St)t denote the logarithm of the spot prices and (Ft)t the logarithm of the futures prices. Lien and Tse (1999) assume that (St)t and (Ft)t are both integrated of order one and they model the relationship between (St)t and (Ft)t with the following error correction model (ECM) proposed by Engle and Granger (1987):

$$ \Delta S_t = \phi_0 + \sum_{i=1}^{p} \phi_i\, \Delta S_{t-i} + \sum_{j=1}^{q} \psi_j\, \Delta F_{t-j} + \gamma Z_{t-1} + \varepsilon_{St}, \tag{4.1} $$

where (Zt)t may be approximated by the basis defined as Zt = Ft - St, for t=1,...,T. Note that the error correction term (Zt)t is simply the difference between the log futures and log spot prices. Our aim in this section is to model the error correction term (Zt)t presented in Figure 1. To this end, we consider two different modelling approaches and we compare several long memory models and estimation methods according to their goodness of fit. To assess the goodness of fit, we use the following three information criteria:

$$ \mathrm{AICC} = T \log(\hat\sigma^2_{\varepsilon,T}) + \frac{2T(p+q+\delta)}{T-1-(p+q+\delta)}, $$

$$ \mathrm{BIC} = T \log(\hat\sigma^2_{\varepsilon,T}) + (p+q+\delta)\log(T), $$

$$ \mathrm{HIC} = T \log(\hat\sigma^2_{\varepsilon,T}) + 2(p+q+\delta)\, c \log(\log(T)), $$

where c=1.0001, where p and q are respectively the degrees of the AR and MA parts and where δ is the number of long memory parameters to estimate. Thus, δ = k if the G-frequencies are known and δ = 2k if the G-frequencies have to be estimated. Note that we propose these generalizations of information criteria by extending the definitions given in the long memory domain (see Gray et al. (1989) or Bisaglia (1998)).
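The three criteria are simple to compute once the innovation variance has been estimated. The sketch below is illustrative only: the function name is hypothetical, and the AICC penalty is written as 2T(p+q+δ)/(T-1-(p+q+δ)), which is a reading of a formula that is garbled in the available text and should be treated as an assumption.

```python
import math

def information_criteria(sigma2_hat, T, p, q, delta, c=1.0001):
    """AICC, BIC and HIC for a model with p AR terms, q MA terms and
    delta long memory parameters (delta = k or 2k, see the text)."""
    n_par = p + q + delta
    fit = T * math.log(sigma2_hat)
    return {
        # AICC penalty: assumed reconstruction of the garbled formula
        "AICC": fit + 2.0 * T * n_par / (T - 1 - n_par),
        "BIC": fit + n_par * math.log(T),
        "HIC": fit + 2.0 * n_par * c * math.log(math.log(T)),
    }
```

For the data set used here (T = 1064), an ARFIMA(1,d,0) model would be scored with `p=1, q=0, delta=1`, and a one-factor Gegenbauer model with estimated G-frequency with `p=0, q=0, delta=2`. Lower values indicate a better trade-off between fit and parsimony.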

Figure 2: ACF empirical estimation of the error correction term (Zt)t.


In the first approach, we model the long-range dependence of (Zt)t with a classical ARFIMA model (model (M1)). This case is well known in the statistical literature and is referred to as "fractional cointegration" (Granger (1986)). In the second approach, we model both the persistence and the cyclical patterns in the ACF of (Zt)t (see Figures 2 and 3) with a Gegenbauer process. Parameter estimation is done by using either the simultaneous PML method introduced in the previous sections (model (M2)) or a "two-step" PML method (model (M3) and model (M4)). Moreover, SP estimations of the long memory parameter are carried out by using diverse spectral density estimates.

4.1 ARFIMA modelling

In order to take into account the long-range dependence of the error correction term (Zt)t, we first consider an ARFIMA(1,d,0) model, as in the paper of Lien and Tse (1999). Parameter estimation is done by using the classical Whittle method. We obtain the following results:

$$ (I + 0.0705B)(I - B)^{0.2756}(Z_t - m_Z) = \varepsilon_t, \tag{M1} $$

where mZ is the empirical mean of the time series (Zt)t, equal to $3.261 \times 10^{-3}$. When considering the ACF of the error correction term (Zt)t (Figure 2), there is evidence of a persistent cyclical pattern, which cannot be captured by an ARFIMA model. Moreover, the spectral density of (Zt)t (Figure 3) clearly possesses a peak at a frequency located very close, but not equal, to zero. Thus, both the ACF and the spectral density suggest fitting a Gegenbauer process to the series (Zt)t.

Figure 3: Spectral density empirical estimation of the error correction term (Zt)t.


4.2 Gegenbauer modelling

In a first step, we estimate the degree of persistence by using the semiparametric log-periodogram estimation method. The G-frequency estimate is taken as the frequency at which the periodogram is maximum. Thus, $\hat\lambda_G^M = 0.10039$, i.e. $\cos(\hat\lambda_G^M) = 0.9950$. Long memory parameter estimation is carried out by using the log-periodogram method introduced previously and by considering diverse spectral density estimates based on the periodogram. We use nine estimation techniques, obtained by combining the raw periodogram or one of the two smoothing methods with no taper or one of the two tapering methods presented in Section 2. The bandwidth m is chosen such that $m = T^{0.6}$, and the trimming number l is set to zero, according to the results on finite sample properties (see for instance Ferrara (2000)). The results are contained in Table 1.
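The first step, locating the G-frequency at the periodogram maximum, can be sketched as below. The function name and the zero-based frequency indexing λj = 2πj/T are assumptions of this illustration, and frequency zero is skipped so that the mean of the series does not dominate. As a side remark, with T = 1064 the Fourier frequency of index 17 is 2π × 17 / 1064 ≈ 0.10039, which appears to coincide with the estimate reported above.

```python
import numpy as np

def estimate_g_frequency(x):
    """Return (j_hat, lambda_hat, nu_hat), where lambda_hat is the Fourier
    frequency in ]0, pi] maximizing the raw periodogram and nu_hat = cos(lambda_hat)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    T = x.size
    I = np.abs(np.fft.fft(x)) ** 2 / (2.0 * np.pi * T)
    n_star = T // 2 + 1
    j_hat = int(np.argmax(I[1:n_star])) + 1   # skip frequency zero
    lam_hat = 2.0 * np.pi * j_hat / T
    return j_hat, lam_hat, float(np.cos(lam_hat))
```

The implied cycle length is 2π / λ̂ observations, roughly 63 for λ̂ ≈ 0.10. The estimated index `j_hat` can be passed directly as the G-frequency index of a log-periodogram regression.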

Table 1: Semiparametric log-periodogram estimation of the memory parameter d.

Estimation method                     $\hat{d}_R(l, m)$
------------------------------------------------------
Raw                                   0.15608
Raw + Cosine Taper                    0.12754
Raw + Trapez. Taper                   0.12161
Smooth. Rect.                         0.14864
Smooth. Triang.                       0.16659
Smooth. Rect. + Cosine Taper          0.15230
Smooth. Rect. + Trapez. Taper         0.15442
Smooth. Triang. + Cosine Taper        0.16685
Smooth. Triang. + Trapez. Taper       0.16906

We note that the results are quite similar, except for the raw periodogram combined with a taper, for which the estimate is lower. We also note that the two tapering techniques do not lead to a significant difference in the estimate. However, smoothing with the triangular window gives a greater estimate than smoothing with the rectangular window. It is also clear that smoothing tends to increase the estimate. From our practical experience, it seems that smoothing greatly improves the quality of the log-periodogram estimate if the sample size is small (fewer than 100 data points) and generally improves the robustness of the method.

In a second step, we fit a Gegenbauer process to the series (Zt)t and estimate its parameters by using the PML method. We compare three different methods of parameter estimation. First, parameter estimation is done using the simultaneous Whittle method. The choice of initial values in the estimation algorithm is crucial, and we take as initial values $d^{(0)} = 0.135$ and $\lambda_G^{(0)} = 0.10$. The value of 0.10 is chosen because the length of the period in Figure 4 seems to be around 60, which corresponds to a Gegenbauer frequency of 2π/60 ≈ 0.10. Regarding the value of $d^{(0)}$, as the G-frequency tends to zero, the Gegenbauer process GG(d,ν) tends to an ARFIMA(0,2d,0) process, and therefore we take $d^{(0)}$ approximately equal to half of the estimated long memory parameter in model (M1). Thus, we obtain the following model for the series (Zt)t:

$$(1 - (2 \times 0.99491)B + B^2)^{0.12739}(Z_t - m_Z) = \varepsilon_t, \qquad \text{(M2)}$$

with $\hat{\lambda}_G = 0.10096$.

Finally, following the idea of Chung (1996a), parameter estimation is done using a "two-step" method. Two different variants are considered. First, we take as G-frequency estimate the frequency at which the periodogram is maximum ($\hat{\lambda}_G = 0.10039$), and we estimate the long memory parameter d by using the classical Whittle method. Thus, we get the following model:

$$(1 - (2 \times 0.99497)B + B^2)^{0.10996}(Z_t - m_Z) = \varepsilon_t. \qquad \text{(M3)}$$
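To make the link between the fitted filters and the G-frequency concrete, the sketch below expands (1 - 2νB + B²)^(-d) into its moving-average weights, which are the Gegenbauer polynomials evaluated at ν, and recovers the G-frequency as arccos(ν). The three-term recursion is the standard one for Gegenbauer polynomials; the function names are hypothetical and the code is an illustration, not the implementation behind models (M2) and (M3).

```python
import math


def gegenbauer_ma_weights(d, nu, n_terms):
    """Coefficients C_j of (1 - 2*nu*B + B^2)^(-d) = sum_j C_j B^j."""
    weights = [1.0]
    if n_terms > 1:
        weights.append(2.0 * d * nu)
    for j in range(2, n_terms):
        # Gegenbauer recursion:
        # j*C_j = 2*nu*(j + d - 1)*C_{j-1} - (j + 2*d - 2)*C_{j-2}
        c = (2.0 * nu * (j + d - 1.0) * weights[-1]
             - (j + 2.0 * d - 2.0) * weights[-2]) / j
        weights.append(c)
    return weights[:n_terms]


def g_frequency(nu):
    """G-frequency lambda_G in [0, pi] such that cos(lambda_G) = nu."""
    return math.acos(nu)
```

For instance, `g_frequency(0.99497)` returns a value close to 0.1003, in agreement with the periodogram-based estimate 0.10039 up to the rounding of ν to five decimals.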

Secondly, we consider a grid-search procedure: for each λG belonging to the interval [0.0401, 0.1400], we estimate the long memory parameter d by using the classical Whittle method. The pair of parameters for which the residual variance $\hat{\sigma}^2_{\varepsilon,T}$ is minimum is retained. Thus, we obtain the following model for the series (Zt)t:

$$(1 - (2 \times 0.99496)B + B^2)^{0.11480}(Z_t - m_Z) = \varepsilon_t, \qquad \text{(M4)}$$

with $\hat{\lambda}_G = 0.1004$. It is worth noting that the G-frequency estimate obtained by the grid-search procedure is very close to the frequency at which the periodogram is maximum.

The information criteria of each long memory model are presented in Table 2. From Table 2, it is clear that the Gegenbauer processes (models (M2), (M3) and (M4)) provide a better fit than the ARFIMA process (model (M1)). We point out here the flexibility of the Gegenbauer process, which allows the spectral density to be unbounded anywhere on the interval [0,π]. Thus, even when the G-frequency is very close to zero, a Gegenbauer process seems to fit better than an ARFIMA process.
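The grid-search procedure used for model (M4) can be sketched as follows, under several assumptions that are ours and not taken from the estimation code: the spectral density is that of a pure Gegenbauer process, proportional to |2(cos λ - cos λG)|^(-2d); for each grid value of λG the parameter d minimizes the concentrated Whittle objective, which differs from the logarithm of the residual variance only by a term that vanishes asymptotically; and a golden-section search stands in for whatever optimizer is actually used. All names are hypothetical.

```python
import math


def gegenbauer_shape(lam, d, lam_g):
    """Spectral shape |2*(cos(lam) - cos(lam_g))|^(-2d), up to sigma^2/(2*pi)."""
    return abs(2.0 * (math.cos(lam) - math.cos(lam_g))) ** (-2.0 * d)


def whittle_objective(d, lam_g, freqs, pgram):
    """Concentrated Whittle objective: log(mean(I/g)) + mean(log g)."""
    g = [gegenbauer_shape(lam, d, lam_g) for lam in freqs]
    ratio = sum(i / gj for i, gj in zip(pgram, g)) / len(g)
    return math.log(ratio) + sum(math.log(gj) for gj in g) / len(g)


def golden_section_min(f, lo, hi, tol=1e-6):
    """Minimizer of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, e = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fe = f(c), f(e)
    while b - a > tol:
        if fc < fe:
            # Minimum lies in [a, e]: shrink from the right
            b, e, fe = e, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]: shrink from the left
            a, c, fc = c, e, fe
            e = a + inv_phi * (b - a)
            fe = f(e)
    return (a + b) / 2.0


def grid_search_whittle(freqs, pgram, lam_grid, d_bounds=(0.001, 0.499)):
    """Return (lam_g, d) minimizing the objective over the grid of G-frequencies."""
    best = None
    for lam_g in lam_grid:
        d_hat = golden_section_min(
            lambda d: whittle_objective(d, lam_g, freqs, pgram), *d_bounds)
        value = whittle_objective(d_hat, lam_g, freqs, pgram)
        if best is None or value < best[0]:
            best = (value, lam_g, d_hat)
    return best[1], best[2]
```

Here `freqs` and `pgram` stand for the Fourier frequencies and the periodogram ordinates of the series. The cost grows linearly with the number of grid points: covering [0.0401, 0.1400] with a step of 0.0001 (a step we assume from the interval endpoints) would require 1000 one-dimensional optimizations.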


Table 2: Goodness of fit results on the error correction term (Zt)t.

Model                                     $\hat{\sigma}^2_{\varepsilon,T}$   AICC     BIC      HIC
------------------------------------------------------------------------------------------------
ARFIMA (M1)                               1.751 × 10^-5                      -11650   -11640   -11646
Gegenbauer (M2), 1-step estimation        0.908 × 10^-5                      -12349   -12339   -12345
Gegenbauer (M3), 2-step estimation (1)    0.903 × 10^-5                      -12356   -12344   -12351
Gegenbauer (M4), 2-step estimation (2)    0.905 × 10^-5                      -12352   -12342   -12348

Regarding the comparative performance of the parameter estimation methods for Gegenbauer processes, there is no strong difference in the values of the criteria. However, it turns out that the "two-step" procedure with the G-frequency estimated by the value maximizing the periodogram (model (M3)) provides a better goodness of fit than the two other methods. In practice, the major drawback of the grid-search procedure (model (M4)) is its CPU time requirement, which increases with the precision of the estimate. Therefore, from a practical point of view, we advise the successive use of the simultaneous Whittle procedure and the "two-step" procedure with the G-frequency estimated by the value maximizing the periodogram. However, when using the simultaneous Whittle method, it is worth noting that a careful choice of initial values in the estimation algorithm is needed.

If we compare the estimates of the long memory parameter provided by the SP method with those provided by the PML method, the latter gives lower values. However, the PML estimates are very close to the SP estimates based on the tapered periodogram without smoothing. It turns out that tapering can be a very helpful statistical technique for this estimation issue. As the convergence rate of the PML memory parameter estimate is faster than that of the SP estimate, we believe that the PML estimates are more reliable, especially in the case of small sample sizes. Moreover, in the case of small sample sizes it has been shown empirically (see for instance Ferrara (2000)) that the simultaneous use of smoothing and tapering clearly improves the quality of the estimate.


5. CONCLUSION

We show, through an application to financial data, how the goodness of fit can be improved by taking into account in modelling the persistent periodic cyclical behaviour of a time series, through generalized long memory processes. We point out the interest of such processes in comparison with the classical long memory ARFIMA process, even when the G-frequency is close to zero, which often happens in practice. The distinctive feature of generalized long memory processes is that they possess a spectral density with k singularities lying on the interval [0,π], referred to as the G-frequencies. Therefore, the location of these G-frequencies is a parameter to estimate, except in the case of seasonal processes, for which the location of the G-frequencies is known.

In this paper, we are especially interested in the issue of long memory parameter estimation. We compare two types of estimation methods: a semiparametric method based on the log-periodogram and a pseudo-maximum likelihood method based on the Whittle likelihood. The semiparametric method first requires the estimation of the G-frequencies, and we use as estimates the frequencies at which the periodogram is maximum. Moreover, we give the asymptotic behaviour of each estimate. The semiparametric estimate has the great advantage of being easily computed, but nevertheless has a slow convergence rate. Thus, this estimate must be used with care in the case of small sample sizes, although statistical techniques such as smoothing and tapering can greatly improve its performance in such cases. Moreover, as in the case of ARFIMA processes, the presence of a short memory component may damage the quality of the estimate. The pseudo-maximum likelihood estimate seems to be generally more reliable, under the assumption that the process is well specified.

As future research, the development of nonparametric estimates, such as variance-type estimates (see Giraitis et al. (1998)), and of other semiparametric estimates, especially those based on summing the periodogram across the discrete Fourier frequencies (see Robinson (1994a)), is of potential interest. Moreover, the estimates presented in this paper should be reconsidered in the case of cyclical long memory processes with heteroscedasticity (see Guégan (1999, 2000)).

Acknowledgements: The authors wish to thank Yiu Kuen Tse for providing the data and Allan Timmermann for helpful comments.


REFERENCES

Adenstedt R. K. (1974), "On large-sample estimation for the mean of a stationary random sequence", Annals of Statistics, 6, 1095-1107.
Arteche J. and P. M. Robinson (2000), "Semiparametric inference in seasonal and cyclical long memory processes", Journal of Time Series Analysis, 21, 1, 1-25.
Baillie R. T., T. Bollerslev and H.-O. Mikkelsen (1996), "Fractionally integrated generalized autoregressive conditional heteroskedasticity", Journal of Econometrics, 74, 3-30.
Baillie R. T., C.-F. Chung and M. A. Tieslau (1996), "Analysing inflation by the fractionally integrated ARFIMA-GARCH model", Journal of Applied Econometrics, 11, 23-40.
Barkoulas J., C. F. Baum and G. S. Oguz (1997), "Fractional cointegration analysis of long term interest rates", International Journal of Finance, 9, 2, 586-606.
Beran J. (1994), Statistics for Long-Memory Processes, Chapman and Hall, London.
Bisaglia L. (1998), Processi a memoria lunga: problemi di stima, identificazione e previsione, Dottorato di Ricerca in Statistica, Ciclo X, Università degli Studi di Padova.
Box G. E. P. and G. M. Jenkins (1976), Time Series Analysis: Forecasting and Control, 2nd ed., Holden-Day, San Francisco.
Chung C.-F. (1996a), "Estimating a generalized long memory process", Journal of Econometrics, 73, 237-59.
Chung C.-F. (1996b), "A generalized fractionally integrated ARMA process", Journal of Time Series Analysis, 17, 2, 111-40.
Engle R. F. and C. W. J. Granger (1987), "Co-integration and error correction: Representation, estimation and testing", Econometrica, 55, 251-276.
Ferrara L. (2000), Processus Longue Mémoire Généralisés : Estimation, Prévision et Applications, Thèse de Doctorat, Université Paris 13 - RATP.
Ferrara L. and D. Guégan (2000a), "Forecasting financial time series with generalized long memory processes", chapter 14, 319-42, C. L. Dunis [ed.], Advances in Quantitative Asset Management, Kluwer Academic Publishers.
Ferrara L. and D. Guégan (2000b), "Forecasting with k-factor Gegenbauer processes", to appear in Journal of Forecasting.
Geweke J. and S. Porter-Hudak (1983), "The estimation and application of long-memory time series models", Journal of Time Series Analysis, 4, 221-38.
Giraitis L. and R. Leipus (1995), "A generalized fractionally differencing approach in long memory modeling", Lithuanian Mathematical Journal, 35, 65-81.
Giraitis L., P. M. Robinson and D. Surgailis (1998), "Variance-type estimation of long-memory", Preprint, London School of Economics.
Granger C. W. J. (1986), "Developments in the study of cointegrated economic variables", Oxford Bulletin of Economics and Statistics, 48, 213-28.
Granger C. W. J. and R. Joyeux (1980), "An introduction to long-memory time series models and fractional differencing", Journal of Time Series Analysis, 1, 15-29.
Granger C. W. J. and Z. Ding (1996), "Varieties of long memory models", Journal of Econometrics, 73, 61-77.
Gray H. L., N.-F. Zhang and W. A. Woodward (1989), "On generalized fractional processes", Journal of Time Series Analysis, 10, 233-57.
Gray H. L., N.-F. Zhang and W. A. Woodward (1994), Correction to "On generalized fractional processes", Journal of Time Series Analysis, 15, 561-62.
Guégan D. (1999), "Note on long memory processes with cyclical behavior and heteroscedasticity", Working paper, University of Reims, France, 99-08, 1-21.
Guégan D. (2000), "A new model: The k-factor GIGARCH process", Journal of Signal Processing, 4, 3, 265-71.
Hannan E. J. (1973), "The asymptotic theory of linear time-series models", Journal of Applied Probability, 10, 130-45.
Hassler U. (1994), "(Mis)specification of long memory in seasonal time series", Journal of Time Series Analysis, 15, 1, 19-30.
Hosking J. R. M. (1981), "Fractional differencing", Biometrika, 68, 1, 165-76.
Hosoya Y. (1997), "A limit theory of long-range dependence and statistical inference in related model", Annals of Statistics, 25, 105-37.
Hurvich C. M., R. Deo and J. Brodsky (1998), "The mean-squared error of Geweke and Porter-Hudak's estimates of the memory parameter of a long memory time series", Journal of Time Series Analysis, 19, 1, 19-46.
Hurvich C. M. and R. Deo (1999), "Plug-in selection of the number of frequencies in regression estimates of the memory parameter of a long memory time series", Journal of Time Series Analysis, 20, 3, 331-41.
Lien D. and Y. K. Tse (1999), "Forecasting the Nikkei spot index with fractional cointegration", Journal of Forecasting, 18, 259-73.
Porter-Hudak S. (1990), "An application of the seasonal fractionally differenced model to the monetary aggregates", Journal of the American Statistical Association, 85, 410, 338-44.
Priestley M. B. (1981), Spectral Analysis of Time Series, Academic Press, New York.
Rainville E. D. (1960), Special Functions, Macmillan, New York.
Ray B. K. (1993), "Long-range forecasting of IBM product revenues using a seasonal fractionally differenced ARMA model", International Journal of Forecasting, 9, 255-69.
Robinson P. M. (1994a), "Semiparametric analysis of long memory time series", Annals of Statistics, 22, 515-39.
Robinson P. M. (1994b), "Efficient tests of non stationary hypotheses", Journal of the American Statistical Association, 89, 1420-37.
Robinson P. M. (1995), "Log-periodogram regression of time series with long range dependence", Annals of Statistics, 23, 1048-72.
Sutcliffe A. (1994), "Time-series forecasting using fractional differencing", Journal of Forecasting, 13, 383-93.
Whittle P. (1951), Hypothesis Testing in Time Series Analysis, Hafner, New York.
Willinger W., M. S. Taqqu and V. Teverovsky (1999), "Stock market prices and long range dependence", Finance and Stochastics, 3, 1-13.
Woodward W. A., Q. C. Cheng and H. L. Gray (1998), "A k-factor GARMA long-memory model", Journal of Time Series Analysis, 19, 5, 485-504.
Yajima Y. (1985), "On estimation of long-memory time series models", Australian Journal of Statistics, 27, 3, 303-20.
Yajima Y. (1996), "Estimation of the frequency of unbounded spectral densities", Discussion Paper, Faculty of Economics, University of Tokyo.
