VAR Models and Applications
Laurent Ferrara, University of Paris West
M2 EIPMC, 2015
Overview of the presentation
1. Vector Auto-Regressions
   - Definition
   - Estimation
   - Tests
2. Impulse response functions (IRF)
   - Concept
   - Generalized IRF
3. Applications
Vector Auto-Regressions: Short introduction
I
The VAR models are widely used in economic analysis.
I
While simple and easy to estimate, they make it possible to conveniently capture the dynamics of multivariate systems.
I
VAR popularity is mainly due to Sims (1980) influential work.
Vector Auto-Regressions: Notations
- Let $y_t$ denote an $(n \times 1)$ vector of random variables. $y_t$ follows a $p$-th order Gaussian VAR if, for all $t$,
  $$y_t = c + \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + \varepsilon_t,$$
  where $\varepsilon_t \sim N(0, \Omega)$.
- Consequently,
  $$y_t \mid y_{t-1}, y_{t-2}, \dots, y_{-p+1} \sim N(c + \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p}, \Omega).$$
Vector Auto-Regressions: Example, n = 2
VAR(1) for $y_t = (y_{1,t}, y_{2,t})'$:
$$y_{1,t} = c_1 + \phi_{11} y_{1,t-1} + \phi_{12} y_{2,t-1} + \varepsilon_{1,t}$$
$$y_{2,t} = c_2 + \phi_{21} y_{1,t-1} + \phi_{22} y_{2,t-1} + \varepsilon_{2,t},$$
where $\varepsilon_{1,t} \sim GWN(\sigma^2_{\varepsilon_1})$, $\varepsilon_{2,t} \sim GWN(\sigma^2_{\varepsilon_2})$ and $\rho(\varepsilon_{1,t}, \varepsilon_{2,t}) = 0$. $\phi_{11}$ and $\phi_{22}$ are the own autoregressive coefficients; $\phi_{21}$ and $\phi_{12}$ are the cross-variable (feedback) coefficients.
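As a quick illustration, here is a minimal Python sketch that simulates this bivariate VAR(1). The coefficient values are made up for the example, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

c = np.array([0.1, 0.2])             # intercepts c_1, c_2 (illustrative values)
Phi1 = np.array([[0.5, 0.1],         # phi_11, phi_12
                 [0.2, 0.4]])        # phi_21, phi_22
sigma = np.array([1.0, 0.5])         # std deviations of the two Gaussian white noises

T = 500
y = np.zeros((T, 2))
for t in range(1, T):
    eps = rng.normal(0.0, sigma)     # uncorrelated shocks: rho(eps_1, eps_2) = 0
    y[t] = c + Phi1 @ y[t - 1] + eps
```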
Vector Auto-Regressions: MLE
- Denoting with $\Pi$ the matrix $[c \; \Phi_1 \; \Phi_2 \dots \Phi_p]'$ and with $x_t$ the vector $(1, y_{t-1}', y_{t-2}', \dots, y_{t-p}')'$, the log-likelihood is given by
  $$L(Y_T; \theta) = -(Tn/2) \log(2\pi) + (T/2) \log |\Omega^{-1}| - \frac{1}{2} \sum_{t=1}^{T} (y_t - \Pi' x_t)' \Omega^{-1} (y_t - \Pi' x_t).$$
- The MLE of $\Pi$, denoted with $\hat{\Pi}$, is given by
  $$\hat{\Pi}' = \left[ \sum_{t=1}^{T} y_t x_t' \right] \left[ \sum_{t=1}^{T} x_t x_t' \right]^{-1}. \tag{1}$$
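Equation (1) amounts to equation-by-equation OLS. A minimal NumPy sketch, reusing the simulated series y from the previous block (the lag order p = 1 is an assumption of the example):

```python
import numpy as np

p = 1
T, n = y.shape
X = np.column_stack([np.ones(T - p), y[:-1]])   # x_t = (1, y_{t-1}')', stacked row-wise
Y = y[p:]                                       # left-hand-side observations
# Equation (1): Pi_hat' = (sum_t y_t x_t') (sum_t x_t x_t')^{-1},
# computed here as the solution of the normal equations (X'X) Pi_hat = X'Y
Pi_hat = np.linalg.solve(X.T @ X, X.T @ Y)      # (1 + n*p) x n; column j = equation j
```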
Vector Auto-Regressions: MLE
Proof of equation (1). Let's rewrite the last term of the log-likelihood:
$$\sum_{t=1}^{T} (y_t - \Pi' x_t)' \Omega^{-1} (y_t - \Pi' x_t)
= \sum_{t=1}^{T} (y_t - \hat{\Pi}' x_t + \hat{\Pi}' x_t - \Pi' x_t)' \Omega^{-1} (y_t - \hat{\Pi}' x_t + \hat{\Pi}' x_t - \Pi' x_t)$$
$$= \sum_{t=1}^{T} \left[ \hat{\varepsilon}_t + (\hat{\Pi} - \Pi)' x_t \right]' \Omega^{-1} \left[ \hat{\varepsilon}_t + (\hat{\Pi} - \Pi)' x_t \right],$$
where the $j$-th element of the $(n \times 1)$ vector $\hat{\varepsilon}_t = y_t - \hat{\Pi}' x_t$ is the sample residual for observation $t$ from an OLS regression of $y_{j,t}$ on $x_t$.
Vector Auto-Regressions: MLE
Expanding the square,
$$\sum_{t=1}^{T} (y_t - \Pi' x_t)' \Omega^{-1} (y_t - \Pi' x_t)
= \sum_{t=1}^{T} \hat{\varepsilon}_t' \Omega^{-1} \hat{\varepsilon}_t
+ 2 \sum_{t=1}^{T} \hat{\varepsilon}_t' \Omega^{-1} (\hat{\Pi} - \Pi)' x_t
+ \sum_{t=1}^{T} x_t' (\hat{\Pi} - \Pi) \Omega^{-1} (\hat{\Pi} - \Pi)' x_t.$$
Vector Auto-Regressions: MLE
Let's apply the trace operator to the second term (which is a scalar):
$$\sum_{t=1}^{T} \hat{\varepsilon}_t' \Omega^{-1} (\hat{\Pi} - \Pi)' x_t
= \operatorname{trace}\left( \sum_{t=1}^{T} \hat{\varepsilon}_t' \Omega^{-1} (\hat{\Pi} - \Pi)' x_t \right)
= \operatorname{trace}\left( \sum_{t=1}^{T} \Omega^{-1} (\hat{\Pi} - \Pi)' x_t \hat{\varepsilon}_t' \right)
= \operatorname{trace}\left( \Omega^{-1} (\hat{\Pi} - \Pi)' \sum_{t=1}^{T} x_t \hat{\varepsilon}_t' \right).$$
Vector Auto-Regressions: MLE
Given that, by construction, the sample residuals are orthogonal to the explanatory variables, this term is equal to zero. If $\tilde{x}_t = (\hat{\Pi} - \Pi)' x_t$, we have
$$\sum_{t=1}^{T} (y_t - \Pi' x_t)' \Omega^{-1} (y_t - \Pi' x_t)
= \sum_{t=1}^{T} \hat{\varepsilon}_t' \Omega^{-1} \hat{\varepsilon}_t + \sum_{t=1}^{T} \tilde{x}_t' \Omega^{-1} \tilde{x}_t.$$
Since $\Omega$ is a positive definite matrix, $\Omega^{-1}$ is as well. Consequently, the smallest value that the last term can take is obtained when $\tilde{x}_t = 0$, i.e. when $\Pi = \hat{\Pi}$.
Vector Auto-Regressions: MLE
- Assume that we have computed $\hat{\Pi}$; the MLE of $\Omega$ is the matrix $\hat{\Omega}$ that maximizes $\Omega \mapsto L(Y_T; \hat{\Pi}, \Omega)$.
- Denoting with $\hat{\varepsilon}_t$ the estimated residual $y_t - \hat{\Pi}' x_t$, we have
  $$L(Y_T; \hat{\Pi}, \Omega) = -(Tn/2) \log(2\pi) + (T/2) \log |\Omega^{-1}| - \frac{1}{2} \sum_{t=1}^{T} \hat{\varepsilon}_t' \Omega^{-1} \hat{\varepsilon}_t.$$
- $\hat{\Omega}$ must be a symmetric positive definite matrix. Fortunately, it turns out that the unrestricted matrix that maximizes the latter expression is symmetric positive definite. Indeed, differentiating with respect to $\Omega^{-1}$,
  $$\frac{\partial \ell(\Omega)}{\partial \Omega^{-1}} = \frac{T}{2} \Omega - \frac{1}{2} \sum_{t=1}^{T} \hat{\varepsilon}_t \hat{\varepsilon}_t' = 0 \implies \hat{\Omega} = \frac{1}{T} \sum_{t=1}^{T} \hat{\varepsilon}_t \hat{\varepsilon}_t'.$$
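In code, the estimate of $\Omega$ is just the sample covariance of the OLS residuals. A two-line sketch reusing X, Y and Pi_hat from the earlier block:

```python
# Residuals and the MLE of Omega: Omega_hat = (1/T) sum_t eps_hat_t eps_hat_t'
E = Y - X @ Pi_hat                 # eps_hat_t = y_t - Pi_hat' x_t, stacked row-wise
Omega_hat = E.T @ E / E.shape[0]   # symmetric positive definite by construction
```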
Vector Auto-Regressions: Likelihood-Ratio test
- The simplicity of the VAR framework and the tractability of its MLE make various econometric tests convenient. We illustrate this here with the likelihood-ratio test.
- The maximum value achieved by the MLE is
  $$L(Y_T; \hat{\Pi}, \hat{\Omega}) = -(Tn/2) \log(2\pi) + (T/2) \log |\hat{\Omega}^{-1}| - \frac{1}{2} \sum_{t=1}^{T} \hat{\varepsilon}_t' \hat{\Omega}^{-1} \hat{\varepsilon}_t.$$
Vector Auto-Regressions: Likelihood-Ratio test
- The last term is
  $$\sum_{t=1}^{T} \hat{\varepsilon}_t' \hat{\Omega}^{-1} \hat{\varepsilon}_t
  = \operatorname{trace}\left[ \sum_{t=1}^{T} \hat{\varepsilon}_t' \hat{\Omega}^{-1} \hat{\varepsilon}_t \right]
  = \operatorname{trace}\left[ \sum_{t=1}^{T} \hat{\Omega}^{-1} \hat{\varepsilon}_t \hat{\varepsilon}_t' \right]
  = \operatorname{trace}\left[ \hat{\Omega}^{-1} \sum_{t=1}^{T} \hat{\varepsilon}_t \hat{\varepsilon}_t' \right]
  = \operatorname{trace}\left[ \hat{\Omega}^{-1} \, T \hat{\Omega} \right] = Tn.$$
- Therefore
  $$L(Y_T; \hat{\Pi}, \hat{\Omega}) = -(Tn/2) \log(2\pi) + (T/2) \log |\hat{\Omega}^{-1}| - Tn/2,$$
  which is easy to calculate.
Vector Auto-Regressions: Likelihood-Ratio test
- For instance, assume that we want to test the null hypothesis that a set of variables follows a VAR($p_0$) against the alternative specification of $p_1$ lags (with $p_1 > p_0$). Let us respectively denote with $\hat{L}_0$ and $\hat{L}_1$ the maximum log-likelihoods obtained with $p_0$ and $p_1$ lags.
- Under the null hypothesis, we have
  $$2(\hat{L}_1 - \hat{L}_0) = T \left( \log |\hat{\Omega}_1^{-1}| - \log |\hat{\Omega}_0^{-1}| \right),$$
  which asymptotically has a $\chi^2$ distribution with degrees of freedom equal to the number of restrictions imposed under $H_0$ (compared with $H_1$), i.e. $n^2 (p_1 - p_0)$.
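A sketch of this test, under the assumption (standard, though not spelled out in the slides) that both models are fitted on the common sample implied by the larger lag order:

```python
import numpy as np
from scipy.stats import chi2

def lr_lag_test(y, p0, p1):
    """LR test of a VAR(p0) null against a VAR(p1) alternative (p1 > p0)."""
    T, n = y.shape

    def resid_cov(p):
        # Regressors: intercept and lags 1..p, on the common sample t = p1..T-1
        X = np.column_stack([np.ones(T - p1)] +
                            [y[p1 - i:T - i] for i in range(1, p + 1)])
        Y = y[p1:]
        E = Y - X @ np.linalg.solve(X.T @ X, X.T @ Y)
        return E.T @ E / (T - p1)

    S0, S1 = resid_cov(p0), resid_cov(p1)
    # 2(L1 - L0) = T (log|Omega_hat_0| - log|Omega_hat_1|)
    stat = (T - p1) * (np.log(np.linalg.det(S0)) - np.log(np.linalg.det(S1)))
    df = n**2 * (p1 - p0)
    return stat, chi2.sf(stat, df)   # statistic and asymptotic p-value
```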
Vector Auto-Regressions: Criteria
- In a VAR, adding lags quickly consumes degrees of freedom: with lag length $p$, each of the $n$ equations contains $n \times p$ coefficients plus the intercept term.
- Adding lags improves the in-sample fit, but is likely to result in over-parameterization and hurt the out-of-sample prediction performance.
- To select an appropriate lag length, the following criteria can be used (they have to be minimized):
  $$AIC = \log |\hat{\Omega}| + \frac{2N}{T}, \qquad SBIC = \log |\hat{\Omega}| + \frac{N \log T}{T},$$
  where $N = n^2 p + n$ is the total number of estimated coefficients.
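A sketch of lag selection with these criteria, again fitting every candidate on a common sample so the values are comparable (an assumption of the example, not stated in the slides):

```python
import numpy as np

def lag_criteria(y, p_max):
    """AIC and SBIC for VAR(1)..VAR(p_max) on the common sample starting at p_max."""
    T, n = y.shape
    Teff = T - p_max
    out = {}
    for p in range(1, p_max + 1):
        X = np.column_stack([np.ones(Teff)] +
                            [y[p_max - i:T - i] for i in range(1, p + 1)])
        Y = y[p_max:]
        E = Y - X @ np.linalg.solve(X.T @ X, X.T @ Y)
        logdet = np.log(np.linalg.det(E.T @ E / Teff))
        N = n * n * p + n                             # total estimated coefficients
        out[p] = (logdet + 2 * N / Teff,              # AIC
                  logdet + N * np.log(Teff) / Teff)   # SBIC
    return out   # choose the lag that minimizes the preferred criterion
```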
Vector Auto-Regressions: Granger Causality
- Granger (1969) developed a method to systematically analyze the causal relationships among variables.
- The approach consists in determining whether the past values of $y_{1,t}$ can help to explain the current $y_{2,t}$.
- Let us denote three information sets
  $$I_{1,t} = \{y_{1,t}, y_{1,t-1}, \dots\}, \quad I_{2,t} = \{y_{2,t}, y_{2,t-1}, \dots\}, \quad I_t = \{y_{1,t}, y_{1,t-1}, \dots, y_{2,t}, y_{2,t-1}, \dots\}.$$
- We say that $y_{1,t}$ Granger-causes $y_{2,t}$ if
  $$E[y_{2,t} \mid I_{2,t-1}] \neq E[y_{2,t} \mid I_{t-1}].$$
Vector Auto-Regressions: Granger Causality
- To get the intuition behind the testing procedure, consider the following bivariate VAR($p$) process:
  $$y_{1,t} = \Phi_{10} + \sum_{i=1}^{p} \Phi_{11}(i) y_{1,t-i} + \sum_{i=1}^{p} \Phi_{12}(i) y_{2,t-i} + u_{1,t}$$
  $$y_{2,t} = \Phi_{20} + \sum_{i=1}^{p} \Phi_{21}(i) y_{1,t-i} + \sum_{i=1}^{p} \Phi_{22}(i) y_{2,t-i} + u_{2,t}.$$
  Then $y_{1,t}$ does not Granger-cause $y_{2,t}$ if $\Phi_{21}(1) = \Phi_{21}(2) = \dots = \Phi_{21}(p) = 0$.
- Therefore the hypothesis test is
  $$H_0: \Phi_{21}(1) = \Phi_{21}(2) = \dots = \Phi_{21}(p) = 0$$
  $$H_A: \Phi_{21}(1) \neq 0 \text{ or } \Phi_{21}(2) \neq 0 \text{ or } \dots \text{ or } \Phi_{21}(p) \neq 0.$$
Vector Auto-Regressions: Granger Causality
- Rejection of $H_0$ implies that some of the coefficients on the lagged $y_{1,t}$'s are statistically significant.
- This can be tested using the $F$-test or an asymptotic chi-square test.
- The $F$-statistic is
  $$F = \frac{(RSS - USS)/p}{USS/(T - 2p - 1)},$$
  where RSS is the restricted residual sum of squares and USS the unrestricted residual sum of squares.
- Under $H_0$, the $F$-statistic is distributed as $F(p, T - 2p - 1)$.
- In addition, $pF \to \chi^2(p)$.
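A minimal sketch of this F-test for the bivariate case, comparing the restricted regression (own lags of $y_2$ only) with the unrestricted one (own lags plus lags of $y_1$):

```python
import numpy as np
from scipy.stats import f as f_dist

def granger_f_test(y1, y2, p):
    """F-test of H0: y1 does not Granger-cause y2 (bivariate setting above)."""
    T = len(y2)
    Y = y2[p:]
    own = np.column_stack([y2[p - i:T - i] for i in range(1, p + 1)])
    cross = np.column_stack([y1[p - i:T - i] for i in range(1, p + 1)])

    def rss(regressors):
        X = np.column_stack([np.ones(len(Y)), regressors])
        e = Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]
        return e @ e

    RSS = rss(own)                             # restricted: own lags only
    USS = rss(np.column_stack([own, cross]))   # unrestricted: add the lags of y1
    F = ((RSS - USS) / p) / (USS / (T - 2 * p - 1))
    return F, f_dist.sf(F, p, T - 2 * p - 1)   # statistic and p-value
```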
Vector Auto-Regressions: Granger Causality
See the RATS example on the US-EA GDP growth relationship.
Vector Auto-Regressions: Impulse responses
- Objective: analyze the effect of a given shock on the endogenous variables.
- Consider the stationary VAR($p$) system: $y_t = c + \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + \varepsilon_t$.
- Assume the system receives a shock at $t$: $\varepsilon_t = \delta$.
- Definition: a standard $IRF(h, \delta)$ describes the effect of the shock at date $t + h$ compared to a zero-shock baseline $\varepsilon_t = 0$, assuming that $\varepsilon_{t+s} = 0$ for all $s > 0$.
- The Generalized IRF of Koop, Pesaran and Potter (1996):
  $$GIRF(h, \delta, F_{t-1}) = E\{y_{t+h} \mid \varepsilon_t = \delta;\ \varepsilon_{t+s} = 0, s > 0;\ F_{t-1}\} - E\{y_{t+h} \mid \varepsilon_{t+s} = 0, s \geq 0;\ F_{t-1}\}.$$
Vector Auto-Regressions: Impulse responses
Example of a centered univariate AR(1): $x_t = \phi x_{t-1} + \varepsilon_t$. Assume $x_{t-1} = 0$, thus $x_t = \varepsilon_t = \delta$. Then
$$IRF(1, \delta) = E(x_{t+1} \mid \varepsilon_t = \delta, \varepsilon_{t+1} = 0, F_{t-1}) - E(x_{t+1} \mid \varepsilon_t = \varepsilon_{t+1} = 0, F_{t-1}) = \phi \delta,$$
$$IRF(2, \delta) = \phi^2 \delta, \quad \dots, \quad IRF(h, \delta) = \phi^h \delta.$$
Remark: the IRF is proportional to the size of the shock and independent of past history.
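A one-line numerical check of the AR(1) formula, with illustrative values of $\phi$ and $\delta$ (not from the slides):

```python
import numpy as np

phi, delta = 0.8, 1.0     # illustrative shock size and persistence
h = np.arange(11)
irf = phi**h * delta      # IRF(h, delta) = phi^h * delta, decaying geometrically
```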
Vector Auto-Regressions: Impulse responses
- Let us consider a stationary vector random variable $y_t$ with the following Wold decomposition:
  $$y_t = \varepsilon_t + \sum_{j=1}^{\infty} \Psi_j \varepsilon_{t-j}.$$
- The $h$-th impulse response of the shock $\varepsilon_t$ on $y_t, y_{t+1}, \dots$ is given by $\Psi_h \delta$ and vanishes as $h \to \infty$.
- Formally, the impulse response of the shock $\varepsilon_t$ on the variable $y$ is defined as
  $$\frac{\partial y_{t+h}}{\partial \varepsilon_t} = \Psi_h.$$
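For a VAR($p$), the $\Psi_j$ can be read off the powers of the companion matrix (introduced formally on the unconditional-variance slides below). A sketch, where Phis is assumed to be the list $[\Phi_1, \dots, \Phi_p]$:

```python
import numpy as np

def var_ma_coeffs(Phis, H):
    """Psi_0..Psi_H of a stationary VAR(p) from powers of its companion matrix."""
    n, p = Phis[0].shape[0], len(Phis)
    F = np.zeros((n * p, n * p))
    F[:n, :] = np.hstack(Phis)           # first block row: Phi_1 ... Phi_p
    F[n:, :-n] = np.eye(n * (p - 1))     # identity blocks shift the state down
    return [np.linalg.matrix_power(F, h)[:n, :n] for h in range(H + 1)]

# The response of y_{t+h} to a shock eps_t = delta is then Psi[h] @ delta.
```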
Vector Auto-Regressions: Impulse responses
[Figure: dynamics of $y_t, y_{t+1}, y_{t+2}, \dots$ when $\varepsilon_t = 1$, $\varepsilon_{t+1} = 0$, $\varepsilon_{t+2} = 0, \dots$]
Vector Auto-Regressions: Unconditional variance
- The unconditional variance-covariance matrix of $y_t$ is
  $$Var(y) = \lim_{t \to \infty} E_0 \left( (y_t - \bar{y}_t)(y_t - \bar{y}_t)' \right),$$
  where $\bar{y}_t$ denotes the unconditional mean of $y$.
- Let us denote with $y_t^*$ the stacked vector $(y_t', y_{t-1}', \dots, y_{t-p+1}')'$. We then have the companion form
  $$y_t^* = \begin{pmatrix} c \\ 0 \\ \vdots \\ 0 \end{pmatrix}
  + \begin{pmatrix} \Phi_1 & \Phi_2 & \cdots & \Phi_{p-1} & \Phi_p \\ I_n & 0 & \cdots & 0 & 0 \\ & \ddots & & & \vdots \\ 0 & \cdots & & I_n & 0 \end{pmatrix} y_{t-1}^*
  + \begin{pmatrix} \varepsilon_t \\ 0 \\ \vdots \\ 0 \end{pmatrix}
  = c^* + \Phi y_{t-1}^* + \varepsilon_t^*.$$
Vector Auto-Regressions: Unconditional variance
- It is then easy to get the Wold decomposition of $y_t^*$:
  $$y_t^* = c^* + \Phi \left( c^* + \Phi y_{t-2}^* + \varepsilon_{t-1}^* \right) + \varepsilon_t^*
  = c^* + \varepsilon_t^* + \Phi(c^* + \varepsilon_{t-1}^*) + \dots + \Phi^k (c^* + \varepsilon_{t-k}^*) + \dots$$
- The $\varepsilon_t^*$'s being i.i.d. with variance $\Omega^*$ (whose upper-left $n \times n$ block is $\Omega$), we have
  $$Var(y^*) = \Omega^* + \Phi \Omega^* \Phi' + \dots + \Phi^k \Omega^* \Phi'^k + \dots,$$
  and $Var(y)$ is the upper-left $(n \times n)$ block of this matrix.
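Numerically, this infinite sum is the solution of a discrete Lyapunov equation, $V = \Phi V \Phi' + \Omega^*$. A sketch using SciPy, with Phis again assumed to be the list $[\Phi_1, \dots, \Phi_p]$:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def var_unconditional_cov(Phis, Omega):
    """Var(y) of a stationary VAR(p) via the companion form."""
    n, p = Phis[0].shape[0], len(Phis)
    F = np.zeros((n * p, n * p))
    F[:n, :] = np.hstack(Phis)
    F[n:, :-n] = np.eye(n * (p - 1))
    Omega_star = np.zeros((n * p, n * p))
    Omega_star[:n, :n] = Omega           # eps*_t has Omega in its upper-left block
    V = solve_discrete_lyapunov(F, Omega_star)   # solves V = F V F' + Omega*
    return V[:n, :n]                     # upper-left block is Var(y_t)
```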
Vector Auto-Regressions: Extensions
- Let $y_t$ denote an $(n \times 1)$ vector of random variables. $y_t$ is a Gaussian VAR($p$) with exogenous variables $x_t = (x_t^1, \dots, x_t^m)'$ of dimension $m$ if, for all $t$,
  $$y_t = \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + C x_t + \varepsilon_t,$$
  where $\varepsilon_t \sim N(0, \Omega)$ and
  $$C = \begin{pmatrix} c_{11} & \cdots & c_{1m} \\ \vdots & c_{ij} & \vdots \\ c_{n1} & \cdots & c_{nm} \end{pmatrix}.$$
Vector Auto-Regressions: Example, n = 2
VAR(1) for $y_t = (y_{1,t}, y_{2,t})'$ with exogenous variables:
$$y_{1,t} = \phi_{11} y_{1,t-1} + \phi_{12} y_{2,t-1} + c_{11} x_{t-1}^1 + c_{12} x_{t-1}^2 + \varepsilon_{1,t}$$
$$y_{2,t} = \phi_{21} y_{1,t-1} + \phi_{22} y_{2,t-1} + c_{21} x_{t-1}^1 + c_{22} x_{t-1}^2 + \varepsilon_{2,t}.$$
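Estimation is the same least-squares problem with the exogenous regressors appended to $x_t$. A sketch matching this example (exogenous variables entering with one lag and no intercept, as on the slide):

```python
import numpy as np

def varx_ols(y, x, p):
    """OLS estimates of a VAR(p) with exogenous regressors x (T x m)."""
    T, n = y.shape
    # Regressors: lags 1..p of y, then x_{t-1}
    Z = np.column_stack([y[p - i:T - i] for i in range(1, p + 1)] + [x[p - 1:T - 1]])
    B = np.linalg.solve(Z.T @ Z, Z.T @ y[p:])
    return B   # stacked rows: [Phi_1' ... Phi_p' C']; column j = equation j
```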
Vector Auto-Regressions: Extensions
- Bayesian VAR
- Non-linear VAR (Smooth-Transition VAR, Markov-Switching VAR)
- Factor-Augmented VAR