Chapter 13 Integrated Models for Nonstationary Data
We have assumed that \(x_t\) is stationary so far. What if \(x_t\) is not stationary in the mean? Use differencing to make a transformation stationary in the mean.
Autoregressive Integrated Moving Average model
- ARIMA(p,d,q) – p is AR order, d is differencing order, and q is MA order
- \(\phi(B)(1-B)^dx_t = \theta(B)w_t\) where \(w_t \sim ind. N(0,\sigma_w^2)\)
- Let \((1-B)^dx_t = y_t.\) Then \(\phi(B)y_t = \theta(B)w_t\) is an ARMA(p,q) model.
- The “integrated” name results from transforming back from the \(y_t\) to \(x_t\) by “integrating” (put together) or “summing” the \(y_t\)’s. With first differences, we have \[\sum_{k=1}^{t}y_k=\sum_{k=1}^{t}x_k-x_{k-1}=x_t-x_0=x_t\] if \(x_0=0\)
Because higher order differencing is the result of continuing to apply first differencing, the same process would be done in those cases.
Example 13.1 ARIMA(1,1,1) with \(\phi_1=0.7, \theta_1=0.4, \sigma_w^2=9, n=200\) (arima111_sim.R)
\(\phi(B)(1-B^d)x_t=\theta(B)w_t,\) where \(w_t \sim ind.N(0.9)\)
This can be rewritten as
\[(1-\phi_1B)(1-B)x_t=(1+\theta_1B)w_t\\ \iff (1-\phi_1B)(x_t-x_{t-1})=(1+\theta_1B)w_t\\ \iff x_t-x_{t-1}-\phi_1x_{t-1}+\phi_1x_{t-2}=w_t+\theta_1w_{t-1}\\ \iff x_t=(1+\phi_1)x_{t-1}-\phi_1x_{t-2}+w_t+\theta_1w_{t-1}\]
Using the above representation with only \(x_t\) on the left side, one could use the for loop
to simulate observations from this model. Instead, one can use arima.sim()
to do it as follows,
#Data could be simulated using the following code - notice the use of the order option.
set.seed(6632)
<- arima.sim(model = list(order=c(1,1,1), ar=0.7, ma=0.4), n=200, rand.gen=rnorm, sd=3) x
Notice the addition of the order
option to specify p, d, and q.
I had already simulated observations from the model in the past and put them in the comma delimited file. I am going to use this data for the rest of the example.
<- read.csv(file = "arima111.csv")
arima111 head(arima111)
## time x
## 1 1 -143.2118
## 2 2 -142.8908
## 3 3 -138.0634
## 4 4 -133.5038
## 5 5 -132.7496
## 6 6 -132.2910
tail(arima111)
## time x
## 195 195 -469.1263
## 196 196 -476.6298
## 197 197 -483.2368
## 198 198 -483.9744
## 199 199 -488.2191
## 200 200 -488.4823
<- arima111$x x
#Plot of the data
# dev.new(width = 8, height = 6, pointsize = 10)
par(mfrow = c(1,1))
plot(x = x, ylab = expression(x[t]), xlab = "t", type =
"l", col = "red", main = expression(paste("ARIMA
model: ", (1 - 0.7*B)*(1-B)*x[t] == (1 + 0.4*B)*w[t])),
panel.first = grid(col = "gray", lty = "dotted"))
points(x = x, pch = 20, col = "blue")
#ACF and PACF of x_t
#dev.new(width = 8, height = 6, pointsize = 10)
par(mfcol = c(2,3))
acf(x = x, type = "correlation", lag.max = 20, ylim =
c(-1,1), main = expression(paste("Estimated ACF plot
for ", x[t])))
pacf(x = x, lag.max = 20, ylim = c(-1,1), xlab = "h",
main = expression(paste("Estimated PACF plot for ",
x[t])))
#ACF and PACF of first differences
acf(x = diff(x = x, lag = 1, differences = 1), type =
"correlation", lag.max = 20, ylim = c(-1,1), main =
expression(paste("Estimated ACF plot for ", x[t],"-",x[t-1])))
pacf(x = diff(x = x, lag = 1, differences = 1), lag.max
= 20, ylim = c(-1,1), xlab = "h", main =
expression(paste("Estimated PACF plot for ", x[t],"-",x[t-1])))
#True ACF and PACF for ARIMA(1,0,1) (without differences)
plot(y = ARMAacf(ar = 0.7, ma = 0.4, lag.max = 20), x =
0:20, type = "h", ylim = c(-1,1), xlab = "h", ylab =
expression(rho(h)), main = "True ACF for
ARIMA(1,0,1)")
abline(h = 0)
plot(x = ARMAacf(ar = 0.7, ma = 0.4, lag.max = 20, pacf
= TRUE), type = "h", ylim = c(-1,1), xlab = "h", ylab
= expression(phi1[hh]), main = "True ACF for
ARIMA(1,0,1)")
abline(h = 0)
#Plot of the first differences
#dev.new(width = 8, height = 6, pointsize = 10)
par(mfrow = c(1,1))
plot(x = diff(x = x, lag = 1, differences = 1), ylab =
expression(x[t] - x[t-1]), xlab = "t", type = "l", col
= "red", main = "Plot of data after first
differences", panel.first = grid(col = "gray", lty =
"dotted"))
points(x = diff(x = x, lag = 1, differences = 1), pch =
20, col = "blue")
Notes:
- The \(x_t vs. t\) plot shows characteristics of a nonstationary in the mean time series.
- The ACF plot shows very large autocorrelations. Remember that this is a characteristic of a nonstationary in the mean time series.
- After first differences, the ACF and PACF look like the ACF and PACF from an ARMA(1, 1) with \(\phi_1 = 0.7\) and \(\theta_1 = 0.4\). The plot of the first differences themselves now look like a sample from a stationary process.
- While we have not talked about how to estimate model parameters, we can still take a quick look at what if the parameters are estimated. The
arima()
function in R can do it.
# Fit model
<- arima(x=x, order=c(1,1,1))
mod.fit mod.fit
##
## Call:
## arima(x = x, order = c(1, 1, 1))
##
## Coefficients:
## ar1 ma1
## 0.6720 0.4681
## s.e. 0.0637 0.0904
##
## sigma^2 estimated as 9.558: log likelihood = -507.68, aic = 1021.36
These estimates are relatively close to the values used in arima.sim()
!
#Covariance matrix
$var.coef mod.fit
## ar1 ma1
## ar1 0.004060990 -0.003341907
## ma1 -0.003341907 0.008175261
#Test statistic for Ho: phi1 = 0 vs. Ha: phi1 ≠ 0
<- mod.fit$coef[1]/sqrt(mod.fit$var.coef[1,1])
z z
## ar1
## 10.54508
2*(1-pnorm(q = z, mean = 0, sd = 1))
## ar1
## 0
#Test statistic for Ho: theta1 = 0 vs. Ha: theta1 <> 0
<- mod.fit$coef[2]/sqrt(mod.fit$var.coef[2,2])
z z
## ma1
## 5.177294
2*(1-pnorm(q = abs(z), mean = 0, sd = 1))
## ma1
## 2.251269e-07
# confidence interval
confint(object = mod.fit, level = 0.95)
## 2.5 % 97.5 %
## ar1 0.5470940 0.7968949
## ma1 0.2909019 0.6453306
#Shows how to use the xreg option
# dev.new(width = 8, height = 6, pointsize = 10)
arima(x = x, order = c(1, 1, 1), xreg = rep(x = 1, times = length(x)))
##
## Call:
## arima(x = x, order = c(1, 1, 1), xreg = rep(x = 1, times = length(x)))
##
## Coefficients:
## Warning in sqrt(diag(x$var.coef)): NaNs produced
## ar1 ma1 rep(x = 1, times = length(x))
## 0.6720 0.4681 -243.5336
## s.e. 0.0637 0.0904 NaN
##
## sigma^2 estimated as 9.558: log likelihood = -507.68, aic = 1023.36