いっかくのデータサイエンティストをいく

1からプログラミングとデータサイエンスを独習したい

【時系列】VAR(R:モデル作成)

前回の記事でVARの理論を紹介しましたが、今回はVARをRで実装してみたいと思います。 今回はこちらのサイトとほぼ同じことをやります。

tjo.hatenablog.com

logics-of-blue.com

パッケージと分析手順

RでVARモデルを実装する場合、{vars}というパッケージを利用します。

library(vars)

{vars}パッケージでVARモデルを実装する流れは次の通り。 1.次数をVARselect()関数で推定 2.求めた次数をVAR()関数に代入してVAR(p)モデル係数を推計

データ

今回はこのパッケージに入っている「Canada」というデータを使用します。

data(Canada)
Canada
#中身
               e     prod       rw     U
1980 Q1 929.6105 405.3665 386.1361  7.53
1980 Q2 929.8040 404.6398 388.1358  7.70
1980 Q3 930.3184 403.8149 390.5401  7.47
1980 Q4 931.4277 404.2158 393.9638  7.27
1981 Q1 932.6620 405.0467 396.7647  7.37
1981 Q2 933.5509 404.4167 400.0217  7.13
1981 Q3 933.5315 402.8191 400.7515  7.40
1981 Q4 933.0769 401.9773 405.7335  8.33
1982 Q1 932.1238 402.0897 409.0504  8.83
1982 Q2 930.6359 401.3067 411.3984 10.43
1982 Q3 929.0971 401.6302 413.0194 12.20
1982 Q4 928.5633 401.5638 415.1670 12.77
1983 Q1 929.0694 402.8157 414.6621 12.43
1983 Q2 930.2655 403.1421 415.7319 12.23
1983 Q3 931.6770 403.0786 416.2315 11.70
1983 Q4 932.1390 403.7188 418.1439 11.20
1984 Q1 932.2767 404.8668 419.7352 11.27
1984 Q2 932.8328 405.6362 420.4842 11.47
1984 Q3 933.7334 405.1363 420.9309 11.30
1984 Q4 934.1772 406.0246 422.1124 11.17
1985 Q1 934.5928 406.4123 423.6278 11.00
1985 Q2 935.6067 406.3009 423.9887 10.63
1985 Q3 936.5111 406.3354 424.1902 10.27
1985 Q4 937.4201 406.7737 426.1270 10.20
1986 Q1 938.4159 405.1525 426.8578  9.67
1986 Q2 938.9992 404.9298 426.7457  9.60
1986 Q3 939.2354 404.5765 426.8858  9.60
1986 Q4 939.6795 404.1995 428.8403  9.50
1987 Q1 940.2497 405.9499 430.1223  9.50
1987 Q2 941.4358 405.8221 430.2307  9.03
1987 Q3 942.2981 406.4463 430.3930  8.70
1987 Q4 943.5322 407.0512 432.0284  8.13
1988 Q1 944.3490 407.9460 433.3886  7.87
1988 Q2 944.8215 408.1796 433.9641  7.67
1988 Q3 945.0671 408.5998 434.4844  7.80
1988 Q4 945.8067 409.0906 436.1569  7.73
1989 Q1 946.8697 408.7042 438.2651  7.57
1989 Q2 946.8766 408.9803 438.7636  7.57
1989 Q3 947.2497 408.3287 439.9498  7.33
1989 Q4 947.6513 407.8857 441.8359  7.57
1990 Q1 948.1840 407.2605 443.1769  7.63
1990 Q2 948.3492 406.7752 444.3592  7.60
1990 Q3 948.0322 406.1794 444.5236  8.17
1990 Q4 947.1065 405.4398 446.9694  9.20
1991 Q1 946.0796 403.2800 450.1586 10.17
1991 Q2 946.1838 403.3649 451.5464 10.33
1991 Q3 946.2258 403.3807 452.2984 10.40
1991 Q4 945.9978 404.0032 453.1201 10.37
1992 Q1 945.5183 404.4774 453.9991 10.60
1992 Q2 945.3514 404.7868 454.9552 11.00
1992 Q3 945.2918 405.2710 455.4824 11.40
1992 Q4 945.4008 405.3830 456.1009 11.73
1993 Q1 945.9058 405.1564 457.2027 11.07
1993 Q2 945.9035 406.4700 457.3886 11.67
1993 Q3 946.3190 406.2293 457.7799 11.47
1993 Q4 946.5796 406.7265 457.5535 11.30
1994 Q1 946.7800 408.5785 458.8024 10.97
1994 Q2 947.6283 409.6767 459.0564 10.63
1994 Q3 948.6221 410.3858 459.1578 10.10
1994 Q4 949.3992 410.5395 459.7037  9.67
1995 Q1 949.9481 410.4453 459.7037  9.53
1995 Q2 949.7945 410.6256 460.0258  9.47
1995 Q3 949.9534 410.8672 461.0257  9.50
1995 Q4 950.2502 411.2359 461.3039  9.27
1996 Q1 950.5380 410.6637 461.4031  9.50
1996 Q2 950.7871 410.8085 462.9277  9.43
1996 Q3 950.8695 412.1160 464.6888  9.70
1996 Q4 950.9281 412.9994 465.0717  9.90
1997 Q1 951.8457 412.9551 464.2851  9.43
1997 Q2 952.6005 412.8241 464.0344  9.30
1997 Q3 953.5976 413.0489 463.4535  8.87
1997 Q4 954.1434 413.6110 465.0717  8.77
1998 Q1 954.5426 413.6048 466.0889  8.60
1998 Q2 955.2631 412.9684 466.6171  8.33
1998 Q3 956.0561 412.2659 465.7478  8.17
1998 Q4 956.7966 412.9106 465.8995  8.03
1999 Q1 957.3865 413.8294 466.4099  7.90
1999 Q2 958.0634 414.2242 466.9552  7.87
1999 Q3 958.7166 415.1678 467.6281  7.53
1999 Q4 959.4881 415.7016 467.7026  6.93
2000 Q1 960.3625 416.8674 469.1348  6.80
2000 Q2 960.7834 417.6104 469.3364  6.70
2000 Q3 961.0290 418.0030 470.0117  6.93
2000 Q4 961.7657 417.2667 469.6472  6.87

後から予測精度を確かめるためにテスト用データを作成します。 Canadaの1999以降をテスト用に残しておきます。 テストデータはwindow()関数で切れます。

Canada.1998<-window(Canada,end=c(1998,4))

次数を求める

次数をVARselect()関数で推定します。

VARselect(Canada.1998,lag.max=4) # 四半期ごとのデータなので最大次数を1年 = 4Q = 4とする
#$`selection`
#AIC(n)  HQ(n)  SC(n) FPE(n) 
#     3      2      1      3 

#$criteria
                  1            2           3           4
#AIC(n) -5.851596560 -6.202550793 -6.35903190 -6.10327091
#HQ(n)  -5.599833359 -5.749377031 -5.70444758 -5.24727603
#SC(n)  -5.219189305 -5.064217733 -4.71477304 -3.95308624
#FPE(n)  0.002877881  0.002034932  0.00175898  0.00231824

AIC基準では3が最もよさそうという結果になったので、次数を3としてVARモデルを推定します。

VARモデルの推計

上の次数の結果に基づいてVARモデルを推計します。

Canada.var<-VAR(Canada.1998,p=VARselect(Canada,lag.max=4)$selection[1])#次数を求める関数をそのまま代入している
Canada.var
#そのまま変数名を入力すると係数ごとの傾きが表示される
VAR Estimation Results:
======================= 

Estimated coefficients for equation e: 
====================================== 
Call:
e = e.l1 + prod.l1 + rw.l1 + U.l1 + e.l2 + prod.l2 + rw.l2 + U.l2 + e.l3 + prod.l3 + rw.l3 + U.l3 + const 

         e.l1       prod.l1         rw.l1          U.l1          e.l2       prod.l2 
   1.76191104    0.24270698   -0.12209281    0.41181542   -1.00264304   -0.07465599 
        rw.l2          U.l2          e.l3       prod.l3         rw.l3          U.l3 
  -0.05342660    0.08892120    0.74355811   -0.04541993   -0.01763639    0.55871276 
        const 
-448.95683723 


Estimated coefficients for equation prod: 
========================================= 
Call:
prod = e.l1 + prod.l1 + rw.l1 + U.l1 + e.l2 + prod.l2 + rw.l2 + U.l2 + e.l3 + prod.l3 + rw.l3 + U.l3 + const 

       e.l1     prod.l1       rw.l1        U.l1        e.l2     prod.l2       rw.l2 
-0.18076570  1.09401701  0.06001949 -0.83699457 -0.31157932 -0.18920421 -0.19576876 
       U.l2        e.l3     prod.l3       rw.l3        U.l3       const 
 0.66532727  0.51479502  0.07336486  0.12727934  0.32308748 -9.57355935 


Estimated coefficients for equation rw: 
======================================= 
Call:
rw = e.l1 + prod.l1 + rw.l1 + U.l1 + e.l2 + prod.l2 + rw.l2 + U.l2 + e.l3 + prod.l3 + rw.l3 + U.l3 + const 

        e.l1      prod.l1        rw.l1         U.l1         e.l2      prod.l2 
 -0.39196529  -0.03030170   0.91229877   0.19515279   0.64738146  -0.27182287 
       rw.l2         U.l2         e.l3      prod.l3        rw.l3         U.l3 
 -0.19360360  -0.37062525  -0.15425373   0.24794909   0.22171192   0.09690049 
       const 
-44.96936619 


Estimated coefficients for equation U: 
====================================== 
Call:
U = e.l1 + prod.l1 + rw.l1 + U.l1 + e.l2 + prod.l2 + rw.l2 + U.l2 + e.l3 + prod.l3 + rw.l3 + U.l3 + const 

         e.l1       prod.l1         rw.l1          U.l1          e.l2       prod.l2 
 -0.618475964  -0.157085432   0.053537641   0.412764397   0.349336595   0.068184527 
        rw.l2          U.l2          e.l3       prod.l3         rw.l3          U.l3 
  0.088963473  -0.204287186  -0.119452697   0.032366478  -0.004095136  -0.058335117 
        const 
336.947159989 

summary()で推計結果が表示されます

summary(Canada.var)

VAR Estimation Results:
========================= 
Endogenous variables: e, prod, rw, U 
Deterministic variables: const 
Sample size: 73 
Log Likelihood: -129.623 
Roots of the characteristic polynomial:
0.9862 0.9315 0.9315 0.7378 0.7378 0.6746 0.6746 0.5885 0.5885 0.2398 0.2081 0.2081
Call:
VAR(y = Canada.1998, p = VARselect(Canada, lag.max = 4)$selection[1])


Estimation results for equation e: 
================================== 
e = e.l1 + prod.l1 + rw.l1 + U.l1 + e.l2 + prod.l2 + rw.l2 + U.l2 + e.l3 + prod.l3 + rw.l3 + U.l3 + const 

          Estimate Std. Error t value Pr(>|t|)    
e.l1       1.76191    0.14337  12.289  < 2e-16 ***
prod.l1    0.24271    0.06110   3.972 0.000193 ***
rw.l1     -0.12209    0.05109  -2.390 0.020018 *  
U.l1       0.41182    0.20226   2.036 0.046166 *  
e.l2      -1.00264    0.23140  -4.333 5.69e-05 ***
prod.l2   -0.07466    0.09159  -0.815 0.418226    
rw.l2     -0.05343    0.06586  -0.811 0.420424    
U.l2       0.08892    0.23976   0.371 0.712039    
e.l3       0.74356    0.16165   4.600 2.24e-05 ***
prod.l3   -0.04542    0.06269  -0.725 0.471551    
rw.l3     -0.01764    0.05292  -0.333 0.740107    
U.l3       0.55871    0.20354   2.745 0.007974 ** 
const   -448.95684   94.69463  -4.741 1.35e-05 ***
---
Signif. codes:  0***0.001**0.01*0.05 ‘.’ 0.1 ‘ ’ 1


Residual standard error: 0.3143 on 60 degrees of freedom
Multiple R-Squared: 0.9986,  Adjusted R-squared: 0.9984 
F-statistic:  3647 on 12 and 60 DF,  p-value: < 2.2e-16 


Estimation results for equation prod: 
===================================== 
prod = e.l1 + prod.l1 + rw.l1 + U.l1 + e.l2 + prod.l2 + rw.l2 + U.l2 + e.l3 + prod.l3 + rw.l3 + U.l3 + const 

         Estimate Std. Error t value Pr(>|t|)    
e.l1     -0.18077    0.30130  -0.600   0.5508    
prod.l1   1.09402    0.12841   8.520 6.39e-12 ***
rw.l1     0.06002    0.10737   0.559   0.5782    
U.l1     -0.83699    0.42506  -1.969   0.0536 .  
e.l2     -0.31158    0.48629  -0.641   0.5241    
prod.l2  -0.18920    0.19248  -0.983   0.3296    
rw.l2    -0.19577    0.13840  -1.415   0.1624    
U.l2      0.66533    0.50387   1.320   0.1917    
e.l3      0.51480    0.33972   1.515   0.1349    
prod.l3   0.07336    0.13174   0.557   0.5797    
rw.l3     0.12728    0.11122   1.144   0.2570    
U.l3      0.32309    0.42776   0.755   0.4530    
const    -9.57356  199.00380  -0.048   0.9618    
---
Signif. codes:  0***0.001**0.01*0.05 ‘.’ 0.1 ‘ ’ 1


Residual standard error: 0.6606 on 60 degrees of freedom
Multiple R-Squared: 0.9687,  Adjusted R-squared: 0.9624 
F-statistic: 154.6 on 12 and 60 DF,  p-value: < 2.2e-16 


Estimation results for equation rw: 
=================================== 
rw = e.l1 + prod.l1 + rw.l1 + U.l1 + e.l2 + prod.l2 + rw.l2 + U.l2 + e.l3 + prod.l3 + rw.l3 + U.l3 + const 

        Estimate Std. Error t value Pr(>|t|)    
e.l1     -0.3920     0.3543  -1.106   0.2730    
prod.l1  -0.0303     0.1510  -0.201   0.8416    
rw.l1     0.9123     0.1263   7.226 1.03e-09 ***
U.l1      0.1951     0.4998   0.390   0.6976    
e.l2      0.6474     0.5718   1.132   0.2621    
prod.l2  -0.2718     0.2263  -1.201   0.2345    
rw.l2    -0.1936     0.1628  -1.190   0.2389    
U.l2     -0.3706     0.5925  -0.626   0.5340    
e.l3     -0.1542     0.3995  -0.386   0.7008    
prod.l3   0.2480     0.1549   1.601   0.1147    
rw.l3     0.2217     0.1308   1.695   0.0952 .  
U.l3      0.0969     0.5030   0.193   0.8479    
const   -44.9694   234.0138  -0.192   0.8483    
---
Signif. codes:  0***0.001**0.01*0.05 ‘.’ 0.1 ‘ ’ 1


Residual standard error: 0.7768 on 60 degrees of freedom
Multiple R-Squared: 0.9988,  Adjusted R-squared: 0.9986 
F-statistic:  4208 on 12 and 60 DF,  p-value: < 2.2e-16 


Estimation results for equation U: 
================================== 
U = e.l1 + prod.l1 + rw.l1 + U.l1 + e.l2 + prod.l2 + rw.l2 + U.l2 + e.l3 + prod.l3 + rw.l3 + U.l3 + const 

          Estimate Std. Error t value Pr(>|t|)    
e.l1     -0.618476   0.120389  -5.137 3.19e-06 ***
prod.l1  -0.157085   0.051307  -3.062  0.00329 ** 
rw.l1     0.053538   0.042900   1.248  0.21689    
U.l1      0.412764   0.169836   2.430  0.01809 *  
e.l2      0.349337   0.194304   1.798  0.07723 .  
prod.l2   0.068185   0.076906   0.887  0.37884    
rw.l2     0.088963   0.055299   1.609  0.11292    
U.l2     -0.204287   0.201326  -1.015  0.31432    
e.l3     -0.119453   0.135739  -0.880  0.38236    
prod.l3   0.032366   0.052638   0.615  0.54096    
rw.l3    -0.004095   0.044438  -0.092  0.92688    
U.l3     -0.058335   0.170914  -0.341  0.73406    
const   336.947160  79.514000   4.238 7.90e-05 ***
---
Signif. codes:  0***0.001**0.01*0.05 ‘.’ 0.1 ‘ ’ 1


Residual standard error: 0.2639 on 60 degrees of freedom
Multiple R-Squared: 0.9739,  Adjusted R-squared: 0.9687 
F-statistic: 186.8 on 12 and 60 DF,  p-value: < 2.2e-16 



Covariance matrix of residuals:
            e       prod       rw          U
e     0.09881 -1.271e-02 -0.04492 -5.703e-02
prod -0.01271  4.364e-01  0.04093 -1.188e-05
rw   -0.04492  4.093e-02  0.60344  4.412e-02
U    -0.05703 -1.188e-05  0.04412  6.967e-02

Correlation matrix of residuals:
            e       prod       rw          U
e     1.00000 -6.122e-02 -0.18395 -6.874e-01
prod -0.06122  1.000e+00  0.07977 -6.812e-05
rw   -0.18395  7.977e-02  1.00000  2.152e-01
U    -0.68740 -6.812e-05  0.21516  1.000e+00

モデルの可視化もできます

plot(Canada.var)

f:id:imakoto0323:20181018101052p:plain

モデルの予測は次回の記事で