I have created a few versions of a probit model that predicts the probability (between 0 and 1) of a recession in the United States in the next 12 months. It uses some well-known economic time series data I got from the St. Louis Fed’s website. I got this to work with the following code:

# rebuild the object to include only the data I want in the model
predictors.TS <- cbind(NAPM.TS, FEDFUNDS.TS, IC4WSA.TS, CurveSlope.TS,
                       CreditSpread.TS, MCOILWTI.real.TS, sp500Ret.TS)

recession.probModel <- glm(
  formula = window(lag(recession.TS, k = 12), start = c(1986, 2), end = c(2010, 7)) ~
    window(predictors.TS, start = c(1986, 2), end = c(2010, 7)),
  family = binomial(link = "probit"))

That all works nicely and looks like I expected:

> summary(recession.probModel)

Call:
glm(formula = window(lag(recession.TS, k = 12), start = c(1986, 2), end = c(2010, 7)) ~
    window(predictors.TS, start = c(1986, 2), end = c(2010, 7)),
    family = binomial(link = "probit"))

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.70829  -0.12217  -0.01181   0.00000   2.31322

Coefficients:
                                                                              Estimate Std. Error z value Pr(>|z|)
(Intercept)                                                                  1.513e+01  6.122e+00   2.472  0.01345 *
window(predictors.TS, start = c(1986, 2), end = c(2010, 7))NAPM.TS          -2.128e-01  7.017e-02  -3.032  0.00243 **
window(predictors.TS, start = c(1986, 2), end = c(2010, 7))FEDFUNDS.TS       2.993e-01  1.140e-01   2.626  0.00864 **
window(predictors.TS, start = c(1986, 2), end = c(2010, 7))IC4WSA.TS        -2.822e-05  9.675e-06  -2.917  0.00353 **
window(predictors.TS, start = c(1986, 2), end = c(2010, 7))CurveSlope.TS    -4.451e-01  3.319e-01  -1.341  0.17990
window(predictors.TS, start = c(1986, 2), end = c(2010, 7))CreditSpread.TS  -2.209e-01  1.300e+00  -0.170  0.86507
window(predictors.TS, start = c(1986, 2), end = c(2010, 7))MCOILWTI.real.TS  9.426e-02  2.122e-02   4.442  8.9e-06 ***
window(predictors.TS, start = c(1986, 2), end = c(2010, 7))sp500Ret.TS      -8.458e-01  3.799e+00  -0.223  0.82384
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 222.512  on 293  degrees of freedom
Residual deviance:  97.135  on 286  degrees of freedom
AIC: 113.14

Number of Fisher Scoring iterations: 10

Now what I want to do is use the model I just estimated to predict a recession probability for, say, 2010-8, a period for which I do have data. To do that I tried the following:

predict.glm(object = recession.probModel,
            newdata = window(predictors.TS, start = c(2010, 8), end = c(2010, 8)),
            type = "response")

I expected this to output one data point, but instead it returns a vector of 286 values, none of which is between 0 and 1. Any idea how I can get the predicted probability for Aug 2010, given the data I have for the independent variables? Should I not be trying to do this as a time series? I’m at a bit of a loss here, so any help pointing me in the right direction would be appreciated.

My ultimate goal is to run a rolling estimation of the model, holding out the most recent periods to see how well it does on out-of-sample prediction and to get a forecast of the near future.
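
A minimal sketch of one way this is often handled, not a confirmed fix for the model above. predict() matches newdata to the model terms by variable name, so a formula built from window(...) expressions gives it nothing to match, and it likely falls back to the data used for fitting. Putting the response and predictors into a data frame with plain column names avoids that. Everything below is an assumption for illustration: the data are simulated, and the names dat, rec12, NAPM, FEDFUNDS and h are hypothetical stand-ins for the real series.

```r
# Simulated stand-in data (hypothetical); the real FRED series are not reproduced here.
set.seed(1)
n <- 300   # number of monthly observations
h <- 12    # forecast horizon in months

dat <- data.frame(
  NAPM     = rnorm(n, mean = 50, sd = 5),
  FEDFUNDS = runif(n, min = 0, max = 8)
)

# rec12: 1 if a recession occurs 12 months after the row's date.
# Here it is simulated from a probit relationship purely for illustration.
dat$rec12 <- rbinom(n, size = 1,
                    prob = pnorm(10 - 0.22 * dat$NAPM + 0.2 * dat$FEDFUNDS))

# At month n, the 12-month-ahead outcome is only known for months up to n - h.
train <- dat[1:(n - h), ]
new   <- dat[n, c("NAPM", "FEDFUNDS")]

# Plain column names in the formula let predict() find them in newdata.
fit <- glm(rec12 ~ NAPM + FEDFUNDS, data = train,
           family = binomial(link = "probit"))

# One row of new data in, one predicted probability out.
p <- predict(fit, newdata = new, type = "response")

# Rolling out-of-sample check: for each held-out month i, refit on the rows
# whose outcome would have been known at that time, then predict month i.
holdout <- (n - h + 1):n
oos <- sapply(holdout, function(i) {
  fit_i <- glm(rec12 ~ NAPM + FEDFUNDS, data = dat[1:(i - h), ],
               family = binomial(link = "probit"))
  predict(fit_i, newdata = dat[i, c("NAPM", "FEDFUNDS")], type = "response")
})
```

With the real series, the data frame could presumably be built from the same window() calls used in the glm() call, with the lagged recession indicator as one column and the predictors as the others. The loop uses an expanding window; a fixed-length window would only change the row range passed to data.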