Hello,

I am having a great deal of difficulty running a simple linear regression model with entity and time fixed effects and HAC standard errors. I have a data set with 3 million observations and 30 variables. My data are structured as follows:

NAME  STATE  YEAR  Y  X1  X2
1     1      2012  1  1   1
2     1      2012  1  2   7
3     1      2012  1  1   2
4     2      2012  2  4   5
etc. ...

For every state in every year, there are about 10,000 rows, each corresponding to an individual observation. This is not a longitudinal dataset: an individual surveyed in year 2000 in state 1 is never spoken to again. Nonetheless, I still wish to control for geographical and time fixed effects. To do so, I run the following:

> load("data.frame.rda")
> library(sandwich)
> library(pcse)
> model <- lm(data.frame$Y ~ data.frame$X1 + data.frame$X2 +
      as.factor(data.frame$STATE) + as.factor(data.frame$YEAR))
> vcovHAC(model, prewhite = FALSE, adjust = FALSE, sandwich = TRUE,
      ar.method = "ols")

R does not return any results, yet appears to be computing them. This goes on for 4 hours or more.

I wanted to run the following:

> library(pcse)
> model <- lm(data.frame$Y ~ data.frame$X1 + data.frame$X2 +
      as.factor(data.frame$STATE) + as.factor(data.frame$YEAR))
> model.pcse <- pcse(model, groupN = data.frame$STATE, groupT = data.frame$YEAR)

But I get the error:

Error in pcse(model, groupN = BRFSS_OBESEBALANCED$X_STATE, groupT = BRFSS_OBESEBALANCED$YEAR) :
  There cannot be more than nCS*nTS rows in the using data!

If there are any workarounds for this problem, I would greatly appreciate learning about them.

Thanks,

Nicholas Pretnar
University of Missouri, Economics
npret...@gmail.com
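
P.S. For reproducibility, here is a minimal sketch of the same setup on simulated data. The data frame name `dat`, the cell sizes, and the coefficient values are all made up for illustration and are not taken from the real data set; only the column layout (NAME, STATE, YEAR, Y, X1, X2) follows the description above. It uses base R only.

```r
# Simulated repeated cross-section: each respondent appears exactly once.
# All names and sizes below are illustrative assumptions.
set.seed(1)
n_states   <- 5
n_years    <- 3
n_per_cell <- 100   # the real data have roughly 10,000 per state-year

dat <- expand.grid(id    = seq_len(n_per_cell),
                   STATE = seq_len(n_states),
                   YEAR  = 2010 + seq_len(n_years))
dat$NAME <- seq_len(nrow(dat))   # one unique respondent per row
dat$X1   <- rnorm(nrow(dat))
dat$X2   <- rnorm(nrow(dat))
dat$Y    <- 1 + 0.5 * dat$X1 - 0.25 * dat$X2 + 0.1 * dat$STATE + rnorm(nrow(dat))

# State and year fixed effects entered as dummy variables.
model <- lm(Y ~ X1 + X2 + factor(STATE) + factor(YEAR), data = dat)

# Rows per state-year cell: many respondents share each cell, so the
# number of rows exceeds (number of states) * (number of years).
cell_counts <- table(dat$STATE, dat$YEAR)
n_rows  <- nrow(dat)
n_cells <- n_states * n_years
print(cell_counts)
print(c(rows = n_rows, cells = n_cells))
```

In this sketch, as in my real data, every state-year cell holds many rows rather than one, so the total row count is far larger than the number of states times the number of years. I presume this is what the pcse error message is objecting to, but I am not certain.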