On Tue, 26 Mar 2013, SHISHIR MATHUR wrote:
Thanks for the reply, Achim. The reason I suspect autocorrelation is that, within the same neighborhood, homes sold a few months back are likely to affect the price of homes sold subsequently.
This may well be spatial (auto)correlation rather than temporal autocorrelation.
In fact, the DW test and the Breusch-Pagan test both come out significant. So even though the data is not a time series (that is, I do not have repeated observations for the same house), houses sold close in time to each other are in the data set.
If there is a unique ordering of all observations by time, then you could in principle apply an autocorrelation correction for the data, e.g., via Newey-West.
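As a rough illustration of the point above, here is a minimal R sketch using NeweyWest() and vcovHC() from the "sandwich" package together with coeftest() from "lmtest". The data are simulated, and the names homes, sale_time, sqft and price are hypothetical stand-ins, not the variables in the actual data set. The sketch assumes every sale has a distinct time stamp; with only quarter and year of sale there would be many ties, so the ordering would not be unique.

```r
library(sandwich)  # NeweyWest(), vcovHC()
library(lmtest)    # coeftest()

set.seed(1)
n <- 200

# Simulated stand-in for the housing data (all names are hypothetical).
homes <- data.frame(
  sale_time = sample(seq_len(n)),   # unique sale order; rows are deliberately shuffled
  sqft      = runif(n, 800, 3000)
)
homes$price <- 50000 + 100 * homes$sqft + rnorm(n, sd = 20000)

fit <- lm(price ~ sqft, data = homes)

# Heteroskedasticity-consistent (HC) covariance: row order does not matter.
hc_vcov <- vcovHC(fit, type = "HC3")

# Newey-West HAC covariance: observations are sorted by sale_time
# before the autocovariances are computed.
nw_vcov <- NeweyWest(fit, order.by = ~ sale_time, data = homes)

# Same coefficient estimates, different standard errors.
coeftest(fit, vcov. = hc_vcov)
coeftest(fit, vcov. = nw_vcov)
```

Comparing the two coeftest() tables gives a feel for how much the HAC correction changes the standard errors relative to a plain HC correction.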
But from what you describe above, it seems to be more important to capture spatial effects in the data, e.g., by using a spatial lag model (see lagsarlm in "spdep") or by using an additive spatial effect (see e.g. gam in "mgcv").
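The additive spatial effect could look roughly like the following sketch with gam() from "mgcv". It assumes that coordinates are available for each house; lon, lat, sqft and price are hypothetical column names and the data are simulated. A lagsarlm() fit would additionally need a neighbour weights list, and as far as I know the function is provided by the "spatialreg" package in current releases, so it is not shown here.

```r
library(mgcv)  # gam(), s()

set.seed(2)
n <- 300

# Simulated stand-in for the housing data (all column names are hypothetical).
homes <- data.frame(
  lon  = runif(n),               # assumed house coordinates
  lat  = runif(n),
  sqft = runif(n, 800, 3000)
)

# Prices vary smoothly over space on top of the effect of house size.
homes$price <- 50000 + 100 * homes$sqft +
  40000 * sin(2 * pi * homes$lon) * cos(2 * pi * homes$lat) +
  rnorm(n, sd = 10000)

# Linear term for the house attribute plus a smooth surface over location.
fit_gam <- gam(price ~ sqft + s(lon, lat), data = homes, method = "REML")
summary(fit_gam)
```

The s(lon, lat) term absorbs smooth neighborhood-level price variation, so the remaining coefficients are interpreted net of location.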
Thanks,
Shish

On Tue, Mar 26, 2013 at 3:51 PM, Achim Zeileis <achim.zeil...@uibk.ac.at> wrote:

On Tue, 26 Mar 2013, SHISHIR MATHUR wrote:

Hello:

My dataset contains several thousand rows of data, with each row containing information for a house. The variables include the sale price of the house, the quarter and year of sale, the attributes of the house, and the attributes of the neighborhood and the city in which the house is located. The data is for a 10-year period. No house is repeated in the dataset. In summary, the dataset can be termed pooled cross-section data.

My question: Can I estimate Newey-West HAC standard errors for a model that estimates the effect of various independent variables on the sale price of the house? My understanding is that Newey-West can be used for time series and panel data. However, I am not sure whether it can be used for pooled cross-section data. If yes, can you refer me to a specific source, such as a paper or a book?

The result of your aggregation is a cross-section data set. Thus, there should be no correlation between the different observations - or in other terms, the ordering of your observations is completely arbitrary. Consequently, there may be heteroskedasticity but not autocorrelation. So you may use HC standard errors but HAC should not be necessary. (Using HAC standard errors will still be consistent but less efficient.)

--
Best,
Shish

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
Best,
Shishir