Hello,

I'm new to R and am currently learning to use the AER package, which is extremely comprehensive and useful. I have one question about the diagnostics reported after ivreg.

If I understand correctly, the Sargan test it reports treats the statistic as following a chi-squared distribution with degrees of freedom equal to the number of excluded instruments minus one. But I have read many times that the degrees of freedom of this statistic should equal the number of overidentifying restrictions, i.e. the number of excluded instruments minus the number of endogenous variables.

When I compare with Stata (estat overid after ivreg, and likewise the ivreg2 output), the statistic is the same as the one R reports; only the p-value differs, because a different reference distribution is used. Is this command using a different flavor of the Sargan test? I did not find the details in the AER PDF manual.

I'm using RStudio with R 3.0.2 (Windows 7), and AER is up to date.

The data come from Verbeek's econometrics textbook, and the example replicates the one in the book. The dependent variable is the log of wage; the endogenous variables are education, experience and its square (3 of them); the excluded instruments are parents' education etc. (6 of them). In the output I get from R, shown below, the Sargan df is 5, whereas I expected 6 - 3 = 3.
> ivmodel <- ivreg(lwage76 ~ ed76 + exp76 + exp762 + black + smsa76 + south76 |
+     daded + momed + libcrd14 + age76 + age762 + nearc4 + black + smsa76 +
+     south76, data = school)
> summary(ivmodel, diagnostics = TRUE)

Call:
ivreg(formula = lwage76 ~ ed76 + exp76 + exp762 + black + smsa76 +
    south76 | daded + momed + libcrd14 + age76 + age762 + nearc4 +
    black + smsa76 + south76, data = school)

Residuals:
     Min       1Q   Median       3Q      Max
-1.63375 -0.22253  0.02403  0.24350  1.32911

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  4.6064811  0.1126195  40.903  < 2e-16 ***
ed76         0.0848507  0.0066061  12.844  < 2e-16 ***
exp76        0.0796432  0.0164406   4.844 1.34e-06 ***
exp762      -0.0020376  0.0008257  -2.468   0.0136 *
black       -0.1726723  0.0195231  -8.845  < 2e-16 ***
smsa76       0.1521693  0.0165207   9.211  < 2e-16 ***
south76     -0.1204765  0.0154904  -7.778 1.01e-14 ***

Diagnostic tests:
                  df1  df2 statistic p-value
Weak instruments    6 2987   965.450  <2e-16 ***
Wu-Hausman          2 2988     1.949   0.143
Sargan              5   NA     3.868   0.569
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.3753 on 2990 degrees of freedom
Multiple R-Squared: 0.2868, Adjusted R-squared: 0.2854
Wald test: 178.6 on 6 and 2990 DF,  p-value: < 2.2e-16

Could this be because I am estimating the IV model by 2SLS rather than GMM (at least I suppose so)?

I apologize if this stems from a misunderstanding on my part, and I thank you in advance for your help.

Best,
H. Huber
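
P.S. To make the discrepancy concrete, here is a minimal sketch in base R that recomputes the Sargan p-value from the reported statistic (3.868) under both candidate degrees of freedom. It assumes only the numbers quoted in this message; the variable names are mine, chosen for illustration, and nothing here calls AER itself.

```r
# Sargan statistic as printed by summary(ivmodel, diagnostics = TRUE).
sargan_stat <- 3.868

# Counts taken from the model description (illustrative names, not AER objects).
n_excluded_instruments <- 6  # daded, momed, libcrd14, age76, age762, nearc4
n_endogenous <- 3            # ed76, exp76, exp762

# Degrees of freedom shown in the R output versus the number of
# overidentifying restrictions I expected.
df_reported <- 5
df_expected <- n_excluded_instruments - n_endogenous

# Upper-tail chi-squared probabilities for the same statistic.
p_reported <- pchisq(sargan_stat, df = df_reported, lower.tail = FALSE)
p_expected <- pchisq(sargan_stat, df = df_expected, lower.tail = FALSE)

print(round(c(df_5 = p_reported, df_3 = p_expected), 3))
```

The first value should come out close to the 0.569 in the diagnostics table. The second is the p-value I would have expected with 3 degrees of freedom. In this example neither rejects at conventional levels, but the choice of degrees of freedom could matter for borderline statistics.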
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.