Dear Ivo, dear list, (see: Message: 70 Date: Thu, 26 Mar 2009 21:39:19 +0000 From: ivo...@gmail.com Subject: [R] pgmm (Blundell-Bond) sample needed)
I think I finally figured out how to replicate your supersimple GMM example with pgmm() so as to get the very same results as Stata. Having no other regressors in the formula initially drove me crazy. This was a case where simpler models are trickier than more complicated ones! For the benefit of other GMM people on this list, here's a brief résumé of our rather long private mail exchange of these days, answering to some other pgmm()-related posts which have appeared on this list lately. Sorry for the overlong posting but it might be worth the space. I will refer to the very good Stata tutorial by David Roodman that Ivo himself pointed me to, which gives a nice (and free) theoretical intro as well. Please (the others) find it here: http://repec.org/nasug2006/howtodoxtabond2.cgdev.pdf. As far as textbooks are concerned, Arellano's panel data book (Oxford) is the theoretical reference I would suggest. There have been two separate issues: - syntax (how to get the right model) - small sample behaviour (minimal time dimension to get estimates) I'll start with this last one, then provide a quick "Rosetta stone" of pgmm() and Stata commands producing the same results. The established benchmarks for dynamic panels' GMM are the DPD routines written by Arellano et al. for Gauss and later Ox, but Stata is proven to give the same results, and it is the established general reference for panel data. Lastly I will add the usual examples found in the literature, although they are very close relatives of 'example(pgmm)', so as to show the correspondence between the models. 1) Small samples and N-asymptotics: GMM needs big N, small T. Else you end up having more instruments than observations and you get a "singular matrix" error (which, as Ivo correctly found out, happens in the computation of the optimal weights' matrix). While this is probably going to be substituted with a more descriptive error message, it still explains you the heart of the matter. Yet Stata gives you estimates in this case as well: as I suspected, it is because it uses a generalized inverse (see Roodman's tutorial, 2.6). This looks theoretically ok. Whether this is meaningful in applied practice is an issue I will discuss with the package maintainer. IMHO it is not, apart maybe for illustrative purposes, and it might well encourage bad habits (see the discussion about (not) fitting the Grunfeld model by GMM on this list, some weeks ago). 2) fitting the simple models Simplest possible model: AR(1) with individual effects x(i,t)= a*(x(i,t-1)) + bi + c This is what Ivo asked for in the first place. As the usual example is on data from the Arellano and Bond paper, available in package 'plm' as > data(EmplUK) I'll use log(emp) from this dataset as 'x', for ease of reproducibility. Same data are available in Stata by 'use "http://www.stata-press.com/data/r7/abdata.dta"'. The Stata dataset is identical but for the variable names and the fact that in Stata you have to generate logs beforehand (ugh!). I'm also adding the 'nomata' option to avoid complications, but this will be unnecessary on most systems (not on mine...). The system-GMM estimator (with robust SEs) in Stata is 'xtabond2 n nL1, gmm(L.(n)) nomata robust' whose R equivalent is: > sysmod<-pgmm( dynformula( log(emp) ~ 1, list(1)), data=EmplUK, > gmm.inst=~log(emp), lag.gmm=c(2,99), + effect="individual", model="onestep", transformation="ld" ) > summary(sysmod, robust=TRUE) (note that although 'summary(sysmod)' does not report a constant, it's actually there; this is an issue to be checked). while the difference-GMM is 'xtabond2 n nL1, gmm(L.(n)) noleveleq nomata robust', in R: > diffmod<-pgmm( dynformula( log(emp) ~ 1, list(1)), data=EmplUK, > gmm.inst=~log(emp), lag.gmm=c(2,99), + effect="individual", model="onestep", transformation="d" ) > summary(diffmod,robust=TRUE) The particular model Ivo asked for, using only lags 2-4 as instruments, is 'xtabond2 x lx, gmm(L.(x),lag(1 3)) robust' in Stata and only requires to set 'lag.gmm=c(2,4)' in the 'sysmod' above (notice the difference in the lags specification!). Note also that, unlike Ivo, I am using robust covariances. 3) fitting the standard examples from the literature. 'example(pgmm)' is a somewhat simplified version of the standard Arellano-Bond example. For better comparability, here I am replicating the results from the abest.do Stata script from http://ideas.repec.org/c/boc/bocode/s435901.html (i.e., the results of the Arellano and Bond paper done via xtabond2). The same output is also to be found in Roodman's tutorial, 3.3. Here's how to replicate the output of abest.do: (must execute the preceding lines in the file as well for data transf.) * Replicate difference GMM runs in Arellano and Bond 1991, Table 4 * Column (a1) xtabond2 n L(0/1).(l.n w) l(0/2).(k ys) yr198?c cons, gmm(L.n) iv(L(0/1).w l(0/2).(k ys) yr198?c cons) noleveleq noconstant robust nomata replicated by: > abmod1<-pgmm(dynformula(log(emp)~log(wage)+log(capital)+log(output),list(2,1,2,2)), + data=EmplUK, effect="twoways", model="onestep", + gmm.inst=~log(emp), lag.gmm=list(c(2,99)), transformation="d") > summary(abmod1,robust=TRUE) * Column (a2) xtabond2 n L(0/1).(l.n w) l(0/2).(k ys) yr198?c cons, gmm(L.n) iv(L(0/1).w l(0/2).(k ys) yr198?c cons) noleveleq noconstant two nomata replicated by: > mymod2<-pgmm(dynformula(log(emp)~log(wage)+log(capital)+log(output),list(2,1,2,2)), + data=EmplUK, effect="twoways", model="twosteps", + gmm.inst=~log(emp), lag.gmm=list(c(2,99)), transformation="d") > summary(mymod2) * Column (b) xtabond2 n L(0/1).(l.n ys w) k yr198?c cons, gmm(L.n) iv(L(0/1).(ys w) k yr198?c cons) noleveleq noconstant two nomata replicated by: > abmod3<-pgmm(dynformula(log(emp)~log(wage)+log(capital)+log(output),list(2,1,0,1)), + data=EmplUK, effect="twoways", model="twosteps", + gmm.inst=~log(emp), lag.gmm=list(c(2,99)), transformation="d") > summary(abmod3) The system versions from bbest.do (ibid.) can be estimated setting 'transformation="ld"' along the lines in 2) Yves has done a great job, but any GMM estimator is bound to have a lot of switches. I hope this helps, together with Roodman's paper, in putting instruments and lags into place. Sure it helped me to better figure out what pgmm() does. Thanks to Ivo for motivating me and for ultimately finding most of the answers by himself :^) Giovanni Giovanni Millo Research Dept., Assicurazioni Generali SpA Via Machiavelli 4, 34132 Trieste (Italy) tel. +39 040 671184 fax +39 040 671160 Ai sensi del D.Lgs. 196/2003 si precisa che le informazi...{{dropped:13}} ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.