Hi, I'm new to R. For my dissertation, I have panel data on 48 Sub-Saharan countries (cross-sectional index 'i') over 55 years, 1960-2014 (time-series index 't'). The variables read into R from a text file are in levels. Because of reverse causality, the 2SLS regression will be based on changes in the levels, so I need to difference the data within each cross-sectional unit 'i'.
There are nearly 50 variables in total, but the model will essentially regress the differenced Yit ~ X1it+X2it+X3it+X4it+X5it+X6it, with a dummy variable attached to each of the differenced Xs. Because of missing data, R originally classified each X and Y variable as a 'factor'; I subsequently changed them to 'numeric' with the 'as.numeric' command. However, when I run the following dplyr commands solely to difference Yit (Yit - Yi[t-1]) into a new variable dYit, I receive error messages to the effect that Yit and each of the X variables are 'factors'.

> library (dplr)
> dt = CSUdata2 %>% group_by (i) %>% (dYit=Yit-lag(Yit))

'CSUdata2' is the object in which the tab-delimited text file dataset is stored.

Questions:

Any idea why dplyr reads the variables as 'factors'? Running class() on each variable shows that R treats each Y and X as 'numeric'.

Is the command to difference Yit written correctly? I plan to use the same command for each variable that needs differencing until I understand the commands better.

Thank you.
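One possible reading of the situation above, offered as a sketch rather than a diagnosis: calling as.numeric() directly on a factor returns its internal level codes rather than the printed values, and the pipeline as posted has no mutate() call around the dYit expression. In the example below, the toy data frame, the list of missing-value codes, and the helper to_numeric() are assumptions made purely for illustration; only the names CSUdata2, i, t, Yit and dYit come from the post.

```r
library(dplyr)

# Toy stand-in for CSUdata2 (hypothetical values): two countries, three years.
# Yit is stored as a factor because one cell holds a non-numeric missing code.
CSUdata2 <- data.frame(
  i   = c(1, 1, 1, 2, 2, 2),
  t   = c(1960, 1961, 1962, 1960, 1961, 1962),
  Yit = factor(c("10.5", "12.0", ".", "3.0", "4.5", "7.0"))
)

# Pitfall: as.numeric() on a factor gives the integer level codes,
# not the numbers that are printed.
codes <- as.numeric(CSUdata2$Yit)

# Hypothetical helper: go through character first so the printed values are
# recovered, and turn the assumed missing-value codes into NA.
to_numeric <- function(x) {
  x <- as.character(x)
  x[x %in% c("", ".", "NA")] <- NA
  as.numeric(x)
}

dt <- CSUdata2 %>%
  mutate(Yit = to_numeric(Yit)) %>%          # convert the factor to real numbers
  arrange(i, t) %>%                          # make sure years are in order
  group_by(i) %>%                            # difference within each country
  mutate(dYit = Yit - dplyr::lag(Yit)) %>%   # first year of each country is NA
  ungroup()

print(dt)
```

If the factors come from missing-value codes in the text file, declaring those codes through the na.strings argument when reading the file (and setting stringsAsFactors = FALSE) may avoid the conversion step altogether. With a recent dplyr, across() inside mutate() could apply the same differencing to several columns at once instead of repeating the command per variable.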