Dear Mr Hasselman,

I missed your mail while I was typing my own reply to Mr. Barradas's suggestion. In fact, I implemented your suggestion even before reading it. However, I have a concern that I have noted (though it is only hypothetical; such a scenario is very unlikely to occur). Is there a way to restrict such calculations to within each co_code1?
Many thanks,
Brijesh

On Thu, Dec 15, 2016 at 5:48 PM, Berend Hasselman <b...@xs4all.nl> wrote:
>
>> On 15 Dec 2016, at 04:40, Brijesh Mishra <brijeshkmis...@gmail.com> wrote:
>>
>> Hi,
>>
>> I am trying to calculate growth rate (say, sales, though it is to be
>> computed for many variables) in a panel data set. Problem is that I
>> have missing data for many firms for many years. To put it simply, I
>> have created this short dataframe (original df is much bigger)
>>
>> df1<-data.frame(co_code1=rep(c(1100, 1200, 1300), each=7),
>>                 fyear1=rep(1990:1996, 3), sales1=rep(seq(1000,1600, by=100),3))
>>
>> # this gives me
>>    co_code1 fyear1 sales1
>> 1      1100   1990   1000
>> 2      1100   1991   1100
>> 3      1100   1992   1200
>> 4      1100   1993   1300
>> 5      1100   1994   1400
>> 6      1100   1995   1500
>> 7      1100   1996   1600
>> 8      1200   1990   1000
>> 9      1200   1991   1100
>> 10     1200   1992   1200
>> 11     1200   1993   1300
>> 12     1200   1994   1400
>> 13     1200   1995   1500
>> 14     1200   1996   1600
>> 15     1300   1990   1000
>> 16     1300   1991   1100
>> 17     1300   1992   1200
>> 18     1300   1993   1300
>> 19     1300   1994   1400
>> 20     1300   1995   1500
>> 21     1300   1996   1600
>>
>> # I am now removing a couple of rows
>> df1<-df1[-c(5, 8), ]
>> # the result is
>>    co_code1 fyear1 sales1
>> 1      1100   1990   1000
>> 2      1100   1991   1100
>> 3      1100   1992   1200
>> 4      1100   1993   1300
>> 6      1100   1995   1500
>> 7      1100   1996   1600
>> 9      1200   1991   1100
>> 10     1200   1992   1200
>> 11     1200   1993   1300
>> 12     1200   1994   1400
>> 13     1200   1995   1500
>> 14     1200   1996   1600
>> 15     1300   1990   1000
>> 16     1300   1991   1100
>> 17     1300   1992   1200
>> 18     1300   1993   1300
>> 19     1300   1994   1400
>> 20     1300   1995   1500
>> 21     1300   1996   1600
>> # so 1994 for co_code1 1100 and 1990 for co_code1 1200 have been
>> removed. If I try,
>> d<-ddply(df1,"co_code1",transform, growth=c(NA,exp(diff(log(sales1)))-1)*100)
>>
>> # this apparently gives wrong results for the year 1995 (as shown
>> below) as growth rates are computed considering yearly increment.
>>
>>    co_code1 fyear1 sales1    growth
>> 1      1100   1990   1000        NA
>> 2      1100   1991   1100 10.000000
>> 3      1100   1992   1200  9.090909
>> 4      1100   1993   1300  8.333333
>> 5      1100   1995   1500 15.384615
>> 6      1100   1996   1600  6.666667
>> 7      1200   1991   1100        NA
>> 8      1200   1992   1200  9.090909
>> 9      1200   1993   1300  8.333333
>> 10     1200   1994   1400  7.692308
>> 11     1200   1995   1500  7.142857
>> 12     1200   1996   1600  6.666667
>> 13     1300   1990   1000        NA
>> 14     1300   1991   1100 10.000000
>> 15     1300   1992   1200  9.090909
>> 16     1300   1993   1300  8.333333
>> 17     1300   1994   1400  7.692308
>> 18     1300   1995   1500  7.142857
>> 19     1300   1996   1600  6.666667
>> # I thought of using the formula only when the increment of fyear1 is
>> only 1 while in a co_code1, by using this formula
>>
>> d<-ddply(df1,
>>          "co_code1",
>>          transform,
>>          if(diff(fyear1)==1){
>>            growth=(exp(diff(log(df1$sales1)))-1)*100
>>          } else{
>>            growth=NA
>>          })
>>
>> But, this doesn't work. I am getting the following error.
>>
>> In if (diff(fyear1) == 1) { :
>> the condition has length > 1 and only the first element will be used
>> (repeated a few times).
>>
>> # I have searched for a solution, but somehow couldn't get one. Hope
>> that some kind soul will guide me here.
>>
>
> In your case use ifelse() as explained by Rui.
> But it can be done more easily since the fyear1 and co_code1 are synchronized.
> Add a new column to df1 like this
>
> df1$growth <- c(NA,
>                 ifelse(diff(df1$fyear1)==1,
>                        (exp(diff(log(df1$sales1)))-1)*100,
>                        NA
>                 )
>                )
>
> and display df1. From your request I cannot determine if this is what you
> want.
>
> regards,
>
> Berend Hasselman
>
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
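
On the co_code1-wise question at the top of this message: one possible way to keep the differences from crossing firm boundaries is to apply the same ifelse() idea inside each firm, rather than over the whole data frame. What follows is only a sketch in base R. It assumes the rows may not be sorted by year within a firm, and growth_one_firm is a hypothetical helper name, not something from the thread or from plyr.

```r
# Example data from the thread, with the same two rows removed
df1 <- data.frame(co_code1 = rep(c(1100, 1200, 1300), each = 7),
                  fyear1   = rep(1990:1996, 3),
                  sales1   = rep(seq(1000, 1600, by = 100), 3))
df1 <- df1[-c(5, 8), ]

# Hypothetical helper: percentage growth for a single firm.
# Growth is NA for the first row and whenever the previous row
# is not the immediately preceding year.
growth_one_firm <- function(d) {
  d <- d[order(d$fyear1), ]                 # assumption: sort by year first
  d$growth <- c(NA,
                ifelse(diff(d$fyear1) == 1,
                       (exp(diff(log(d$sales1))) - 1) * 100,
                       NA))
  d
}

# Split by firm, compute growth within each piece, and stack the pieces again,
# so diff() never sees rows from two different firms at once
d <- do.call(rbind, lapply(split(df1, df1$co_code1), growth_one_firm))
rownames(d) <- NULL
print(d)
```

With the example data this should leave growth as NA for the first available year of each firm and for 1995 of firm 1100, since 1994 is missing there. Because each firm is handled separately, a firm that happens to start in the year right after the previous firm's last year would still get NA in its first row.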