This was ensured while using ddply()...

On Thu, Dec 15, 2016 at 6:04 PM, Brijesh Mishra <brijeshkmis...@gmail.com> wrote:
> Dear Mr Hasselman,
>
> I missed your mail while I was typing my own mail as a reply to Mr.
> Barradas' suggestion. In fact, I implemented your suggestion even
> before reading it. But I have a concern that I have noted (though it's
> only hypothetical; such a scenario is very unlikely to occur). Is
> there a way to restrict such calculations co_code1 wise?
>
> Many thanks,
>
> Brijesh
>
> On Thu, Dec 15, 2016 at 5:48 PM, Berend Hasselman <b...@xs4all.nl> wrote:
>>
>>> On 15 Dec 2016, at 04:40, Brijesh Mishra <brijeshkmis...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I am trying to calculate growth rate (say, sales, though it is to be
>>> computed for many variables) in a panel data set. Problem is that I
>>> have missing data for many firms for many years. To put it simply, I
>>> have created this short dataframe (original df is much bigger)
>>>
>>> df1<-data.frame(co_code1=rep(c(1100, 1200, 1300), each=7),
>>> fyear1=rep(1990:1996, 3), sales1=rep(seq(1000,1600, by=100),3))
>>>
>>> # this gives me
>>>    co_code1 fyear1 sales1
>>> 1      1100   1990   1000
>>> 2      1100   1991   1100
>>> 3      1100   1992   1200
>>> 4      1100   1993   1300
>>> 5      1100   1994   1400
>>> 6      1100   1995   1500
>>> 7      1100   1996   1600
>>> 8      1200   1990   1000
>>> 9      1200   1991   1100
>>> 10     1200   1992   1200
>>> 11     1200   1993   1300
>>> 12     1200   1994   1400
>>> 13     1200   1995   1500
>>> 14     1200   1996   1600
>>> 15     1300   1990   1000
>>> 16     1300   1991   1100
>>> 17     1300   1992   1200
>>> 18     1300   1993   1300
>>> 19     1300   1994   1400
>>> 20     1300   1995   1500
>>> 21     1300   1996   1600
>>>
>>> # I am now removing a couple of rows
>>> df1<-df1[-c(5, 8), ]
>>> # the result is
>>>    co_code1 fyear1 sales1
>>> 1      1100   1990   1000
>>> 2      1100   1991   1100
>>> 3      1100   1992   1200
>>> 4      1100   1993   1300
>>> 6      1100   1995   1500
>>> 7      1100   1996   1600
>>> 9      1200   1991   1100
>>> 10     1200   1992   1200
>>> 11     1200   1993   1300
>>> 12     1200   1994   1400
>>> 13     1200   1995   1500
>>> 14     1200   1996   1600
>>> 15     1300   1990   1000
>>> 16     1300   1991   1100
>>> 17     1300   1992   1200
>>> 18     1300   1993   1300
>>> 19     1300   1994   1400
>>> 20     1300   1995   1500
>>> 21     1300   1996   1600
>>> # so 1994 for co_code1 1100 and 1990 for co_code1 1200 have been
>>> removed. If I try,
>>> d<-ddply(df1,"co_code1",transform,
>>> growth=c(NA,exp(diff(log(sales1)))-1)*100)
>>>
>>> # this apparently gives wrong results for the year 1995 (as shown
>>> below) as growth rates are computed considering yearly increment.
>>>
>>>    co_code1 fyear1 sales1    growth
>>> 1      1100   1990   1000        NA
>>> 2      1100   1991   1100 10.000000
>>> 3      1100   1992   1200  9.090909
>>> 4      1100   1993   1300  8.333333
>>> 5      1100   1995   1500 15.384615
>>> 6      1100   1996   1600  6.666667
>>> 7      1200   1991   1100        NA
>>> 8      1200   1992   1200  9.090909
>>> 9      1200   1993   1300  8.333333
>>> 10     1200   1994   1400  7.692308
>>> 11     1200   1995   1500  7.142857
>>> 12     1200   1996   1600  6.666667
>>> 13     1300   1990   1000        NA
>>> 14     1300   1991   1100 10.000000
>>> 15     1300   1992   1200  9.090909
>>> 16     1300   1993   1300  8.333333
>>> 17     1300   1994   1400  7.692308
>>> 18     1300   1995   1500  7.142857
>>> 19     1300   1996   1600  6.666667
>>> # I thought of using the formula only when the increment of fyear1 is
>>> only 1 while in a co_code1, by using this formula
>>>
>>> d<-ddply(df1,
>>> "co_code1",
>>> transform,
>>> if(diff(fyear1)==1){
>>> growth=(exp(diff(log(df1$sales1)))-1)*100
>>> } else{
>>> growth=NA
>>> })
>>>
>>> But, this doesn't work. I am getting the following error.
>>>
>>> In if (diff(fyear1) == 1) { :
>>> the condition has length > 1 and only the first element will be used
>>> (repeated a few times).
>>>
>>> # I have searched for a solution, but somehow couldn't get one. Hope
>>> that some kind soul will guide me here.
>>>
>>
>> In your case use ifelse() as explained by Rui.
>> But it can be done more easily since the fyear1 and co_code1 are
>> synchronized.
>> Add a new column to df1 like this
>>
>> df1$growth <- c(NA,
>> ifelse(diff(df1$fyear1)==1,
>> (exp(diff(log(df1$sales1)))-1)*100,
>> NA
>> )
>> )
>>
>> and display df1. From your request I cannot determine if this is what
>> you want.
>>
>> regards,
>>
>> Berend Hasselman
>>
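
For reference, a minimal sketch of one way to restrict the calculation
co_code1 wise, so that a growth rate is never computed across two
different firms. It combines the ifelse() test on diff(fyear1) quoted
above with a per-firm split, using base R only rather than the ddply()
call used in the thread. The helper name growth_by_year is hypothetical
and not from the original posts, and the sort step assumes the rows may
not already be ordered by firm and year.

```r
# Toy data from the thread, with the same two rows removed
df1 <- data.frame(co_code1 = rep(c(1100, 1200, 1300), each = 7),
                  fyear1   = rep(1990:1996, 3),
                  sales1   = rep(seq(1000, 1600, by = 100), 3))
df1 <- df1[-c(5, 8), ]

# Sort so that diff() compares consecutive years within each firm
df1 <- df1[order(df1$co_code1, df1$fyear1), ]

# Hypothetical helper: growth in percent for one firm,
# NA for the first row and wherever the year gap is not exactly 1
growth_by_year <- function(sales, year) {
  c(NA, ifelse(diff(year) == 1,
               (exp(diff(log(sales))) - 1) * 100,
               NA))
}

# Apply the helper to each co_code1 block separately, then put the
# pieces back in the original row order
df1$growth <- unsplit(
  Map(growth_by_year,
      split(df1$sales1, df1$co_code1),
      split(df1$fyear1, df1$co_code1)),
  df1$co_code1)

print(df1)
```

On this toy data the per-firm version and the whole-column version
quoted above give the same growth column, because the year difference
at each firm boundary is negative. The split only matters in the
hypothetical case where one firm's last year is immediately followed by
the next firm's first year.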
______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.