Hi everyone,

Please forgive me if my question is simple and my code is terrible; I'm new to R. I am not looking for a ready-made answer, but I would really appreciate it if someone could share conceptual hints for programming, or point me toward an R function/package that could speed up my processing time.
Thanks a lot for your help!

##

My dataframe includes the variables 'year', 'id', and 'eif' and has roughly 1.9 million id-year observations.

I would like to do 2 things:

-1- I want to create a 'conditional_time' variable, which increases in increments of 1 every year, but which resets during year(t) if event 'eif' occurred for this 'id' at year(t-1). It should also reset when we switch to a new 'id'. For example:

dataframe = test

year   id     eif   conditional_time
1990   1010   0     1
1991   1010   0     2
1992   1010   1     3
1993   1010   0     1
1994   1010   0     2
1995   1010   0     3
1996   1010   0     4
1997   1010   1     5
1998   1010   0     1
1999   1010   0     2
2000   1010   0     3
2001   1010   0     4
2002   1010   0     5
2003   1010   0     6
1990   2010   0     1
1991   2010   0     2
1992   2010   0     3
1993   2010   0     4
1994   2010   0     5
1995   2010   0     6
1996   2010   0     7
1997   2010   0     8
1998   2010   0     9
1999   2010   0     10
2000   2010   0     11
2001   2010   1     12
2002   2010   0     1
2003   2010   0     2

-2- In a copy of the original dataframe, drop all id-year rows that correspond to years after a given id has experienced its first 'eif' event.

I have written the code below to take care of -1-, but it is incredibly inefficient. Given the size of my database, and considering how slow my computer is, I don't think it's practical to use it. Also, it depends on the dataframe being sorted correctly (by id, then year), which might generate errors.

##

# Columns: 1 = year, 2 = id, 3 = eif, 4 = conditional_time
for (i in 1:nrow(test)) {
    if (i == 1) {
        # If first id-year
        cond_time <- 1
        test[i, 4] <- cond_time
    } else if (test[i - 1, 2] != test[i, 2]) {
        # If new id
        cond_time <- 1
        test[i, 4] <- cond_time
    } else {
        # Same id as previous row
        if (test[i, 3] == 0) {
            # No event this year: keep counting
            test[i, 4] <- cond_time + 1
            cond_time <- test[i, 4]
        } else {
            # Event this year: count it, then restart next year
            test[i, 4] <- cond_time + 1
            cond_time <- 0
        }
    }
}

-- 
Vincent Arel
M.A. Student, McGill University

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
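
For reference, here is a minimal base-R sketch of one vectorized way to approach both tasks without a row-by-row loop. It is an illustration under stated assumptions, not a definitive answer: it rebuilds the toy 'test' dataframe from the example above, assumes 'eif' is coded 0/1 with no missing values, and sorts by id and year first so the result does not depend on the incoming row order. The helper names ('new_id', 'prev_eif', 'start', 'events_before', 'first_spell') are made up for this sketch.

```r
# Toy data reproducing the example table (assumption: eif is 0/1, no NAs)
test <- data.frame(
  year = rep(1990:2003, times = 2),
  id   = rep(c(1010, 2010), each = 14),
  eif  = 0
)
test$eif[test$id == 1010 & test$year %in% c(1992, 1997)] <- 1
test$eif[test$id == 2010 & test$year == 2001] <- 1

# Sort explicitly so the logic does not rely on the incoming order
test <- test[order(test$id, test$year), ]
n <- nrow(test)

# -1- conditional_time
# A row starts a new count if it is the first row of an id,
# or if the previous row (same id) had eif == 1.
new_id   <- c(TRUE, test$id[-1] != test$id[-n])
prev_eif <- c(0, test$eif[-n])
start    <- new_id | prev_eif == 1

# Row index minus the index of the most recent start row, plus 1
idx <- seq_len(n)
test$conditional_time <- idx - cummax(ifelse(start, idx, 0L)) + 1L

# -2- keep rows up to and including each id's first event
# events_before = number of events for this id in strictly earlier rows
events_before <- ave(test$eif, test$id, FUN = function(x) cumsum(x) - x)
first_spell   <- test[events_before == 0, ]
```

The first part uses only whole-vector operations (shifted comparisons and 'cummax'), so it should scale to a couple of million rows far better than indexing the dataframe one cell at a time. The second part uses 'ave' to apply a cumulative sum within each id; it keeps the year of the first event itself and drops everything after it, which is one reading of task -2-.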