Here is yet another way of doing it (there is always more than one in R):

# Simulated data frame: year from 1990 to 2003, for 5 different ids, each
# having one or two eif "events"
test<-data.frame(year=rep(1990:2003,5),id=gl(5,length(1990:2003)),
    eif=as.vector(sapply(1:5,function(z){
        a<-rep(0,length(1990:2003))
        a[sample(1:length(1990:2003),sample(1:2,1))]<-1
        a
    })))

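# partition by 'id', then by spells that restart the year after an 'eif' event
test.new <- do.call(rbind, lapply(split(test, test$id), function(.id){
    # lag the cumulative event count by one row so that the event year still
    # belongs to the old spell and the counter resets the following year
    .spell <- cumsum(c(0, head(.id$eif, -1)))
    do.call(rbind, lapply(split(.id, .spell), function(.eif){
        # add the within-spell counter as a new column
        cbind(.eif, conditional_time=seq(nrow(.eif)))
    }))
}))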
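
With about 1.9 million rows, the nested split/rbind above may be slow. A
vectorized variant using ave() is sketched below. It is only a sketch: the
'toy' data frame and the helper column 'spell' are made up for illustration,
and it assumes one row per id-year.

```r
# Hypothetical toy data: two ids, years 1990-1995, events placed by hand
toy <- data.frame(
    year = rep(1990:1995, 2),
    id   = rep(c(1010, 2010), each = 6),
    eif  = c(0, 0, 1, 0, 0, 0,   0, 1, 0, 0, 1, 0)
)

# Sort so that "previous row" really means "previous year of the same id"
toy <- toy[order(toy$id, toy$year), ]

# Number of events seen strictly before each row, within id
# (a cumulative sum lagged by one row); 'spell' is an illustrative name
toy$spell <- ave(toy$eif, toy$id,
                 FUN = function(x) cumsum(c(0, head(x, -1))))

# Count the rows within each (id, spell) pair
toy$conditional_time <- ave(toy$eif, toy$id, toy$spell, FUN = seq_along)
```

The explicit order() call is there so the result does not depend on how the
data frame happened to be sorted beforehand.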



On Sat, May 9, 2009 at 1:40 PM, Vincent Arel-Bundock <vincent.a...@gmail.com> wrote:

>  Hi everyone,
>
> Please forgive me if my question is simple and my code terrible, I'm new to
> R. I am not looking for a ready-made answer, but I would really appreciate
> it if someone could share conceptual hints for programming, or point me
> toward an R function/package that could speed up my processing time.
>
> Thanks a lot for your help!
>
> ##
>
> My dataframe includes the variables 'year', 'id', and 'eif' and has roughly
> 1.9 million id-year observations.
>
> I would like to do 2 things:
>
> -1- I want to create a 'conditional_time' variable, which increases in
> increments of 1 every year, but which resets during year(t) if event 'eif'
> occurred for this 'id' at year(t-1). It should also reset when we switch to
> a new 'id'. For example:
>
> dataframe = test
>  year        id         eif  conditional_time
>
> 1990       1010          0    1
> 1991       1010          0    2
> 1992       1010          1    3
> 1993       1010          0    1
> 1994       1010          0    2
> 1995       1010          0    3
> 1996       1010          0    4
> 1997       1010          1    5
> 1998       1010          0    1
> 1999       1010          0    2
> 2000       1010          0    3
> 2001       1010          0    4
> 2002       1010          0    5
> 2003       1010          0    6
> 1990       2010          0    1
> 1991       2010          0    2
> 1992       2010          0    3
> 1993       2010          0    4
> 1994       2010          0    5
> 1995       2010          0    6
> 1996       2010          0    7
> 1997       2010          0    8
> 1998       2010          0    9
> 1999       2010          0    10
> 2000       2010          0    11
> 2001       2010          1    12
> 2002       2010          0    1
> 2003       2010          0    2
>
> -2- In a copy of the original dataframe, drop all id-year rows that
> correspond to years after a given id has experienced its first 'eif' event.
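
For -2-, one possible sketch is to count the events that occurred strictly
before each row and keep only the rows where that count is zero. The 'toy'
data frame and the names 'prior_events' and 'first_spell' are made up for
illustration, and this assumes the event year itself should be kept.

```r
# Hypothetical toy data: two ids, years 1990-1995, events placed by hand
toy <- data.frame(
    year = rep(1990:1995, 2),
    id   = rep(c(1010, 2010), each = 6),
    eif  = c(0, 0, 1, 0, 0, 0,   0, 1, 0, 0, 1, 0)
)

# Sort so that the lag refers to the previous year of the same id
toy <- toy[order(toy$id, toy$year), ]

# Events seen strictly before each row, within id
prior_events <- ave(toy$eif, toy$id,
                    FUN = function(x) cumsum(c(0, head(x, -1))))

# Keep each id only up to and including the year of its first event;
# 'toy' itself is left untouched, so 'first_spell' acts as the trimmed copy
first_spell <- toy[prior_events == 0, ]
```

An id that never experiences an event keeps all of its rows.
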
>
> I have written the code below to take care of -1-, but it is incredibly
> inefficient. Given the size of my database, and considering how slow my
> computer is, I don't think it's practical to use it. Also, it depends on
> correct sorting of the dataframe, which might generate errors.
>
> ##
>
> # Columns: 1 = year, 2 = id, 3 = eif, 4 = conditional_time
> for (i in 1:nrow(test)) {
>    if (i == 1) {                            # If first id-year
>        cond_time <- 1
>        test[i, 4] <- cond_time
>
>    } else if (test[i-1, 2] != test[i, 2]) {             # If new id
>        cond_time <- 1
>        test[i, 4] <- cond_time
>    } else {                            # Same id as previous row
>        if (test[i, 3] == 0) {
>            test[i, 4] <- sum(cond_time, 1)
>            cond_time <- test[i, 4]
>        } else {                        # Event year: reset for the next row
>            test[i, 4] <- sum(cond_time, 1)
>            cond_time <- 0
>        }
>    }
> }
>
> --
> Vincent Arel
> M.A. Student, McGill University
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
