-- begin included message --
I am trying to set up a data set for a survival analysis with time-varying covariates. The data is already in a long format, but does not have a variable to signify the stopping point for the interval. The variable DaysEnrolled is the variable I would like to use to form this interval. This is what I have now:
...
---- end inclusion

I would have expected a dozen solutions from the list - data manipulation
problems usually get a large following. It can be done in 4 lines, assuming
that the parent data set is sorted by subject and time within subject.

    newdata$start <- olddata$DaysEnrolled   # start time = the current variable
    temp <- olddata$DaysEnrolled[-1]        # shift column up by one position
    temp[diff(olddata$id) != 0] <- NA       # NA for last line of each subject
    newdata$stop <- c(temp, NA)             # add the NA for the last subject

I will leave it to others to compress this into a 1-line application of one
of the apply functions. (Unreadable perhaps, but definitely more elegant :-)

Terry T.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
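
For anyone who wants to try the four lines above, here is a minimal sketch on invented data. The column names id and DaysEnrolled follow the post; the values are made up, and copying olddata into newdata first is an assumption, since the post does not show how newdata was created. The last statement is one possible compression using ave(), which may or may not be the kind of one-liner the reply had in mind.

```r
# Toy data (hypothetical values), already sorted by id and by time within id.
olddata <- data.frame(
  id           = c(1, 1, 1, 2, 2, 3),
  DaysEnrolled = c(0, 30, 75, 0, 45, 0)
)

# Assumption: newdata starts as a copy of olddata.
newdata <- olddata

# The four lines from the post.
newdata$start <- olddata$DaysEnrolled   # start time = the current variable
temp <- olddata$DaysEnrolled[-1]        # shift column up by one position
temp[diff(olddata$id) != 0] <- NA       # NA for last line of each subject
newdata$stop <- c(temp, NA)             # add the NA for the last subject

# A possible one-line alternative: shift DaysEnrolled up within each subject.
newdata$stop2 <- ave(olddata$DaysEnrolled, olddata$id,
                     FUN = function(x) c(x[-1], NA))

print(newdata)
# stop and stop2 are both: 30 75 NA 45 NA NA
```

The last row of each subject gets stop = NA because no later row supplies an end point. How to fill or drop those rows depends on follow-up information that the post does not show.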