I know the subject line is confusing, so here is some example data similar to mine:

# 1000 rows: X1 = disease status, X2 = person ID, X3 = clinic date
tmp <- data.frame(matrix(
  c(rbinom(1000, 1, .03),
    array(1:127, c(1000, 1)),
    array(format(seq(ISOdate(1990, 1, 1), by = 'month', length = 56),
                 format = '%d.%m.%Y'),
          c(1000, 1))),
  ncol = 3))
# sort by person ID, then clinic date
tmp <- tmp[with(tmp, order(X2, X3)), ]
table(tmp$X1)

X1 is the variable of interest: disease status. It's a survival-type variable, where you are 0 until you become 1. X2 is the person ID variable. X3 is the clinic date. Here it's monthly, just as an example, but my real data are more complicated: visits are not equally spaced, and the number of clinic visits differs per ID.

Some people stay X1 = 0 for all clinic visits. Only a small proportion become X1 = 1. However, the data have errors I need to clean up. Once someone becomes X1 = 1 they should have no more rows in the dataset, but because of data entry errors some people continue to have rows after that point. Sometimes those rows show X1 = 0 and sometimes X1 = 1. Sometimes there's just one extra row and sometimes there are many.

How can I go through, find the first X1 = 1, and then delete any rows after that, for each value of X2?

Thanks!
Jen
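
One possible way to do the per-ID truncation described above, as a minimal base R sketch rather than a definitive answer. It assumes the rows are already sorted by clinic date within each ID, and that the date column has been parsed with as.Date() so the sort is chronological rather than alphabetical. The function name drop_after_first_event and the small data frame toy are hypothetical, made up for illustration; only the column names X1, X2 and X3 come from the post.

```r
# Hypothetical helper (not from the original post): for each ID, keep rows up to
# and including the first row where status == 1, and drop every later row.
# Assumes df is already sorted by visit date within each ID.
drop_after_first_event <- function(df, id = "X2", status = "X1") {
  # Coerce status to 0/1 integers (works for numeric, character or factor input)
  s <- as.integer(as.character(df[[status]]))
  # For each row, count the events seen on strictly earlier rows of the same ID
  prior <- ave(s, df[[id]], FUN = function(x) cumsum(x) - x)
  # Keep only rows with no earlier event
  df[prior == 0, , drop = FALSE]
}

# Small illustrative data set (made up): ID 1 has two erroneous rows after the
# first event, ID 2 never has an event, ID 3 has one erroneous extra row.
toy <- data.frame(
  X1 = c(0, 0, 1, 0, 1,   0, 0, 0,   1, 1),
  X2 = c(1, 1, 1, 1, 1,   2, 2, 2,   3, 3),
  X3 = c("01.01.1990", "01.02.1990", "01.03.1990", "01.04.1990", "01.05.1990",
         "01.01.1990", "01.02.1990", "01.03.1990",
         "01.01.1990", "01.02.1990")
)

# Parse the dates so that ordering is chronological, then sort by ID and date
toy$X3 <- as.Date(toy$X3, format = "%d.%m.%Y")
toy <- toy[order(toy$X2, toy$X3), ]

cleaned <- drop_after_first_event(toy)
print(cleaned)
```

On the toy data this keeps three rows for ID 1, all three rows for ID 2, and one row for ID 3. The cumsum(x) - x step counts events on earlier rows only, so the row carrying the first X1 = 1 is itself retained.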