Hi R Users,

I am hoping someone can give me some pointers on alternative code to the for loop described below.

I have a dataset which is ordered by subject ID and date. What I would like to do is create a new variable that numbers the entries for each person (e.g. 1, 2, 3, ...). As an example, if we have subjects A, B and C, all with multiple entries (I have excluded the date variable for simplicity), the for loop below achieves the desired result. However, my dataset is big (1 million+ observations) and the for loop is slow. Is there a more efficient way of getting to the desired result?

Many thanks in advance
Toni

A <- data.frame(ID=c('A','A','A','A','B','B','B',
                     'C','C','C','C','C'))

   ID
1   A
2   A
3   A
4   A
5   B
6   B
7   B
8   C
9   C
10  C
11  C
12  C

A$Session_ID <- 0
previous_ID <- ''
current_index <- 1
for ( i in seq(1,nrow(A)) ) {
  if (A$ID[i] != previous_ID) {current_index <- 1}
  A$Session_ID[i] <- current_index
  previous_ID <- A$ID[i]
  current_index <- current_index + 1
}

   ID Session_ID
1   A          1
2   A          2
3   A          3
4   A          4
5   B          1
6   B          2
7   B          3
8   C          1
9   C          2
10  C          3
11  C          4
12  C          5
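
For reference, one possible vectorised alternative to the loop is sketched below. It is not part of the original post, and it has not been timed on a dataset of the size described, so treat it as a starting point rather than a definitive answer. It uses only base R (ave, rle, sequence). The column name Session_ID_rle is made up here purely so the two options can be compared side by side. The second option assumes the rows are already sorted so that each ID forms one contiguous run, as stated in the post.

```r
# Same toy data as in the post
A <- data.frame(ID = c('A','A','A','A','B','B','B',
                       'C','C','C','C','C'))

# Option 1: ave() splits the row positions by ID, applies seq_along()
# within each group, and returns the result in the original row order.
A$Session_ID <- ave(seq_along(A$ID), A$ID, FUN = seq_along)

# Option 2 (assumes each ID occupies one contiguous block of rows):
# rle() gives the length of each run of identical IDs, and sequence()
# expands those lengths into 1..n counters, one per run.
runs <- rle(as.character(A$ID))
A$Session_ID_rle <- sequence(runs$lengths)

print(A)
```

On the toy data both columns should match the Session_ID produced by the loop. If the data were not sorted by ID, only the ave() version would still number entries per subject; the rle() version would restart the counter at every change of ID.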