Hi R Users

I am hoping someone can suggest a faster alternative to the for loop shown 
below.

I have a dataset ordered by subject ID and date. I would like to create a new 
variable that numbers the entries for each person (e.g. 1, 2, 3, ...).

As an example, suppose we have subjects A, B and C, each with multiple entries 
(I have excluded the date variable for simplicity). The for loop below achieves 
the desired result, but my dataset is big (over 1 million observations) and the 
loop is slow. Is there a more efficient way of getting to the desired result?

Many thanks in advance

Toni 


A <- data.frame(ID=c('A','A','A','A','B','B','B', 'C','C','C','C','C'))

   ID
1   A
2   A
3   A
4   A
5   B
6   B
7   B
8   C
9   C
10  C
11  C
12  C


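# Number each subject's entries 1, 2, 3, ... (assumes rows are sorted by ID)
A$Session_ID <- 0
previous_ID <- ''
current_index <- 1
for (i in seq_len(nrow(A))) {
  # Restart the counter whenever a new subject starts
  if (A$ID[i] != previous_ID) {
    current_index <- 1
  }
  A$Session_ID[i] <- current_index
  previous_ID <- A$ID[i]
  current_index <- current_index + 1
}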

 

   ID Session_ID
1   A          1
2   A          2
3   A          3
4   A          4
5   B          1
6   B          2
7   B          3
8   C          1
9   C          2
10  C          3
11  C          4
12  C          5
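
For comparison, here is a loop-free sketch of the same numbering using only base R. It assumes ID is stored as a character vector (hence stringsAsFactors = FALSE and the as.character() call). The column name Session_ID_rle is made up for illustration. This has not been timed on the full dataset, so treat it as a starting point rather than a benchmarked solution.

```r
# Same example data as above; keep ID as character rather than factor
A <- data.frame(ID = c('A','A','A','A','B','B','B','C','C','C','C','C'),
                stringsAsFactors = FALSE)

# Option 1: ave() applies seq_along() within each ID group and puts the
# results back in the original row order, so it does not depend on sorting
A$Session_ID <- ave(seq_along(A$ID), A$ID, FUN = seq_along)

# Option 2: relies on the rows being sorted so that each ID forms one
# contiguous run; rle() returns the run lengths and sequence() expands
# each length n into 1..n
A$Session_ID_rle <- sequence(rle(as.character(A$ID))$lengths)
```

Both options avoid assigning into the data frame one row at a time. Option 2 mirrors the loop exactly (the counter restarts whenever the ID changes between consecutive rows), while Option 1 numbers by ID regardless of row order. The two agree when the data are sorted by ID.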
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
