Dear R users,
I am encountering the following problem: I have a dataset with a 'unique_id'
and several 'visit_date' values (converted with as.Date, format "%d/%m/%Y")
per unique_id. I would like to create a new variable holding the most recent
visit date per unique_id, as shown below.
unique_id  visit_date  last_visit_date
1          01/06/2010  01/06/2011
1          01/01/2011  01/06/2011
1          01/06/2011  01/06/2011
2          01/01/2009  01/07/2011
2          01/06/2009  01/07/2011
2          01/06/2010  01/07/2011
2          01/01/2011  01/07/2011
2          01/07/2011  01/07/2011
3          01/01/2008  01/01/2008
4          01/01/2009  01/01/2010
4          01/01/2010  01/01/2010
I know how to do this easily in Stata, SAS, and Excel, but I cannot find how
to do it in R. After looking through previous postings, I tried several
functions such as tapply(), ave(), ddply(), and transform(). The code runs,
but either only NA values are generated or I get error messages saying that
the replacement has fewer rows than the data (there are about 1000 unique_id
values and over 4000 rows in my dataset at present).
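To make the goal concrete, here is a minimal sketch of one way the result might
be obtained with ave(), using only the example rows above. The data frame name
'visits' and the intermediate 'last_num' are hypothetical names chosen for
illustration, and the approach (taking the maximum of the numeric day count
within each unique_id, then converting back to Date) is an assumption, not
necessarily the best or only method.

```r
# Hypothetical data frame reproducing the example rows shown above
visits <- data.frame(
  unique_id  = c(1, 1, 1, 2, 2, 2, 2, 2, 3, 4, 4),
  visit_date = as.Date(c("01/06/2010", "01/01/2011", "01/06/2011",
                         "01/01/2009", "01/06/2009", "01/06/2010",
                         "01/01/2011", "01/07/2011", "01/01/2008",
                         "01/01/2009", "01/01/2010"),
                       format = "%d/%m/%Y")
)

# ave() returns one value per input row, so the result always has the same
# number of rows as the data. The maximum is taken on the numeric day count
# to sidestep Date class handling inside ave().
last_num <- ave(as.numeric(visits$visit_date), visits$unique_id, FUN = max)

# Convert the day counts back to Date (days since 1970-01-01)
visits$last_visit_date <- as.Date(last_num, origin = "1970-01-01")

# Display in the same day/month/year layout as the example table
print(transform(visits,
                visit_date      = format(visit_date, "%d/%m/%Y"),
                last_visit_date = format(last_visit_date, "%d/%m/%Y")))
```

Note that max() returns NA for any unique_id that has a missing visit_date, and
as.Date() returns NA for strings that do not match the given format, so either
could explain a column of NA values.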
I would greatly appreciate it if someone could help me.
Thank you!
Kathleen R.
Epidemiologist
Montreal, QC, Canada
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help