wookie1976 wrote:
>
> I am in the process of switching from SAS over to R. I am working on very
> large CSV datasets that contain vehicle information. As I am processing
> the data, I need to select the first (or sometimes the second) record (by
> date) for any records that have the same license plate number. In SAS,
> there is a function called 'first.' that can be used on sorted datasets to
> pull out those first entries for each occurrence of a particular variable
> (in this case the variable is 'license plate') found in the data. I have
> spent some time looking around and cannot seem to find an equivalent
> function in R. Can anyone recommend an efficient technique that would
> pull this off? I assume the database must first be sorted by vehicle
> plate and date, and then apply the filter or function. Any help would be
> greatly appreciated.
>
> Thanks, Joe
>
To select the first or last elements of a list, data frame, or matrix, look at the head() and tail() functions. The split() function can divide a data.frame into smaller pieces based on a factor such as the year or license plate. You can combine split() with another function such as head() by using the base function by() or a function like ddply() from Hadley's plyr package.

To give an example, I would need some example data, preferably pasted as the output of dput(). Tabular data tends to get mangled in email and has to be reprocessed and reformatted before it can be loaded as an R object.

-Charlie

--
View this message in context: http://n4.nabble.com/First-Last-Data-row-selection-tp1566260p1566418.html
Sent from the R help mailing list archive at Nabble.com.
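
The approach described above (sort, split() by plate, then head() or tail() on each piece) could look roughly like the sketch below. No data was posted in the thread, so the `vehicles` data frame and its column names (`plate`, `date`, `odometer`) are invented purely for illustration. The `duplicated()` shortcut at the end is not mentioned in the reply; it is shown as one possible alternative that may be faster on very large data.

```r
# Hypothetical example data: names and values are made up for illustration.
vehicles <- data.frame(
  plate    = c("ABC123", "XYZ789", "ABC123", "XYZ789", "ABC123"),
  date     = as.Date(c("2009-03-01", "2009-01-15", "2009-01-10",
                       "2009-02-20", "2009-02-05")),
  odometer = c(30500, 12000, 30000, 12800, 30250),
  stringsAsFactors = FALSE
)

# Sort by plate, then by date within each plate.
vehicles <- vehicles[order(vehicles$plate, vehicles$date), ]

# Split into one data frame per plate.
by_plate <- split(vehicles, vehicles$plate)

# First and last record per plate: head()/tail() on each piece, then recombine.
first_rec <- do.call(rbind, lapply(by_plate, head, n = 1))
last_rec  <- do.call(rbind, lapply(by_plate, tail, n = 1))

# Second record per plate; plates with only one record contribute no row.
second_rec <- do.call(rbind, lapply(by_plate, function(d) {
  d[seq_len(nrow(d)) == 2, , drop = FALSE]
}))

# Alternative (an assumption, not from the reply): on sorted data,
# duplicated() flags every row after the first for each plate.
first_dup <- vehicles[!duplicated(vehicles$plate), ]
```

Calling `do.call(rbind, ...)` stitches the per-plate pieces back into a single data frame whose row names are the plate numbers. For the "first." case alone, the `duplicated()` version avoids creating one small data frame per plate, which may matter when there are many distinct plates.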