Or how about rle/cumsum, as per the appended? -- Mike
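For anyone skimming the thread, here is a minimal sketch of the rle/cumsum idea wrapped in a function that returns whole rows rather than just the Time column. The helper name `lastRowPerRun` is made up for illustration and is not from the thread. Since `rle()` counts runs of consecutive equal values, the sketch assumes all rows for a given date are contiguous and already in time order.

```r
# Sample data copied from the original question.
myData <- read.table(text = "
Date Time O H L C U D
06/01/2010 1358 136.40 136.40 136.35 136.35 2 12
06/01/2010 1359 136.40 136.50 136.35 136.50 9 6
06/01/2010 1400 136.45 136.55 136.35 136.40 8 7
06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
06/02/2010 331 136.55 136.70 136.50 136.70 36 6
06/02/2010 332 136.70 136.70 136.65 136.65 3 1
06/02/2010 334 136.75 136.75 136.75 136.75 1 0
06/02/2010 335 136.80 136.80 136.80 136.80 4 0
06/02/2010 336 136.80 136.80 136.80 136.80 8 0
06/02/2010 337 136.75 136.80 136.75 136.80 1 2
06/02/2010 338 136.80 136.80 136.80 136.80 3 0
", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper (name is an assumption, not from the thread):
# return the last row of each run of identical Date values.
lastRowPerRun <- function(dataFrame) {
  runs <- rle(dataFrame$Date)          # run lengths of consecutive equal dates
  lastIdx <- cumsum(runs$lengths)      # row index where each run ends
  dataFrame[lastIdx, , drop = FALSE]
}

lastRows <- lastRowPerRun(myData)
print(lastRows)
```

If the same date can reappear later in the file, each reappearance counts as a separate run, so sort or group the data first.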
> myData <- read.table("junk.dat", header=TRUE, stringsAsFactors=FALSE)
> myData
         Date Time      O      H      L      C  U  D
1  06/01/2010 1358 136.40 136.40 136.35 136.35  2 12
2  06/01/2010 1359 136.40 136.50 136.35 136.50  9  6
3  06/01/2010 1400 136.45 136.55 136.35 136.40  8  7
4  06/01/2010 1700 136.55 136.55 136.55 136.55  1  0
5  06/02/2010  331 136.55 136.70 136.50 136.70 36  6
6  06/02/2010  332 136.70 136.70 136.65 136.65  3  1
7  06/02/2010  334 136.75 136.75 136.75 136.75  1  0
8  06/02/2010  335 136.80 136.80 136.80 136.80  4  0
9  06/02/2010  336 136.80 136.80 136.80 136.80  8  0
10 06/02/2010  337 136.75 136.80 136.75 136.80  1  2
11 06/02/2010  338 136.80 136.80 136.80 136.80  3  0
> entriesPerDay <- rle(myData[ , 1])
> myData[cumsum(entriesPerDay$lengths), 2]
[1] 1700  338

On Wed, Aug 14, 2013 at 2:22 PM, Steve Lianoglou <lianoglou.st...@gene.com> wrote:

> Or with plyr:
>
> R> library(plyr)
> R> ans <- ddply(x, .(Date), function(df) df[which.max(df$Time),])
>
> -steve
>
> On Wed, Aug 14, 2013 at 2:18 PM, Steve Lianoglou
> <lianoglou.st...@gene.com> wrote:
> > While we're playing code golf, likely faster still could be to use
> > data.table.
> > Assume your data is in a data.frame named "x":
> >
> > R> library(data.table)
> > R> x <- data.table(x, key=c('Date', 'Time'))
> > R> ans <- x[, .SD[.N], by='Date']
> >
> > -steve
> >
> > On Wed, Aug 14, 2013 at 2:01 PM, William Dunlap <wdun...@tibco.com> wrote:
> >> A somewhat faster version (for datasets with lots of dates, assuming it
> >> is sorted by date and time) is
> >>   isLastInRun <- function(x) c(x[-1] != x[-length(x)], TRUE)
> >>   f3 <- function(dataFrame) {
> >>     dataFrame[ isLastInRun(dataFrame$Date), ]
> >>   }
> >> where your two suggestions, as functions, are
> >>   f1 <- function (dataFrame) {
> >>     dataFrame[unlist(with(dataFrame, tapply(Time, list(Date), FUN = function(x) x == max(x)))), ]
> >>   }
> >>   f2 <- function (dataFrame) {
> >>     dataFrame[cumsum(with(dataFrame, tapply(Time, list(Date), FUN = which.max))), ]
> >>   }
> >>
> >> Bill Dunlap
> >> Spotfire, TIBCO Software
> >> wdunlap tibco.com
> >>
> >>
> >>> -----Original Message-----
> >>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf
> >>> Of arun
> >>> Sent: Wednesday, August 14, 2013 1:08 PM
> >>> To: Noah Silverman
> >>> Cc: R help
> >>> Subject: Re: [R] How to extract last value in each group
> >>>
> >>> Hi,
> >>> Try:
> >>> dat1<- read.table(text="
> >>> Date Time O H L C U D
> >>> 06/01/2010 1358 136.40 136.40 136.35 136.35 2 12
> >>> 06/01/2010 1359 136.40 136.50 136.35 136.50 9 6
> >>> 06/01/2010 1400 136.45 136.55 136.35 136.40 8 7
> >>> 06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
> >>> 06/02/2010 331 136.55 136.70 136.50 136.70 36 6
> >>> 06/02/2010 332 136.70 136.70 136.65 136.65 3 1
> >>> 06/02/2010 334 136.75 136.75 136.75 136.75 1 0
> >>> 06/02/2010 335 136.80 136.80 136.80 136.80 4 0
> >>> 06/02/2010 336 136.80 136.80 136.80 136.80 8 0
> >>> 06/02/2010 337 136.75 136.80 136.75 136.80 1 2
> >>> 06/02/2010 338 136.80 136.80 136.80 136.80 3 0
> >>> ",sep="",header=TRUE,stringsAsFactors=FALSE)
> >>>
> >>> dat1[unlist(with(dat1,tapply(Time,list(Date),FUN=function(x) x==max(x)))),]
> >>> #         Date Time      O      H      L      C U D
> >>> #4  06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
> >>> #11 06/02/2010  338 136.80 136.80 136.80 136.80 3 0
> >>> #or
> >>> dat1[cumsum(with(dat1,tapply(Time,list(Date),FUN=which.max))),]
> >>> #         Date Time      O      H      L      C U D
> >>> #4  06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
> >>> #11 06/02/2010  338 136.80 136.80 136.80 136.80 3 0
> >>>
> >>> #or
> >>> dat1[as.logical(with(dat1,ave(Time,Date,FUN=function(x) x==max(x)))),]
> >>> #         Date Time      O      H      L      C U D
> >>> #4  06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
> >>> #11 06/02/2010  338 136.80 136.80 136.80 136.80 3 0
> >>> A.K.
> >>>
> >>>
> >>> ----- Original Message -----
> >>> From: Noah Silverman <noahsilver...@ucla.edu>
> >>> To: "R-help@r-project.org" <r-help@r-project.org>
> >>> Cc:
> >>> Sent: Wednesday, August 14, 2013 3:56 PM
> >>> Subject: [R] How to extract last value in each group
> >>>
> >>> Hello,
> >>>
> >>> I have some stock pricing data for one minute intervals.
> >>>
> >>> The delivery format is a bit odd.  The date column is easily parsed
> >>> and used as an index for an its object.  However, the time column is
> >>> just an integer (1:1807)
> >>>
> >>> I just need to extract the *last* entry for each day.  Don't actually
> >>> care what time it was, as long as it was the last one.
> >>>
> >>> Sure, writing a big nasty loop would work, but I was hoping that
> >>> someone would be able to suggest a faster way.
> >>>
> >>> Small snippet of data below my sig.
> >>>
> >>> Thanks!
> >>>
> >>>
> >>> --
> >>> Noah Silverman, M.S., C.Phil
> >>> UCLA Department of Statistics
> >>> 8117 Math Sciences Building
> >>> Los Angeles, CA 90095
> >>>
> >>> --------------------------------------------------------------------------
> >>>
> >>> Date Time O H L C U D
> >>> 06/01/2010 1358 136.40 136.40 136.35 136.35 2 12
> >>> 06/01/2010 1359 136.40 136.50 136.35 136.50 9 6
> >>> 06/01/2010 1400 136.45 136.55 136.35 136.40 8 7
> >>> 06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
> >>> 06/02/2010 331 136.55 136.70 136.50 136.70 36 6
> >>> 06/02/2010 332 136.70 136.70 136.65 136.65 3 1
> >>> 06/02/2010 334 136.75 136.75 136.75 136.75 1 0
> >>> 06/02/2010 335 136.80 136.80 136.80 136.80 4 0
> >>> 06/02/2010 336 136.80 136.80 136.80 136.80 8 0
> >>> 06/02/2010 337 136.75 136.80 136.75 136.80 1 2
> >>> 06/02/2010 338 136.80 136.80 136.80 136.80 3 0
> >>> ______________________________________________
> >>> R-help@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >
> > --
> > Steve Lianoglou
> > Computational Biologist
> > Bioinformatics and Computational Biology
> > Genentech
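Several of the suggestions quoted above assume the data is already sorted by date and time. As a rough sanity check, the sketch below runs Bill Dunlap's f1, f2 and f3 on the sample data and then applies the ave()-based selection from arun's message to a row-reversed copy, since that variant compares each Time with its group maximum and so should not depend on row order. The wrapper name `fAve` and the reversed copy are illustrative assumptions, not part of the thread.

```r
dat1 <- read.table(text = "
Date Time O H L C U D
06/01/2010 1358 136.40 136.40 136.35 136.35 2 12
06/01/2010 1359 136.40 136.50 136.35 136.50 9 6
06/01/2010 1400 136.45 136.55 136.35 136.40 8 7
06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
06/02/2010 331 136.55 136.70 136.50 136.70 36 6
06/02/2010 332 136.70 136.70 136.65 136.65 3 1
06/02/2010 334 136.75 136.75 136.75 136.75 1 0
06/02/2010 335 136.80 136.80 136.80 136.80 4 0
06/02/2010 336 136.80 136.80 136.80 136.80 8 0
06/02/2010 337 136.75 136.80 136.75 136.80 1 2
06/02/2010 338 136.80 136.80 136.80 136.80 3 0
", header = TRUE, stringsAsFactors = FALSE)

# f1, f2, f3 as given in Bill Dunlap's message.
isLastInRun <- function(x) c(x[-1] != x[-length(x)], TRUE)
f3 <- function(dataFrame) dataFrame[isLastInRun(dataFrame$Date), ]
f1 <- function(dataFrame) {
  dataFrame[unlist(with(dataFrame,
    tapply(Time, list(Date), FUN = function(x) x == max(x)))), ]
}
f2 <- function(dataFrame) {
  dataFrame[cumsum(with(dataFrame,
    tapply(Time, list(Date), FUN = which.max))), ]
}

# arun's ave() variant, wrapped in a hypothetical helper name.
fAve <- function(dataFrame) {
  keep <- with(dataFrame, ave(Time, Date, FUN = function(x) x == max(x)))
  dataFrame[as.logical(keep), ]
}

# On the sorted sample data, all four pick the same rows.
res1 <- f1(dat1); res2 <- f2(dat1); res3 <- f3(dat1); resAve <- fAve(dat1)
print(rownames(res1))

# Reverse the rows to simulate unsorted input (illustrative only).
reversed <- dat1[rev(seq_len(nrow(dat1))), ]
resRev <- fAve(reversed)
print(resRev)
```

For unsorted input, the other functions would need the data ordered by date and time first. Note that the ave() variant keeps every row that ties for the maximum Time within a date.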