Or with plyr:

R> library(plyr)
R> ans <- ddply(x, .(Date), function(df) df[which.max(df$Time),])

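In case neither plyr nor data.table is installed, here is a base R sketch of the same idea. It is only an illustration: the helper name last_per_date is made up for this example, the rows are the sample posted at the start of the thread, and it sorts first rather than assuming the input is already ordered by date and time.

```r
# Sample rows copied from the original post; the name `x` follows the thread.
x <- read.table(text = "
Date Time O H L C U D
06/01/2010 1358 136.40 136.40 136.35 136.35 2 12
06/01/2010 1359 136.40 136.50 136.35 136.50 9 6
06/01/2010 1400 136.45 136.55 136.35 136.40 8 7
06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
06/02/2010 331 136.55 136.70 136.50 136.70 36 6
06/02/2010 332 136.70 136.70 136.65 136.65 3 1
06/02/2010 334 136.75 136.75 136.75 136.75 1 0
06/02/2010 335 136.80 136.80 136.80 136.80 4 0
06/02/2010 336 136.80 136.80 136.80 136.80 8 0
06/02/2010 337 136.75 136.80 136.75 136.80 1 2
06/02/2010 338 136.80 136.80 136.80 136.80 3 0
", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper (not from the thread): last row per Date by Time.
last_per_date <- function(df) {
  # Sort by Date, then Time, so the largest Time is the last row of each Date.
  df <- df[order(df$Date, df$Time), ]
  # duplicated(..., fromLast = TRUE) is FALSE only for the last row of each Date.
  df[!duplicated(df$Date, fromLast = TRUE), ]
}

ans <- last_per_date(x)
# Keeps the Time 1700 row for 06/01/2010 and the Time 338 row for 06/02/2010.
print(ans)
```

Date is compared as a character string here, so the groups come out in string order. If chronological order across years matters, converting first with as.Date(x$Date, format = "%m/%d/%Y") should take care of it.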
-steve

On Wed, Aug 14, 2013 at 2:18 PM, Steve Lianoglou
<lianoglou.st...@gene.com> wrote:
> While we're playing code golf, likely faster still could be to use
> data.table. Assume your data is in a data.frame named "x":
>
> R> library(data.table)
> R> x <- data.table(x, key=c('Date', 'Time'))
> R> ans <- x[, .SD[.N], by='Date']
>
> -steve
>
> On Wed, Aug 14, 2013 at 2:01 PM, William Dunlap <wdun...@tibco.com> wrote:
>> A somewhat faster version (for datasets with lots of dates, assuming it is
>> sorted by date and time) is
>>   isLastInRun <- function(x) c(x[-1] != x[-length(x)], TRUE)
>>   f3 <- function(dataFrame) {
>>     dataFrame[ isLastInRun(dataFrame$Date), ]
>>   }
>> where your two suggestions, as functions, are
>>   f1 <- function (dataFrame) {
>>     dataFrame[unlist(with(dataFrame, tapply(Time, list(Date), FUN = function(x) x == max(x)))), ]
>>   }
>>   f2 <- function (dataFrame) {
>>     dataFrame[cumsum(with(dataFrame, tapply(Time, list(Date), FUN = which.max))), ]
>>   }
>>
>> Bill Dunlap
>> Spotfire, TIBCO Software
>> wdunlap tibco.com
>>
>>
>>> -----Original Message-----
>>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of arun
>>> Sent: Wednesday, August 14, 2013 1:08 PM
>>> To: Noah Silverman
>>> Cc: R help
>>> Subject: Re: [R] How to extract last value in each group
>>>
>>> Hi,
>>> Try:
>>> dat1<- read.table(text="
>>> Date Time O H L C U D
>>> 06/01/2010 1358 136.40 136.40 136.35 136.35 2 12
>>> 06/01/2010 1359 136.40 136.50 136.35 136.50 9 6
>>> 06/01/2010 1400 136.45 136.55 136.35 136.40 8 7
>>> 06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
>>> 06/02/2010 331 136.55 136.70 136.50 136.70 36 6
>>> 06/02/2010 332 136.70 136.70 136.65 136.65 3 1
>>> 06/02/2010 334 136.75 136.75 136.75 136.75 1 0
>>> 06/02/2010 335 136.80 136.80 136.80 136.80 4 0
>>> 06/02/2010 336 136.80 136.80 136.80 136.80 8 0
>>> 06/02/2010 337 136.75 136.80 136.75 136.80 1 2
>>> 06/02/2010 338 136.80 136.80 136.80 136.80 3 0
>>> ",sep="",header=TRUE,stringsAsFactors=FALSE)
>>>
>>> dat1[unlist(with(dat1,tapply(Time,list(Date),FUN=function(x) x==max(x)))),]
>>> # Date Time O H L C U D
>>> #4 06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
>>> #11 06/02/2010 338 136.80 136.80 136.80 136.80 3 0
>>> #or
>>> dat1[cumsum(with(dat1,tapply(Time,list(Date),FUN=which.max))),]
>>> Date Time O H L C U D
>>> 4 06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
>>> 11 06/02/2010 338 136.80 136.80 136.80 136.80 3 0
>>>
>>> #or
>>> dat1[as.logical(with(dat1,ave(Time,Date,FUN=function(x) x==max(x)))),]
>>> # Date Time O H L C U D
>>> #4 06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
>>> #11 06/02/2010 338 136.80 136.80 136.80 136.80 3 0
>>> A.K.
>>>
>>>
>>> ----- Original Message -----
>>> From: Noah Silverman <noahsilver...@ucla.edu>
>>> To: "R-help@r-project.org" <r-help@r-project.org>
>>> Cc:
>>> Sent: Wednesday, August 14, 2013 3:56 PM
>>> Subject: [R] How to extract last value in each group
>>>
>>> Hello,
>>>
>>> I have some stock pricing data for one minute intervals.
>>>
>>> The delivery format is a bit odd. The date column is easily parsed and
>>> used as an index for an its object. However, the time column is just an
>>> integer (1:1807)
>>>
>>> I just need to extract the *last* entry for each day. Don't actually care
>>> what time it was, as long as it was the last one.
>>>
>>> Sure, writing a big nasty loop would work, but I was hoping that someone
>>> would be able to suggest a faster way.
>>>
>>> Small snippet of data below my sig.
>>>
>>> Thanks!
>>>
>>>
>>> --
>>> Noah Silverman, M.S., C.Phil
>>> UCLA Department of Statistics
>>> 8117 Math Sciences Building
>>> Los Angeles, CA 90095
>>>
>>> --------------------------------------------------------------------------
>>>
>>> Date Time O H L C U D
>>> 06/01/2010 1358 136.40 136.40 136.35 136.35 2 12
>>> 06/01/2010 1359 136.40 136.50 136.35 136.50 9 6
>>> 06/01/2010 1400 136.45 136.55 136.35 136.40 8 7
>>> 06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
>>> 06/02/2010 331 136.55 136.70 136.50 136.70 36 6
>>> 06/02/2010 332 136.70 136.70 136.65 136.65 3 1
>>> 06/02/2010 334 136.75 136.75 136.75 136.75 1 0
>>> 06/02/2010 335 136.80 136.80 136.80 136.80 4 0
>>> 06/02/2010 336 136.80 136.80 136.80 136.80 8 0
>>> 06/02/2010 337 136.75 136.80 136.75 136.80 1 2
>>> 06/02/2010 338 136.80 136.80 136.80 136.80 3 0
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> Steve Lianoglou
> Computational Biologist
> Bioinformatics and Computational Biology
> Genentech

--
Steve Lianoglou
Computational Biologist
Bioinformatics and Computational Biology
Genentech