On Aug 15, 2013, at 00:03, David Winsemius wrote:

>
> On Aug 14, 2013, at 2:18 PM, Steve Lianoglou wrote:
>
>> While we're playing code golf, likely faster still could be to use
>> data.table. Assume your data is in a data.frame named "x":
>>
>> R> library(data.table)
>> R> x <- data.table(x, key=c('Date', 'Time'))
>> R> ans <- x[, .SD[.N], by='Date']
>
> I thought code-golf was the most compact code:
>
> dat1[ tapply(rownames(dat1), dat1$Date, tail, 1) , ]
>
>          Date Time      O      H      L      C U D
> 4  06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
> 11 06/02/2010  338 136.80 136.80 136.80 136.80 3 0

or even:

> aggregate(dat1[-1], dat1[1], tail, 1)
        Date Time      O      H      L      C U D
1 06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
2 06/02/2010  338 136.80 136.80 136.80 136.80 3 0

(This relies on Date being the first column. For generality, I suppose you need

> aggregate(dat1, dat1["Date"], tail, 1)[-1]
        Date Time      O      H      L      C U D
1 06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
2 06/02/2010  338 136.80 136.80 136.80 136.80 3 0

)

>
>>
>> -steve
>>
>> On Wed, Aug 14, 2013 at 2:01 PM, William Dunlap <wdun...@tibco.com> wrote:
>>> A somewhat faster version (for datasets with lots of dates, assuming it is
>>> sorted by date and time) is
>>>   isLastInRun <- function(x) c(x[-1] != x[-length(x)], TRUE)
>>>   f3 <- function(dataFrame) {
>>>     dataFrame[ isLastInRun(dataFrame$Date), ]
>>>   }
>>> where your two suggestions, as functions, are
>>>   f1 <- function (dataFrame) {
>>>     dataFrame[unlist(with(dataFrame, tapply(Time, list(Date), FUN =
>>>       function(x) x == max(x)))), ]
>>>   }
>>>   f2 <- function (dataFrame) {
>>>     dataFrame[cumsum(with(dataFrame, tapply(Time, list(Date), FUN =
>>>       which.max))), ]
>>>   }
>>>
>>> Bill Dunlap
>>> Spotfire, TIBCO Software
>>> wdunlap tibco.com
>>>
>>>
>>>> -----Original Message-----
>>>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of arun
>>>> Sent: Wednesday, August 14, 2013 1:08 PM
>>>> To: Noah Silverman
>>>> Cc: R help
>>>> Subject: Re: [R] How to extract last value in each group
>>>>
>>>> Hi,
>>>> Try:
>>>> dat1<- read.table(text="
>>>> Date Time O H L C U D
>>>> 06/01/2010 1358 136.40 136.40 136.35 136.35 2 12
>>>> 06/01/2010 1359 136.40 136.50 136.35 136.50 9 6
>>>> 06/01/2010 1400 136.45 136.55 136.35 136.40 8 7
>>>> 06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
>>>> 06/02/2010 331 136.55 136.70 136.50 136.70 36 6
>>>> 06/02/2010 332 136.70 136.70 136.65 136.65 3 1
>>>> 06/02/2010 334 136.75 136.75 136.75 136.75 1 0
>>>> 06/02/2010 335 136.80 136.80 136.80 136.80 4 0
>>>> 06/02/2010 336 136.80 136.80 136.80 136.80 8 0
>>>> 06/02/2010 337 136.75 136.80 136.75 136.80 1 2
>>>> 06/02/2010 338 136.80 136.80 136.80 136.80 3 0
>>>> ",sep="",header=TRUE,stringsAsFactors=FALSE)
>>>>
>>>> dat1[unlist(with(dat1,tapply(Time,list(Date),FUN=function(x) x==max(x)))),]
>>>> #         Date Time      O      H      L      C U D
>>>> #4  06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
>>>> #11 06/02/2010  338 136.80 136.80 136.80 136.80 3 0
>>>> #or
>>>> dat1[cumsum(with(dat1,tapply(Time,list(Date),FUN=which.max))),]
>>>> #         Date Time      O      H      L      C U D
>>>> #4  06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
>>>> #11 06/02/2010  338 136.80 136.80 136.80 136.80 3 0
>>>>
>>>> #or
>>>> dat1[as.logical(with(dat1,ave(Time,Date,FUN=function(x) x==max(x)))),]
>>>> #         Date Time      O      H      L      C U D
>>>> #4  06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
>>>> #11 06/02/2010  338 136.80 136.80 136.80 136.80 3 0
>>>> A.K.
>>>>
>>>>
>>>> ----- Original Message -----
>>>> From: Noah Silverman <noahsilver...@ucla.edu>
>>>> To: "R-help@r-project.org" <r-help@r-project.org>
>>>> Cc:
>>>> Sent: Wednesday, August 14, 2013 3:56 PM
>>>> Subject: [R] How to extract last value in each group
>>>>
>>>> Hello,
>>>>
>>>> I have some stock pricing data for one minute intervals.
>>>>
>>>> The delivery format is a bit odd. The date column is easily parsed and
>>>> used as an index for an its object. However, the time column is just an
>>>> integer (1:1807)
>>>>
>>>> I just need to extract the *last* entry for each day. Don't actually care
>>>> what time it was, as long as it was the last one.
>>>>
>>>> Sure, writing a big nasty loop would work, but I was hoping that someone
>>>> would be able to suggest a faster way.
>>>>
>>>> Small snippet of data below my sig.
>>>>
>>>> Thanks!
>>>>
>>>>
>>>> --
>>>> Noah Silverman, M.S., C.Phil
>>>> UCLA Department of Statistics
>>>> 8117 Math Sciences Building
>>>> Los Angeles, CA 90095
>>>>
>>>> --------------------------------------------------------------------------
>>>>
>>>> Date Time O H L C U D
>>>> 06/01/2010 1358 136.40 136.40 136.35 136.35 2 12
>>>> 06/01/2010 1359 136.40 136.50 136.35 136.50 9 6
>>>> 06/01/2010 1400 136.45 136.55 136.35 136.40 8 7
>>>> 06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
>>>> 06/02/2010 331 136.55 136.70 136.50 136.70 36 6
>>>> 06/02/2010 332 136.70 136.70 136.65 136.65 3 1
>>>> 06/02/2010 334 136.75 136.75 136.75 136.75 1 0
>>>> 06/02/2010 335 136.80 136.80 136.80 136.80 4 0
>>>> 06/02/2010 336 136.80 136.80 136.80 136.80 8 0
>>>> 06/02/2010 337 136.75 136.80 136.75 136.80 1 2
>>>> 06/02/2010 338 136.80 136.80 136.80 136.80 3 0
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>> --
>> Steve Lianoglou
>> Computational Biologist
>> Bioinformatics and Computational Biology
>> Genentech
>
> David Winsemius
> Alameda, CA, USA

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com
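
For anyone reading this thread later: one more base-R option, not proposed by any of the posters above, is to drop every row whose Date occurs again further down, using duplicated() with fromLast = TRUE. The sketch below is only that, a sketch. It assumes the dates are month/day/year (a guess from the sample), and it sorts first in case the file does not arrive in order. The names ord, sorted and last_rows are made up for the illustration; isLastInRun is copied from Bill Dunlap's message and used only as a cross-check.

```r
# Example data copied from the thread.
dat1 <- read.table(text = "
Date Time O H L C U D
06/01/2010 1358 136.40 136.40 136.35 136.35 2 12
06/01/2010 1359 136.40 136.50 136.35 136.50 9 6
06/01/2010 1400 136.45 136.55 136.35 136.40 8 7
06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
06/02/2010 331 136.55 136.70 136.50 136.70 36 6
06/02/2010 332 136.70 136.70 136.65 136.65 3 1
06/02/2010 334 136.75 136.75 136.75 136.75 1 0
06/02/2010 335 136.80 136.80 136.80 136.80 4 0
06/02/2010 336 136.80 136.80 136.80 136.80 8 0
06/02/2010 337 136.75 136.80 136.75 136.80 1 2
06/02/2010 338 136.80 136.80 136.80 136.80 3 0
", header = TRUE, stringsAsFactors = FALSE)

# Assumption: input may be unsorted, so order by calendar date, then Time.
# The format "%m/%d/%Y" is a guess based on how the sample dates look.
ord <- order(as.Date(dat1$Date, format = "%m/%d/%Y"), dat1$Time)
sorted <- dat1[ord, ]

# Keep each row whose Date does not occur again further down,
# i.e. the last row of every date.
last_rows <- sorted[!duplicated(sorted$Date, fromLast = TRUE), ]

# Cross-check against the run-based test from the thread (needs sorted input).
isLastInRun <- function(x) c(x[-1] != x[-length(x)], TRUE)
stopifnot(identical(last_rows, sorted[isLastInRun(sorted$Date), ]))

print(last_rows)
```

On data that is already sorted, the order() step changes nothing and could be skipped. Unlike the run-based test, duplicated() does not need equal dates to be adjacent, although without the sort "last" would then mean last in file order rather than latest Time.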