Or how about rle/cumsum, as in the session appended below? It assumes the rows are already sorted by date and time.

-- Mike

> myData <- read.table("junk.dat", header=TRUE, stringsAsFactors=FALSE)
> myData
         Date Time      O      H      L      C  U  D
1  06/01/2010 1358 136.40 136.40 136.35 136.35  2 12
2  06/01/2010 1359 136.40 136.50 136.35 136.50  9  6
3  06/01/2010 1400 136.45 136.55 136.35 136.40  8  7
4  06/01/2010 1700 136.55 136.55 136.55 136.55  1  0
5  06/02/2010  331 136.55 136.70 136.50 136.70 36  6
6  06/02/2010  332 136.70 136.70 136.65 136.65  3  1
7  06/02/2010  334 136.75 136.75 136.75 136.75  1  0
8  06/02/2010  335 136.80 136.80 136.80 136.80  4  0
9  06/02/2010  336 136.80 136.80 136.80 136.80  8  0
10 06/02/2010  337 136.75 136.80 136.75 136.80  1  2
11 06/02/2010  338 136.80 136.80 136.80 136.80  3  0
> entriesPerDay <- rle(myData[ , 1])
> myData[cumsum(entriesPerDay$lengths), 2]
[1] 1700  338
>
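
The session above prints only the Time values. Here is a minimal sketch of the same rle/cumsum idea that returns the whole rows and sorts first, so it does not rely on the file already being in order. The helper name lastPerDay and the inline copy of the data are my own illustration, not part of the original post:

```r
# Inline copy of the sample data (stands in for junk.dat).
myData <- read.table(text = "
Date Time O H L C U D
06/01/2010 1358 136.40 136.40 136.35 136.35 2 12
06/01/2010 1359 136.40 136.50 136.35 136.50 9 6
06/01/2010 1400 136.45 136.55 136.35 136.40 8 7
06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
06/02/2010 331 136.55 136.70 136.50 136.70 36 6
06/02/2010 332 136.70 136.70 136.65 136.65 3 1
06/02/2010 334 136.75 136.75 136.75 136.75 1 0
06/02/2010 335 136.80 136.80 136.80 136.80 4 0
06/02/2010 336 136.80 136.80 136.80 136.80 8 0
06/02/2010 337 136.75 136.80 136.75 136.80 1 2
06/02/2010 338 136.80 136.80 136.80 136.80 3 0
", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper: last row of each run of identical dates.
lastPerDay <- function(dataFrame) {
  # Sort by calendar date, then time, so each day forms one contiguous run.
  day <- as.Date(dataFrame$Date, format = "%m/%d/%Y")
  dataFrame <- dataFrame[order(day, dataFrame$Time), ]
  runs <- rle(dataFrame$Date)
  # The cumulative sum of the run lengths is the index of the last row in each run.
  dataFrame[cumsum(runs$lengths), ]
}

lastRows <- lastPerDay(myData)
print(lastRows)
```

On the sample data this should pick rows 4 and 11, the same two rows the other suggestions in the thread report.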


On Wed, Aug 14, 2013 at 2:22 PM, Steve Lianoglou
<lianoglou.st...@gene.com> wrote:

> Or with plyr:
>
> R> library(plyr)
> R> ans <- ddply(x, .(Date), function(df) df[which.max(df$Time),])
>
> -steve
>
> On Wed, Aug 14, 2013 at 2:18 PM, Steve Lianoglou
> <lianoglou.st...@gene.com> wrote:
> > While we're playing code golf, likely faster still could be to use
> > data.table. Assume your data is in a data.frame named "x":
> >
> > R> library(data.table)
> > R> x <- data.table(x, key=c('Date', 'Time'))
> > R> ans <- x[, .SD[.N], by='Date']
> >
> > -steve
> >
> > On Wed, Aug 14, 2013 at 2:01 PM, William Dunlap <wdun...@tibco.com> wrote:
> >> A somewhat faster version (for datasets with lots of dates, assuming it is sorted by date and time) is
> >>   isLastInRun <- function(x) c(x[-1] != x[-length(x)], TRUE)
> >>   f3 <- function(dataFrame) {
> >>       dataFrame[ isLastInRun(dataFrame$Date), ]
> >>   }
> >> where your two suggestions, as functions, are
> >>   f1 <- function (dataFrame) {
> >>       dataFrame[unlist(with(dataFrame, tapply(Time, list(Date), FUN = function(x) x == max(x)))), ]
> >>   }
> >>   f2 <- function (dataFrame) {
> >>       dataFrame[cumsum(with(dataFrame, tapply(Time, list(Date), FUN = which.max))), ]
> >>   }
> >>
> >> Bill Dunlap
> >> Spotfire, TIBCO Software
> >> wdunlap tibco.com
> >>
> >>
> >>> -----Original Message-----
> >>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of arun
> >>> Sent: Wednesday, August 14, 2013 1:08 PM
> >>> To: Noah Silverman
> >>> Cc: R help
> >>> Subject: Re: [R] How to extract last value in each group
> >>>
> >>> Hi,
> >>> Try:
> >>> dat1<- read.table(text="
> >>>         Date Time      O      H      L      C  U  D
> >>> 06/01/2010 1358 136.40 136.40 136.35 136.35  2  12
> >>> 06/01/2010 1359 136.40 136.50 136.35 136.50  9  6
> >>> 06/01/2010 1400 136.45 136.55 136.35 136.40  8  7
> >>> 06/01/2010 1700 136.55 136.55 136.55 136.55  1  0
> >>> 06/02/2010  331 136.55 136.70 136.50 136.70  36  6
> >>> 06/02/2010  332 136.70 136.70 136.65 136.65  3  1
> >>> 06/02/2010  334 136.75 136.75 136.75 136.75  1  0
> >>> 06/02/2010  335 136.80 136.80 136.80 136.80  4  0
> >>> 06/02/2010  336 136.80 136.80 136.80 136.80  8  0
> >>> 06/02/2010  337 136.75 136.80 136.75 136.80  1  2
> >>> 06/02/2010  338 136.80 136.80 136.80 136.80  3  0
> >>> ",sep="",header=TRUE,stringsAsFactors=FALSE)
> >>>
> >>>  dat1[unlist(with(dat1,tapply(Time,list(Date),FUN=function(x) x==max(x)))),]
> >>> #         Date Time      O      H      L      C U D
> >>> #4  06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
> >>> #11 06/02/2010  338 136.80 136.80 136.80 136.80 3 0
> >>> #or
> >>>  dat1[cumsum(with(dat1,tapply(Time,list(Date),FUN=which.max))),]
> >>> #         Date Time      O      H      L      C U D
> >>> #4  06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
> >>> #11 06/02/2010  338 136.80 136.80 136.80 136.80 3 0
> >>>
> >>> #or
> >>> dat1[as.logical(with(dat1,ave(Time,Date,FUN=function(x) x==max(x)))),]
> >>>  #        Date Time      O      H      L      C U D
> >>> #4  06/01/2010 1700 136.55 136.55 136.55 136.55 1 0
> >>> #11 06/02/2010  338 136.80 136.80 136.80 136.80 3 0
> >>> A.K.
> >>>
> >>>
> >>>
> >>>
> >>> ----- Original Message -----
> >>> From: Noah Silverman <noahsilver...@ucla.edu>
> >>> To: "R-help@r-project.org" <r-help@r-project.org>
> >>> Cc:
> >>> Sent: Wednesday, August 14, 2013 3:56 PM
> >>> Subject: [R] How to extract last value in each group
> >>>
> >>> Hello,
> >>>
> >>> I have some stock pricing data for one minute intervals.
> >>>
> >>> The delivery format is a bit odd.  The date column is easily parsed and used as an index
> >>> for an its object.  However, the time column is just an integer (1:1807)
> >>>
> >>> I just need to extract the *last* entry for each day.  Don't actually care what time it was,
> >>> as long as it was the last one.
> >>>
> >>> Sure, writing a big nasty loop would work, but I was hoping that someone would be able
> >>> to suggest a faster way.
> >>>
> >>> Small snippet of data below my sig.
> >>>
> >>> Thanks!
> >>>
> >>>
> >>> --
> >>> Noah Silverman, M.S., C.Phil
> >>> UCLA Department of Statistics
> >>> 8117 Math Sciences Building
> >>> Los Angeles, CA 90095
> >>>
> >>>
> >>> --------------------------------------------------------------------------
> >>>
> >>>         Date Time      O      H      L      C  U  D
> >>> 06/01/2010 1358 136.40 136.40 136.35 136.35   2  12
> >>> 06/01/2010 1359 136.40 136.50 136.35 136.50   9   6
> >>> 06/01/2010 1400 136.45 136.55 136.35 136.40   8   7
> >>> 06/01/2010 1700 136.55 136.55 136.55 136.55   1   0
> >>> 06/02/2010  331 136.55 136.70 136.50 136.70  36   6
> >>> 06/02/2010  332 136.70 136.70 136.65 136.65   3   1
> >>> 06/02/2010  334 136.75 136.75 136.75 136.75   1   0
> >>> 06/02/2010  335 136.80 136.80 136.80 136.80   4   0
> >>> 06/02/2010  336 136.80 136.80 136.80 136.80   8   0
> >>> 06/02/2010  337 136.75 136.80 136.75 136.80   1   2
> >>> 06/02/2010  338 136.80 136.80 136.80 136.80   3   0
> >>> ______________________________________________
> >>> R-help@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>>
> >>>
> >>
> >
> >
> >
> > --
> > Steve Lianoglou
> > Computational Biologist
> > Bioinformatics and Computational Biology
> > Genentech
>
>
>
> --
> Steve Lianoglou
> Computational Biologist
> Bioinformatics and Computational Biology
> Genentech
>
>
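
As a cross-check of the base-R suggestions quoted above, here is a small sketch that computes the selected row indices three ways on a sorted subset of the sample data. The variable names (dat, byRunEnd, byRle, byMax) are mine; isLastInRun is copied from Bill Dunlap's message, and the ave() line is a variation on arun's third suggestion rather than his exact code:

```r
# Small sorted example (values taken from the sample data in the thread).
dat <- data.frame(
  Date = c("06/01/2010", "06/01/2010", "06/02/2010", "06/02/2010", "06/02/2010"),
  Time = c(1400L, 1700L, 336L, 337L, 338L),
  stringsAsFactors = FALSE
)

# Run-end test: TRUE wherever the next date differs, plus the final row.
isLastInRun <- function(x) c(x[-1] != x[-length(x)], TRUE)
byRunEnd <- which(isLastInRun(dat$Date))

# rle/cumsum: cumulative run lengths are the run-end indices.
byRle <- cumsum(rle(dat$Date)$lengths)

# Per-day maximum via ave(): does not depend on row order,
# but keeps every tied maximum within a day.
byMax <- which(dat$Time == ave(dat$Time, dat$Date, FUN = max))

print(list(byRunEnd = byRunEnd, byRle = byRle, byMax = byMax))
```

The first two rely on the data being sorted by date and time. The tapply/which.max/cumsum variant (f2) is stricter still: the cumulative sum of within-day positions is a valid row index only when each day's maximum Time is also its last row.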


