Re: [R] the first and last observation for each subject

2009-01-06 Thread William Dunlap
From: William Dunlap
Sent: Monday, January 05, 2009 4:01 PM
To: 'Kingsford Jones'
Subject: RE: [R] the first and last observation for each subject
> -----Original Message-----
> From: Kingsford Jones [mailto:kingsfordjo...@gmail.com]
> Sent: Monday, January 05, 2009 3:19

Re: [R] the first and last observation for each subject

2009-01-05 Thread Kingsford Jones
> On Mon, Jan 5, 2009 at 10:18 AM, William Dunlap wrote:
>> Arg, the 'sapply(...)' in the function was in the initial
>> comment,
>> gm <- function(x, group){ # medians by group:
>> sapply(split(x,group),median)
>> but someone

Re: [R] the first and last observation for each subject

2009-01-05 Thread Kingsford Jones
the sapply() call from gm().)
>
> Bill Dunlap
> TIBCO Software Inc - Spotfire Division
> wdunlap tibco.com
>
>> -----Original Message-----
>> From: hadley wickham [mailto:h.wick...@gmail.com]
>> Sent: Monday, January 05, 2009 9:10 AM
>> To: William Dunlap
>>

Re: [R] the first and last observation for each subject

2009-01-05 Thread William Dunlap
[mailto:h.wick...@gmail.com]
> Sent: Monday, January 05, 2009 9:10 AM
> To: William Dunlap
> Cc: gallon...@gmail.com; R help
> Subject: Re: [R] the first and last observation for each subject
>
> > Another application of that technique can be used to quickly compute
> >

Re: [R] the first and last observation for each subject

2009-01-05 Thread hadley wickham
> Another application of that technique can be used to quickly compute
> medians by groups:
>
> gm <- function(x, group){ # medians by group:
> sapply(split(x,group),median)
> o<-order(group, x)
> group <- group[o]
> x <- x[o]
> changes <- group[-1] != group[-length(group)]
> first <- whi
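The quoted function is cut off in this archive, so the following is only a sketch of the sort-based idea it describes, not the code that was posted. The name `group_medians` and the way the middle positions are located are assumptions made for illustration; the sketch also assumes non-empty input with no missing values.

```r
# Hypothetical reconstruction of the idea, not the original gm() from the thread.
# Sort by group and then by value, locate each group's run of rows, and
# read the median from the middle position(s) of that run.
group_medians <- function(x, group) {
  o <- order(group, x)
  group <- group[o]
  x <- x[o]
  n <- length(group)
  # TRUE wherever the group label differs from the previous row
  changes <- group[-1] != group[-n]
  first <- which(c(TRUE, changes))  # first row of each group
  last  <- which(c(changes, TRUE))  # last row of each group
  # The two middle positions coincide when a group has an odd number of rows
  lower <- (first + last) %/% 2
  upper <- (first + last + 1) %/% 2
  med <- (x[lower] + x[upper]) / 2
  names(med) <- group[first]
  med
}

# Data from the original question
id <- c(1, 1, 1, 2, 2, 2, 2, 3, 3)
y  <- c(20, 30, 40, 23, 25, 28, 38, 10, 15)
print(group_medians(y, id))
print(sapply(split(y, id), median))  # the straightforward version, for comparison
```

The two printed vectors should agree; the sorted version avoids calling `median()` once per group, which is where the speed gain discussed in the thread would come from.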

Re: [R] the first and last observation for each subject

2009-01-05 Thread William Dunlap
> -----Original Message-----
> From: hadley wickham [mailto:h.wick...@gmail.com]
> Sent: Sunday, January 04, 2009 8:56 PM
> To: William Dunlap
> Cc: gallon...@gmail.com; R help
> Subject: Re: [R] the first and last observation for each subject
>
> >> library(

Re: [R] the first and last observation for each subject

2009-01-04 Thread hadley wickham
>> easy to understand. Another approach is more specialized but useful
>> when you have lots of ID's (e.g., millions) and speed is very important.
>> It computes where the first and last entry for each ID in a vectorized
>> computation, akin to the computation that rle() uses:
>
> I particularly t
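The code that followed the quoted paragraph is not preserved in this archive. As a rough sketch of the vectorized idea it describes, the helper below (the name `first_last_diff` is hypothetical) compares each ID with its neighbour to find where runs start and end. It assumes the rows are already sorted by ID and then by time, and that the input is non-empty.

```r
# Hypothetical helper, not the code from the thread: y[last] - y[first] per ID,
# computed without looping over subjects.
first_last_diff <- function(id, y) {
  n <- length(id)
  # TRUE wherever the ID differs from the one on the previous row
  changes <- id[-1] != id[-n]
  first <- c(TRUE, changes)  # marks the first row of each run of identical IDs
  last  <- c(changes, TRUE)  # marks the last row of each run
  data.frame(ID = id[first], diff = y[last] - y[first])
}

# Data from the original question, already ordered by ID and time
id <- c(1, 1, 1, 2, 2, 2, 2, 3, 3)
y  <- c(20, 30, 40, 23, 25, 28, 38, 10, 15)
out <- first_last_diff(id, y)
print(out)
```

Because every step is a whole-vector operation, the cost does not grow with the number of distinct IDs the way a per-subject function call does.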

Re: [R] the first and last observation for each subject

2009-01-04 Thread hadley wickham
>> library(plyr)
>>
>> # ddply is for splitting up data frames and combining the results
>> # into a data frame. .(ID) says to split up the data frame by the subject
>> # variable
>> ddply(DF, .(ID), function(one) with(one, y[length(y)] - y[1]))
>> ...
>
> The above is much quicker than the vers

Re: [R] the first and last observation for each subject

2009-01-04 Thread William Dunlap
> [R] the first and last observation for each subject
> hadley wickham h.wickham at gmail.com
> Fri Jan 2 14:52:42 CET 2009
>
> On Fri, Jan 2, 2009 at 3:20 AM, gallon li wrote:
> > I have the following data
> >
> > ID x y time
> > 1 10 20 0
> > 1 10

Re: [R] the first and last observation for each subject

2009-01-02 Thread Carlos J. Gil Bellosta
Hello,
Is it truly y=max(y)-min(y) what you want below?
Best regards,
Carlos J. Gil Bellosta
http://www.datanalytics.com
On Fri, 2009-01-02 at 13:16 -0500, Stavros Macrakis wrote:
> I think there's a pretty simple solution here, though probably not the
> most efficient:
>
> t(sapply(split(a

Re: [R] the first and last observation for each subject

2009-01-02 Thread Stavros Macrakis
I think there's a pretty simple solution here, though probably not the most efficient:
t(sapply(split(a,a$ID), function(q) with(q,c(ID=unique(ID),x=unique(x),y=max(y)-min(y)))))
Using 'unique' instead of min or [[1]] has the advantage that if x is in fact not time-invariant, this gives an err
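A variation on this idea, sketched under assumptions: instead of relying on how `unique()` behaves inside `c()`, the hypothetical helper `the_only` below stops explicitly when a supposedly invariant column has more than one value. The sketch also uses last minus first rather than `max(y)-min(y)`, since the two agree only when y never decreases within a subject; rows are assumed to be in time order.

```r
# Hypothetical helper (not from the thread): return the single distinct
# value in v, or stop with a message if there is more than one.
the_only <- function(v) {
  u <- unique(v)
  if (length(u) != 1) stop("expected one distinct value, found ", length(u))
  u
}

# Data from the original question
a <- data.frame(
  ID   = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
  x    = c(10, 10, 10, 12, 12, 12, 12, 5, 5),
  y    = c(20, 30, 40, 23, 25, 28, 38, 10, 15),
  time = c(0, 1, 2, 0, 1, 2, 3, 0, 2)
)

# One row per subject: ID, the invariant x, and last y minus first y
res <- t(sapply(split(a, a$ID), function(q)
  with(q, c(ID = the_only(ID), x = the_only(x), y = y[length(y)] - y[1]))))
print(res)
```

If x varied within a subject, `the_only(x)` would raise an error naming the number of distinct values, rather than silently returning a result of the wrong shape.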

Re: [R] the first and last observation for each subject

2009-01-02 Thread Frank E Harrell Jr
Here is a fast approach using the Hmisc package's summarize function.
> g <- function(w) {
+ time <- w[,'time']; y <- w[,'y']
+ c(y[which.min(time)], y[which.max(time)])}
>
> with(DF, summarize(DF, ID, g, stat.name=c('first','last')))
  ID first last
1  1    20   40
2  2    23   38
3  3    10

Re: [R] the first and last observation for each subject

2009-01-02 Thread hadley wickham
On Fri, Jan 2, 2009 at 3:20 AM, gallon li wrote:
> I have the following data
>
> ID x y time
> 1 10 20 0
> 1 10 30 1
> 1 10 40 2
> 2 12 23 0
> 2 12 25 1
> 2 12 28 2
> 2 12 38 3
> 3 5 10 0
> 3 5 15 2
> .
>
> x is time invariant, ID is the subject id number, y is changing over time.
>
> I want

Re: [R] the first and last observation for each subject

2009-01-02 Thread Gabor Grothendieck
Try this:
> Lines <- "ID x y time
+ 1 10 20 0
+ 1 10 30 1
+ 1 10 40 2
+ 2 12 23 0
+ 2 12 25 1
+ 2 12 28 2
+ 2 12 38 3
+ 3 5 10 0
+ 3 5 15 2"
> DF <- read.table(textConnection(Lines), header = TRUE)
> aggregate(DF[3], DF[1:2], function(x) tail(x, 1) - head(x, 1))
  ID  x  y
1  3  5  5
2  1 10 20

Re: [R] the first and last observation for each subject

2009-01-02 Thread Jorge Ivan Velez
Dear Gallon,
Assuming that your data is called "mydata", something like this should do the job:
newdf <- data.frame(
  ID = unique(mydata$ID),
  x = unique(mydata$x),
  y = with(mydata, tapply(y, ID, function(m) tail(m,1)-head(m,1)))
)
newdf
HTH,
Jorge
On Fri, J

Re: [R] the first and last observation for each subject

2009-01-02 Thread Carlos J. Gil Bellosta
Hello,
First, order your data by ID and time. The columns you want in your output dataframe are then
unique(ID),
tapply( x, ID, function( z ) z[ 1 ] )
and
tapply( y, ID, function( z ) z[ length( z ) ] - z[ 1 ] )
Best regards,
Carlos J. Gil Bellosta
http://www.datanalytics.com
On Fri, 2009
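The steps in this message can be put together roughly as follows. This is a minimal sketch using the data from the original question; the names `DF` and `result` are chosen here for illustration and do not come from the message.

```r
# Data from the original question
DF <- data.frame(
  ID   = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
  x    = c(10, 10, 10, 12, 12, 12, 12, 5, 5),
  y    = c(20, 30, 40, 23, 25, 28, 38, 10, 15),
  time = c(0, 1, 2, 0, 1, 2, 3, 0, 2)
)

# Step 1: order by ID and time, so that "first" and "last" within each
# subject mean the earliest and latest observation.
DF <- DF[order(DF$ID, DF$time), ]

# Step 2: build one row per subject. as.vector() drops the names and
# dimensions that tapply() attaches to its result.
result <- data.frame(
  ID = unique(DF$ID),
  x  = as.vector(tapply(DF$x, DF$ID, function(z) z[1])),
  y  = as.vector(tapply(DF$y, DF$ID, function(z) z[length(z)] - z[1]))
)
print(result)
```

Sorting first matters: `unique(DF$ID)` lists subjects in order of appearance while `tapply()` returns groups in sorted order, and the two line up only once the data are ordered by ID.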

[R] the first and last observation for each subject

2009-01-02 Thread gallon li
I have the following data
ID x y time
1 10 20 0
1 10 30 1
1 10 40 2
2 12 23 0
2 12 25 1
2 12 28 2
2 12 38 3
3 5 10 0
3 5 15 2
.
x is time invariant, ID is the subject id number, y is changing over time.
I want to find out the difference between the first and last observed y value for each