From: William Dunlap
Sent: Monday, January 05, 2009 4:01 PM
To: 'Kingsford Jones'
Subject: RE: [R] the first and last observation for each subject
> -Original Message-
> From: Kingsford Jones [mailto:kingsfordjo...@gmail.com]
> Sent: Monday, January 05, 2009 3:19
>
>
>
> On Mon, Jan 5, 2009 at 10:18 AM, William Dunlap wrote:
>> Arg, the 'sapply(...)' in the function was in the initial comment,
>> gm <- function(x, group){ # medians by group: sapply(split(x,group),median)
>> but it ended up on a line of its own, where it looks like part of the
>> function body. (Remove the sapply() call from gm().)
>
> Bill Dunlap
> TIBCO Software Inc - Spotfire Division
> wdunlap tibco.com
>
> -Original Message-
> From: hadley wickham [mailto:h.wick...@gmail.com]
> Sent: Monday, January 05, 2009 9:10 AM
> To: William Dunlap
> Cc: gallon...@gmail.com; R help
> Subject: Re: [R] the first and last observation for each subject
>
> Another application of that technique can be used to quickly compute
> medians by groups:
>
> gm <- function(x, group){ # medians by group:
> sapply(split(x,group),median)
> o<-order(group, x)
> group <- group[o]
> x <- x[o]
> changes <- group[-1] != group[-length(group)]
> first <- which(c(TRUE, changes))
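(The rest of gm() is cut off in this archive. A minimal sketch of how the
vectorized computation might continue, with the sapply() call moved into the
comment as per the correction above; this is a reconstruction, not necessarily
the original code:)

    gm <- function(x, group){ # medians by group: sapply(split(x, group), median)
      o <- order(group, x)
      group <- group[o]
      x <- x[o]
      changes <- group[-1] != group[-length(group)]
      first <- which(c(TRUE, changes))           # index of each group's first value
      last  <- c(first[-1] - 1L, length(group))  # index of each group's last value
      # within each group the values are now sorted, so the median is the
      # average of the lower and upper middle elements between first and last
      med <- (x[floor((first + last)/2)] + x[ceiling((first + last)/2)]) / 2
      names(med) <- group[first]
      med
    }

(With the example data later in the thread, gm(DF$y, DF$ID) should agree with
sapply(split(DF$y, DF$ID), median).)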
> -Original Message-
> From: hadley wickham [mailto:h.wick...@gmail.com]
> Sent: Sunday, January 04, 2009 8:56 PM
> To: William Dunlap
> Cc: gallon...@gmail.com; R help
> Subject: Re: [R] the first and last observation for each subject
>
> >> library(
>> easy to understand. Another approach is more specialized but useful
>> when you have lots of ID's (e.g., millions) and speed is very important.
>> It computes where the first and last entries for each ID are in a
>> vectorized computation, akin to the computation that rle() uses:
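(The code that followed is not preserved here. A minimal sketch of that
vectorized first/last idea, assuming the data frame DF defined further down
via read.table(); this is a reconstruction, not the original code:)

    o  <- order(DF$ID, DF$time)              # sort by subject, then time
    id <- DF$ID[o]
    y  <- DF$y[o]
    changes <- id[-1] != id[-length(id)]     # TRUE wherever a new ID starts
    first <- which(c(TRUE, changes))         # row of each ID's first observation
    last  <- c(first[-1] - 1L, length(id))   # row of each ID's last observation
    data.frame(ID = id[first], ydiff = y[last] - y[first])

(No per-ID looping is involved, which is the point when there are millions of
IDs.)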
>
> I particularly t
>> library(plyr)
>>
>> # ddply is for splitting up data frames and combining the results
>> # into a data frame. .(ID) says to split up the data frame by the
>> # subject variable
>> ddply(DF, .(ID), function(one) with(one, y[length(y)] - y[1]))
>> ...
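(For reference, with the sample data below read into DF, rows ordered by time
within ID, this ddply() call should return something like:)

      ID V1
    1  1 20
    2  2 15
    3  3  5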
>
> The above is much quicker than the vers
> [R] the first and last observation for each subject
> hadley wickham h.wickham at gmail.com
> Fri Jan 2 14:52:42 CET 2009
>
> On Fri, Jan 2, 2009 at 3:20 AM, gallon li wrote:
> > I have the following data
> >
> > ID x y time
> > 1 10 20 0
> > 1 10
Hello,
Is it truly
y=max(y)-min(y)
what you want below?
Best regards,
Carlos J. Gil Bellosta
http://www.datanalytics.com
On Fri, 2009-01-02 at 13:16 -0500, Stavros Macrakis wrote:
> I think there's a pretty simple solution here, though probably not the
> most efficient:
>
> t(sapply(split(a,a$ID),
>   function(q) with(q,c(ID=unique(ID),x=unique(x),y=max(y)-min(y)))))
I think there's a pretty simple solution here, though probably not the
most efficient:
t(sapply(split(a,a$ID),
  function(q) with(q,c(ID=unique(ID),x=unique(x),y=max(y)-min(y)))))
Using 'unique' instead of min or [[1]] has the advantage that if x is
in fact not time-invariant, this gives an error.
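(With the poster's data in a data frame called 'a', this should yield a matrix
along the lines of:)

      ID  x  y
    1  1 10 20
    2  2 12 15
    3  3  5  5

(The t() is there because sapply() returns the per-ID result vectors as the
columns of a matrix.)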
Here is a fast approach using the Hmisc package's summarize function.
> g <- function(w) {
+ time <- w[,'time']; y <- w[,'y']
+ c(y[which.min(time)], y[which.max(time)])}
>
> with(DF, summarize(DF, ID, g, stat.name=c('first','last')))
  ID first last
1  1    20    40
2  2    23    38
3  3    10    15
On Fri, Jan 2, 2009 at 3:20 AM, gallon li wrote:
> I have the following data
>
> ID x y time
> 1 10 20 0
> 1 10 30 1
> 1 10 40 2
> 2 12 23 0
> 2 12 25 1
> 2 12 28 2
> 2 12 38 3
> 3 5 10 0
> 3 5 15 2
> .
>
> x is time invariant, ID is the subject id number, y is changing over time.
>
> I want to find out the difference between the first and last observed y
> value for each subject.
Try this:
> Lines <- "ID x y time
+ 1 10 20 0
+ 1 10 30 1
+ 1 10 40 2
+ 2 12 23 0
+ 2 12 25 1
+ 2 12 28 2
+ 2 12 38 3
+ 3 5 10 0
+ 3 5 15 2"
> DF <- read.table(textConnection(Lines), header = TRUE)
> aggregate(DF[3], DF[1:2], function(x) tail(x, 1) - head(x, 1))
  ID  x  y
1  3  5  5
2  1 10 20
3  2 12 15
Dear Gallon,
Assuming that your data is called "mydata", something like this should do
the job:
newdf <- data.frame(
  ID = unique(mydata$ID),
  x = unique(mydata$x),
  y = with(mydata, tapply(y, ID, function(m) tail(m, 1) - head(m, 1)))
)
newdf
HTH,
Jorge
On Fri, J
Hello,
First, order your data by ID and time.
The columns you want in your output dataframe are then
unique(ID),
tapply( x, ID, function( z ) z[ 1 ] )
and
tapply( y, ID, function( z ) z[ length( z ) ] - z[ 1 ] )
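(Putting those pieces together, a minimal sketch, assuming the data is in a
data frame called 'mydata' as in the reply above:)

    mydata <- mydata[order(mydata$ID, mydata$time), ]   # order by ID and time
    with(mydata, data.frame(
      ID = unique(ID),
      x  = tapply(x, ID, function(z) z[1]),
      y  = tapply(y, ID, function(z) z[length(z)] - z[1])
    ))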
Best regards,
Carlos J. Gil Bellosta
http://www.datanalytics.com
On Fri, 2009
I have the following data
ID x y time
1 10 20 0
1 10 30 1
1 10 40 2
2 12 23 0
2 12 25 1
2 12 28 2
2 12 38 3
3 5 10 0
3 5 15 2
.
x is time invariant, ID is the subject id number, y is changing over time.
I want to find out the difference between the first and last observed y
value for each subject.