Arg, the 'sapply(...)' in the function was in the initial comment, gm <- function(x, group){ # medians by group: sapply(split(x,group),median) but someone's mailer put a newline before the sapply gm <- function(x, group){ # medians by group: sapply(split(x,group),median) so it got executed, wasting all that time. (Of course the same mailer will mess up this message in the same way - just delete the sapply() call from gm().)
Bill Dunlap TIBCO Software Inc - Spotfire Division wdunlap tibco.com > -----Original Message----- > From: hadley wickham [mailto:h.wick...@gmail.com] > Sent: Monday, January 05, 2009 9:10 AM > To: William Dunlap > Cc: gallon...@gmail.com; R help > Subject: Re: [R] the first and last observation for each subject > > > Another application of that technique can be used to quickly compute > > medians by groups: > > > > gm <- function(x, group){ # medians by group: > > sapply(split(x,group),median) > > o<-order(group, x) > > group <- group[o] > > x <- x[o] > > changes <- group[-1] != group[-length(group)] > > first <- which(c(TRUE, changes)) > > last <- which(c(changes, TRUE)) > > lowerMedian <- x[floor((first+last)/2)] > > upperMedian <- x[ceiling((first+last)/2)] > > median <- (lowerMedian+upperMedian)/2 > > names(median) <- group[first] > > median > > } > > > > For a 10^5 long x and a somewhat fewer than 3*10^4 distinct groups > > (in random order) the times are: > > > >> group<-sample(1:30000, size=100000, replace=TRUE) > >> x<-rnorm(length(group))*10 + group > >> unix.time(z0<-sapply(split(x,group), median)) > > user system elapsed > > 2.72 0.00 3.20 > >> unix.time(z1<-gm(x,group)) > > user system elapsed > > 0.12 0.00 0.16 > >> identical(z1,z0) > > [1] TRUE > > I get: > > > unix.time(z0<-sapply(split(x,group), median)) > user system elapsed > 2.733 0.017 2.766 > > unix.time(z1<-gm(x,group)) > user system elapsed > 2.897 0.032 2.946 > > > Hadley > > > -- > http://had.co.nz/ > ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.