Re: [R] aggregate, by, *apply

David Winsemius Wed, 15 Sep 2010 16:22:04 -0700


On Sep 15, 2010, at 5:45 PM, Mark Ebbert wrote:

Dear R gurus,
I regularly come across a situation where I would like to apply a function to a subset of data in a dataframe, but I have not found an R function to facilitate exactly what I need. More specifically, I'd like my function to have a context of where the data it's analyzing came from. Here is an example:

### BEGIN ###
func<-function(x){
        m<-median(x$x)

        if(m > 2 & m < x$y){
                return(T)
        }
        return(F)
}

The semantic question is: what are you trying to test when you say "m < x$y"? "m" is a scalar and x$y is a vector. By default only the first element of x$y will be compared (not actually callable in that manner).
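
A small sketch of that point, using made-up values rather than anything from the thread: comparing a scalar with a vector gives one logical per element, so the result has to be reduced to a single TRUE/FALSE before it goes into if(). Whether all(), any(), or the first element is the right reduction is an assumption that depends on what the test is meant to express.

```r
m <- 5                    # a scalar, like the median computed in func
y <- c(34, 34, 34)        # a vector, like x$y within one group (illustrative values)

cmp <- m < y              # elementwise comparison: one logical per element of y
print(cmp)

# Two possible reductions to a single logical; which one is wanted is an assumption
all_below   <- all(m < y) # TRUE only if m is below every y in the group
first_below <- m < y[1]   # explicit version of "compare with the first element"
print(all_below)
print(first_below)
```

Recent R versions appear to raise an error when if() is handed a condition of length greater than one, rather than quietly using the first element, so making the reduction explicit is the safer habit.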

tmp<-data.frame(x=1:10,y=c(rep(34,3),rep(35,3),rep(34,4)),z=c(rep("a",3),rep("b",3),rep("c",4)))
res<-aggregate(tmp,list(z),func)

I see Dennis has tried to move you forward to the plyr strategy, but some of us are mired in the traditional ways:


?split  # splits a dataframe into a list of segments defined by a factor
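
To see what split() hands back, here is a minimal sketch on the same tmp data from the question. The name `pieces` is only illustrative.

```r
tmp <- data.frame(x = 1:10,
                  y = c(rep(34, 3), rep(35, 3), rep(34, 4)),
                  z = c(rep("a", 3), rep("b", 3), rep("c", 4)))

# split() returns a named list with one data frame per level of z
pieces <- split(tmp, tmp$z)

names(pieces)          # the group labels
sapply(pieces, nrow)   # number of rows in each piece
pieces$b               # each piece keeps all columns: x, y and z
```

Because each piece is a whole data frame, a function applied to it with lapply() can look at x and y together.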

> func<-function(x){
+       m<-median(x["x"], na.rm=TRUE)
+       if(m > 2 && m < x["y"]){
+               return(T)
+       }
+       return(F)
+ }
>

> tmp<-data.frame(x=1:10,y=c(rep(34,3),rep(35,3),rep(34,4)),z=c(rep("a",3),rep("b",3),rep("c",4)))

> res<-lapply(split(tmp,list(tmp$z)), func)
> res
$a
[1] FALSE

$b
[1] TRUE

$c
[1] TRUE
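
The session above depends on how median() and && treated a one-column data frame at the time; on recent R versions both may raise errors instead. A hedged sketch of the same idea that extracts plain vectors with [[ ]] follows. The name `func2` is hypothetical, and comparing against the first y value of each group is an assumption about the intended test, chosen because it mirrors the "first element" behaviour.

```r
tmp <- data.frame(x = 1:10,
                  y = c(rep(34, 3), rep(35, 3), rep(34, 4)),
                  z = c(rep("a", 3), rep("b", 3), rep("c", 4)))

# func2 is a hypothetical rewrite of func; d is the data frame for one group
func2 <- function(d) {
  m <- median(d[["x"]], na.rm = TRUE)   # [[ ]] extracts a plain numeric vector
  # Assumption: compare the median with the first y value of the group
  m > 2 && m < d[["y"]][1]
}

# split/lapply, as in the session above
res <- lapply(split(tmp, tmp$z), func2)
res

# by() is a base R alternative that also passes each group's rows as a data frame
res_by <- by(tmp, tmp$z, func2)
res_by
```

Both calls give func2 all the columns for a group at once, which is the context the original aggregate() call could not provide.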

### END ###
The values in the example are trivial, but the problem is that only one column is passed to my function at a time, so I can't determine how 'm' relates to 'x$y'. Any tips/guidance is appreciated.
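
The one-column-at-a-time behaviour described here can be checked directly. In this sketch (the anonymous function and the name `seen` are illustrative only), aggregate() hands FUN a plain vector for each column within each group, never the group's data frame, which is why func cannot see x and y together.

```r
tmp <- data.frame(x = 1:10,
                  y = c(rep(34, 3), rep(35, 3), rep(34, 4)),
                  z = c(rep("a", 3), rep("b", 3), rep("c", 4)))

# FUN reports whether the object it received is a data frame;
# aggregate() calls it once per group for each of the columns x and y
seen <- aggregate(tmp[c("x", "y")], by = list(z = tmp$z),
                  FUN = function(v) is.data.frame(v))
seen
```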

--

David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
