Re: [R] column selection for aggregate()

Ivan Calandra Mon, 18 Jan 2010 08:30:08 -0800

Hi!

It looks like it works perfectly.
However, since I cannot check whether the result is correct, could 
you please let me know if you see any mistakes?


Here is the code:
ssfamean <- summaryBy(. ~ SPECSHOR + BONE + TO_POS + FACETTE + SHEARFAC + ENA_BA,
                      data = subset(ssfa, select = -c(MEASUREM, SEL_FACET, SEL_MEAS)),
                      FUN = mean)

That should give me the mean of every numeric variable, grouped by 
SPECSHOR+BONE+TO_POS+FACETTE+SHEARFAC+ENA_BA (i.e. *the mean of the rows 
with equal values for all these variables*), computed on the data frame 
ssfa without the columns MEASUREM, SEL_FACET and SEL_MEAS, right?
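
For comparison, here is a minimal base-R sketch of the same kind of grouped mean, with the numeric columns picked by type and the grouping columns picked by exclusion instead of by name. The data frame `toy` and its values are hypothetical stand-ins for ssfa (only the column roles match), and `aggregate()` is used here in place of `summaryBy()`, so it illustrates the idea rather than reproducing the call above.

```r
# Toy stand-in for ssfa: hypothetical values, only the column roles match.
toy <- data.frame(
  SPECSHOR = c("a", "a", "a", "b", "b", "b"),
  BONE     = c("x", "x", "y", "x", "x", "y"),
  MEASUREM = c("m1", "m2", "m1", "m1", "m2", "m1"),  # column to leave out
  Asfc     = c(1, 3, 5, 2, 4, 6),
  Smc      = c(10, 20, 30, 40, 50, 60),
  stringsAsFactors = FALSE
)

# Numeric columns are found by type; grouping columns are all
# non-numeric columns except the ones to be excluded.
is_num     <- sapply(toy, is.numeric)
group_cols <- setdiff(names(toy)[!is_num], "MEASUREM")

# One output row per distinct combination of the grouping columns,
# holding the mean of each numeric column over the matching rows.
toy_mean <- aggregate(toy[is_num], by = toy[group_cols], FUN = mean)
print(toy_mean)
```

In this toy data the group SPECSHOR = "a", BONE = "x" contains two rows, so its Asfc mean is (1 + 3) / 2 = 2 and its Smc mean is (10 + 20) / 2 = 15.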

Sorry to ask such a stupid question, but this line produces the data I 
have to analyze, so I cannot afford any mistake here (nor anywhere else, 
of course, but here I cannot really check).
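
One possible way to check a grouped result is to recompute a single group by hand from the raw rows and compare it with the matching row of the output. The sketch below does this on a small hypothetical data frame `toy` (not the real ssfa) and uses base `aggregate()` as the function under test; the same spot check could be applied to any grouped-mean result with the same layout.

```r
# Hypothetical toy data: two grouping columns and two numeric columns.
toy <- data.frame(
  SPECSHOR = c("a", "a", "a", "b", "b", "b"),
  BONE     = c("x", "x", "y", "x", "x", "y"),
  Asfc     = c(1, 3, 5, 2, 4, 6),
  Smc      = c(10, 20, 30, 40, 50, 60),
  stringsAsFactors = FALSE
)

# Grouped means to be checked.
toy_mean <- aggregate(cbind(Asfc, Smc) ~ SPECSHOR + BONE,
                      data = toy, FUN = mean)

# Recompute one group directly from the raw rows.
rows_ax <- toy$SPECSHOR == "a" & toy$BONE == "x"
by_hand <- colMeans(toy[rows_ax, c("Asfc", "Smc")])

# Pull the matching row out of the grouped result.
hit         <- toy_mean$SPECSHOR == "a" & toy_mean$BONE == "x"
from_result <- unlist(toy_mean[hit, c("Asfc", "Smc")])

# TRUE when both computations agree for this group.
check <- isTRUE(all.equal(unname(by_hand), unname(from_result)))
print(check)

# The group sizes should add up to the number of input rows;
# a smaller total would mean some rows were dropped along the way.
group_n <- aggregate(Asfc ~ SPECSHOR + BONE, data = toy, FUN = length)
print(sum(group_n$Asfc) == nrow(toy))
```

Repeating the spot check for a few groups chosen at random gives reasonable confidence without having to verify every row.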

Thanks in advance
Ivan


Gabor Grothendieck wrote:
> Try summaryBy in the doBy package. For example, using the built-in CO2
> data set, summarize each numeric variable by each factor except the
> factors Plant and Type:
>
> library(doBy)
> summaryBy(. ~ ., data = subset(CO2, select = - c(Plant, Type)))
>
>
> On Mon, Jan 18, 2010 at 9:53 AM, Ivan Calandra
> <ivan.calan...@uni-hamburg.de> wrote:
>   
>> Hi everybody!
>>
>> I'm working in R today, so I have a lot of questions (you may have
>> noticed that this is my 3rd email today). I'm new to R, so please excuse
>> the "spam"!
>>
>> I have a dataset "ssfa" with many rows and the column names are:
>>  > names(ssfa)
>>  [1] "SPECSHOR"  "BONE"      "TO_POS"    "MEASUREM"  "FACETTE"   "SHEARFAC"
>>  [7] "ENA_BA"    "SEL_FACET" "SEL_MEAS"  "Asfc"      "Smc"       "epLsar"
>> [13] "HAsfc4"    "HAsfc9"    "HAsfc16"   "HAsfc25"   "HAsfc36"   "HAsfc49"
>> [19] "HAsfc64"   "HAsfc81"   "HAsfc100"  "HAsfc121"  "Tfv"       "Ftfv"
>>
>> I want to aggregate this way:
>> ssfamean <- aggregate(ssfa[c("Asfc", "Smc", "epLsar", "HAsfc4",
>> "HAsfc9", "HAsfc16", "HAsfc25", "HAsfc36", "HAsfc49", "HAsfc64",
>> "HAsfc81", "HAsfc100", "HAsfc121", "Tfv", "Ftfv")], ssfa[c("SPECSHOR",
>> "BONE", "TO_POS", "FACETTE", "SHEARFAC", "ENA_BA")], mean)
>>
>> As you can see, it is very long since I have many variables. Basically I
>> want to select all numeric variables (columns 10 to 24) and all
>> categorical variables except MEASUREM, SEL_FACET and SEL_MEAS, without
>> having to write each of them out. I would also like to avoid writing the
>> names; selecting by index would be nice.
>> I tried with:
>>  > ssfamean <- aggregate(ssfa[c(ssfa[[10]]:ssfa[[24]])],
>> ssfa[c("SPECSHOR", "BONE", "TO_POS", "FACETTE", "SHEARFAC", "ENA_BA")],
>> mean)
>> but it obviously doesn't work (well "obviously"...)
>>
>> Could anyone help me on this?
>> Thanks in advance
>> Ivan
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>     
>
>   

