Re: [R] column selection for aggregate()

Ivan Calandra Tue, 19 Jan 2010 01:37:19 -0800

Not really: I tried without select = -c(MEASUREM, SEL_FACET, SEL_MEAS),
and indeed no mean was computed for those columns, but they still appeared
in the result, which I didn't want.


Thanks a lot for your help
Ivan


Gabor Grothendieck wrote:
> It looks ok except you have both specified the wanted factors and
> removed the undesired factors from the data frame.  You only need to
> do one of these as in the example I gave, not both, so the solution
> could be simpler.
>
> On Mon, Jan 18, 2010 at 11:19 AM, Ivan Calandra
> <ivan.calan...@uni-hamburg.de> wrote:
>   
>> Hi!
>>
>> It looks like it works perfectly.
>> However, since I cannot check whether I get the correct result, can you
>> please let me know if you see any mistakes?
>>
>> Here is the code:
>> ssfamean <- summaryBy(.~SPECSHOR+BONE+TO_POS+FACETTE+SHEARFAC+ENA_BA, data =
>> subset(ssfa, select = - c(MEASUREM, SEL_FACET, SEL_MEAS)), FUN=mean)
>>
>> That should give me the mean for all numerical variables grouped by
>> SPECSHOR+BONE+TO_POS+FACETTE+SHEARFAC+ENA_BA (i.e. the mean of the rows with
>> equal values for all these variables) on the data file ssfa without the
>> columns for MEASUREM, SEL_FACET, SEL_MEAS, right?
>>
>> Sorry to ask such a stupid question, but this line will give me the data I
>> have to analyze, so I cannot afford to make any mistake here (nor anywhere
>> else, of course, but here I cannot really check).
>>
>> Thanks in advance
>> Ivan
>>
>>
>> Gabor Grothendieck wrote:
>>
>> Try summaryBy in the doBy package. For example, using the built-in CO2
>> data set, summarize each numeric variable by each factor except for the
>> factors Plant and Type:
>>
>> library(doBy)
>> summaryBy(. ~ ., data = subset(CO2, select = - c(Plant, Type)))
>>
>>
>> On Mon, Jan 18, 2010 at 9:53 AM, Ivan Calandra
>> <ivan.calan...@uni-hamburg.de> wrote:
>>
>>
>> Hi everybody!
>>
>> I'm working with R today so I have a lot of questions (you may have
>> noticed that this is the 3rd email today). I'm new to R, so please excuse
>> the "spam"!
>>
>> I have a dataset "ssfa" with many rows and the column names are:
>>  > names(ssfa)
>>  [1] "SPECSHOR"  "BONE"      "TO_POS"    "MEASUREM"  "FACETTE"   "SHEARFAC"
>>  [7] "ENA_BA"    "SEL_FACET" "SEL_MEAS"  "Asfc"      "Smc"       "epLsar"
>> [13] "HAsfc4"    "HAsfc9"    "HAsfc16"   "HAsfc25"   "HAsfc36"   "HAsfc49"
>> [19] "HAsfc64"   "HAsfc81"   "HAsfc100"  "HAsfc121"  "Tfv"       "Ftfv"
>>
>> I want to aggregate this way:
>> ssfamean <- aggregate(ssfa[c("Asfc", "Smc", "epLsar", "HAsfc4",
>> "HAsfc9", "HAsfc16", "HAsfc25", "HAsfc36", "HAsfc49", "HAsfc64",
>> "HAsfc81", "HAsfc100", "HAsfc121", "Tfv", "Ftfv")], ssfa[c("SPECSHOR",
>> "BONE", "TO_POS", "FACETTE", "SHEARFAC", "ENA_BA")], mean)
>>
>> As you can see, it is very long since I have many variables. Basically I
>> want to select all numerical variables (columns 10 to 24) and all
>> categorical variables except MEASUREM, SEL_FACET and SEL_MEAS without
>> having to write each of them. I would also like to avoid writing out the
>> names; using the column indexes would be nice.
>> I tried with:
>>  > ssfamean <- aggregate(ssfa[c(ssfa[[10]]:ssfa[[24]])],
>> ssfa[c("SPECSHOR", "BONE", "TO_POS", "FACETTE", "SHEARFAC", "ENA_BA")],
>> mean)
>> but it obviously doesn't work (well "obviously"...)
>>
>> Could anyone help me with this?
>> Thanks in advance
>> Ivan
>>
>>        [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>
>>
>>     
>
>   
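A note on the aggregate() attempt quoted above: ssfa[[10]]:ssfa[[24]] uses the values stored in columns 10 and 24 as the endpoints of a sequence, whereas selecting columns by position only needs ssfa[10:24]. Below is a minimal base-R sketch of that index-based selection. Since ssfa itself is not available, it uses the built-in CO2 data set from the summaryBy example as a stand-in; the object names (co2, value_idx, group_idx, co2mean, co2mean2) are illustrative only, and the positions would need adjusting for ssfa.

```r
# CO2 ships with R; columns: Plant, Type, Treatment (factors), conc, uptake (numeric).
# as.data.frame() drops the extra groupedData classes and leaves a plain data frame.
co2 <- as.data.frame(CO2)

# Pick columns by position instead of typing every name.
value_idx <- which(sapply(co2, is.numeric))          # positions of the numeric columns
group_idx <- setdiff(which(sapply(co2, is.factor)),  # positions of all factors ...
                     match("Plant", names(co2)))     # ... except the unwanted one

# Mean of every numeric column within each combination of the grouping factors.
co2mean <- aggregate(co2[value_idx], by = co2[group_idx], FUN = mean)

# Same idea with the formula interface: drop the unwanted column first,
# then "." on the left-hand side stands for all remaining non-grouping columns.
co2mean2 <- aggregate(. ~ Type + Treatment, data = co2[-1], FUN = mean)
```

Assuming the column order shown by names(ssfa) in the original question, the analogous call would presumably be aggregate(ssfa[10:24], by = ssfa[c(1:3, 5:7)], FUN = mean). Columns that are in neither index set (here MEASUREM, SEL_FACET and SEL_MEAS) simply do not appear in the result.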

