Hi

[EMAIL PROTECTED] wrote on 17.09.2007 14:29:17:

> Dear All,
>
> I tried to aggregate the rows according to some factors in a data frame.
> I got the
> "Error in Summary.factor(..., na.rm = na.rm) :
> sum not meaningful for factors"
> message. This problem was once already discussed in 2003 on this list,
> where the following solution was given: include only those columns -when
> giving it to aggregate() - that are not factors.
>
> It also worked for me, but this solution is a bit odd, since there is no
> need to sum the factors given as grouping variables. Of course I may do
> something completely wrong.
> help(aggregate) says:
> ## S3 method for class 'data.frame': aggregate(x, by, FUN, ...)
> |x| an R object.
> |by| a list of grouping elements, each as long as the variables in |x|.
> Names for the grouping variables are provided if they are not given. The
> elements of the list will be coerced to factors (if they are not already
> factors).
>
> In my interpretation this means that the factor variables and the
> numeric variables are in the same data frame, namely x.
>
> The data frame looks like this (its mortality from cerebrovascular
> diseases):
>
> str(agyer)
> 'data.frame': 102 obs. of 65 variables:
> $ Country : int 4055 4055 4055 4055 4055 4055 4055 4055 4055 4055 ...
> $ Name : Factor w/ 5 levels "Estonia","Latvia",..: 1 1 1 1 1 1 1 1 1 1 ...
> $ Year : int 1997 1997 1998 1999 1999 1999 2000 2000 2000 2001 ...
> $ List : int 103 103 103 103 103 103 103 103 103 103 ...
> $ Sex : int 2 1 2 2 1 2 2 1 1 2 ...
> $ Morticd10_103_Frmat: int 1 1 1 1 1 1 1 1 1 1 ...
> $ IM_Frmat : int 1 1 1 1 1 1 1 1 1 1 ...
> $ Deaths1 : int 33 179 143 1428 83 61 3 759 29 4 ...
> and a bunch of other int variables.
>
> After omitting agyer$Name, I do
>
> agyerpr=aggregate(agyer, by=list(agyer$Country, agyer$Year,
> agyer$List, agyer$Sex, agyer$Morticd10_103_Frmat, agyer$IM_Frmat), sum)

If this is the command you issued, it tries to aggregate the whole data
frame agyer including a factor variable Name, hence the error.

You probably want to sum only the Deaths1 column, grouped by the values of
the other variables, so you can do

agyerpr <- with(agyer, aggregate(Deaths1, by=list(Country, Year, List, Sex,
Morticd10_103_Frmat, IM_Frmat), sum))

aggregate() applies the function to every variable in the R object given as
x, and if a variable is not suitable for the function (as a factor is not
for sum) the call results in an error. If you want to omit some columns from
the aggregation, just put agyer[, -c(column.numbers)] in the x position of
the aggregate command.

Regards
Petr

>
> The sum is done on -the already omitted - factor of "Cause".
>
> I do not understand why it tries to sum a factor that is included in the
> "by" list, since the concept is not to sum for those included, but use
> them for grouping. I am lucky with this database because all the factors
> can be interpreted as integers and I do not have to omit them one by
> one, but what if not?
>
> Am I missing something with aggregate or classes?
>
> Thanks for your help!
>
> Sincerely,
> Peter Mihalicza
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
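
For reference, the points made in the reply above can be tried on a small made-up data frame. This is only a sketch: the values, the reduced set of columns, and the helper names by_groups, is_fac and value_cols are assumptions for illustration and are not taken from the real agyer data. The formula call at the end is an alternative that was not mentioned in the thread.

```r
# Toy stand-in for 'agyer' (invented values, only a few of the 65 columns)
agyer <- data.frame(
  Country = c(4055, 4055, 4055, 4186, 4186, 4186),
  Name    = factor(c("Estonia", "Estonia", "Estonia",
                     "Latvia", "Latvia", "Latvia")),
  Year    = c(1997, 1997, 1998, 1997, 1997, 1998),
  Sex     = c(1, 1, 2, 1, 2, 2),
  Deaths1 = c(33, 179, 143, 10, 20, 30)
)

# 1. Passing the whole data frame as x: sum() is applied to every column,
#    including the factor Name, so the call fails. The error message is
#    captured here instead of stopping the script.
err <- tryCatch(
  aggregate(agyer, by = list(agyer$Country, agyer$Year, agyer$Sex), FUN = sum),
  error = function(e) conditionMessage(e)
)

# 2. Summing a single column, as suggested in the reply. Naming the list
#    elements gives readable group columns; the summed column is called 'x'.
deaths <- with(agyer,
               aggregate(Deaths1,
                         by = list(Country = Country, Year = Year, Sex = Sex),
                         FUN = sum))

# 3. Dropping columns from x without counting column numbers by hand:
#    keep everything that is neither a factor nor a grouping variable.
by_groups  <- agyer[c("Country", "Year", "Sex")]
is_fac     <- vapply(agyer, is.factor, logical(1))
value_cols <- setdiff(names(agyer)[!is_fac], names(by_groups))
no_factors <- aggregate(agyer[value_cols], by = by_groups, FUN = sum)

# 4. Alternative not from the thread: the formula interface of aggregate()
by_formula <- aggregate(Deaths1 ~ Country + Year + Sex, data = agyer, FUN = sum)
```

Variant 3 also leaves the grouping variables out of x, so they are used for grouping only and are not summed themselves, which is the behaviour the original question was asking about.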