On Jan 30, 2010, at 4:46 PM, david hilton shanabrook wrote:


On 30 Jan 2010, at 4:20 PM, David Winsemius wrote:


On Jan 30, 2010, at 4:09 PM, david hilton shanabrook wrote:

I have a data frame with two columns, a factor and a numeric. I want to create a data frame with the factor, its frequency, and the median of the numeric column:
head(motifList)
events     score
1  aeijm -0.25000000
2  begjm -0.25000000
3  afgjm -0.25000000
4  afhjm -0.25000000
5  aeijm -0.25000000
6  aehjm  0.08333333

To get the frequency table of events:

motifTable <- as.data.frame(table(motifList$events))
head(motifTable)
Var1 Freq
1 aeijm  110
2 begjm   46
3 afgjm  337
4 afhjm  102
5 aehjm  190
6 adijm   18


Now get the score column back in.

motifTable2 <- merge(motifList, motifTable, by="events")
head(motifTable2)
events     percent freq
1  adgjm  0.00000000  111
2  adgjm          NA  111
3  adgjm  0.13333333  111
4  adgjm  0.06666667  111
5  adgjm -0.16666667  111
6  adgjm          NA  111


Lastly, aggregate on the events column to get the median of the score:
motifTable3 <- aggregate.data.frame(motifTable2, by=list(motifTable2$events), FUN=median, na.rm=TRUE)
Error in median.default(X[[1L]], ...) : need numeric data

This gives the error because events is a factor. Can someone point me to a more obvious approach?

I don't think grouping on a factor is the source of your error. You have NA's in your data and median will choke on those unless you specify na.rm=TRUE.

--

I thought the na.rm=TRUE in the aggregate function would do this (see above). I also tried it with

I missed that.
medianRmNa <- function(data) {
        median(data, na.rm=TRUE)
}

motifTable3 <- aggregate.data.frame(motifTable2, by=list(motifTable2$events), FUN=medianRmNa)
Error in median.default(data, na.rm = TRUE) : need numeric data
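The "need numeric data" message can be reproduced in isolation. The following is a minimal sketch on made-up toy vectors (the names score and events mirror the columns above, but the values are assumptions, not the poster's data). It suggests that NA values alone make median() return NA rather than stop, while passing a factor triggers the error whether or not na.rm=TRUE is set.

```r
# Toy data, not taken from the original post.
score  <- c(-0.25, NA, 0.5)
events <- factor(c("aeijm", "begjm", "aeijm"))

# Missing values propagate: the result is NA, but there is no error.
median(score)

# With na.rm = TRUE the NA is dropped before the median is taken.
median(score, na.rm = TRUE)

# A factor is not numeric, so median() stops with "need numeric data"
# regardless of na.rm. try() captures the error so the script continues.
res <- try(median(events, na.rm = TRUE), silent = TRUE)
inherits(res, "try-error")
```

Because aggregate() applies FUN to every column of its first argument, leaving the factor column in that data frame is enough to hit this error.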

Apparently you cannot include the grouping variable in the first argument to aggregate:

motifTable3 <- aggregate(motifTable2[ , -1], by=list(motifTable2$events), FUN=median, na.rm=TRUE)

> motifTable3
  Group.1       score freq
1   aehjm  0.08333333    1
2   aeijm -0.25000000    2
3   afgjm -0.25000000    1
4   afhjm -0.25000000    1
5   begjm -0.25000000    1
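As one possible shortcut, the frequency and the median can be computed directly from the original two-column data frame, without the merge step. This is only a sketch under assumptions: the toy motifList below is invented for illustration, and motifSummary, freq and med are hypothetical names that do not appear in the thread.

```r
# Toy data shaped like motifList; the values are illustrative only.
motifList <- data.frame(
  events = factor(c("aeijm", "begjm", "afgjm", "aeijm", "begjm", "aeijm")),
  score  = c(-0.25, -0.25, 0.10, 0.25, NA, 0.00)
)

# Frequency of each level (counts rows, including rows with an NA score).
freq <- table(motifList$events)

# Median score per level; na.rm = TRUE is passed through to median().
med <- tapply(motifList$score, motifList$events, median, na.rm = TRUE)

# table() and tapply() both order their results by factor level,
# so the two vectors line up row by row.
motifSummary <- data.frame(
  events = names(freq),
  freq   = as.vector(freq),
  median = as.vector(med)
)
motifSummary
```

Note that freq here counts all rows for a level, whereas the freq column in the aggregate() output above is the median of the merged freq column.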



Same error.

I did leave a line out of the above script:

names(motifTable) <- c("events", "freq")
which helps explain why the merge works
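To illustrate why the rename matters, here is a small sketch on invented toy data (the values, and the helper variable failed, are assumptions for illustration). as.data.frame(table(...)) names its columns Var1 and Freq, so merging by "events" should only succeed once the columns are renamed.

```r
# Toy data, not taken from the original post.
motifList <- data.frame(
  events = factor(c("aeijm", "begjm", "aeijm")),
  score  = c(-0.25, 0.5, 0.25)
)

motifTable <- as.data.frame(table(motifList$events))
names(motifTable)   # default column names: "Var1" "Freq"

# Before the rename, motifTable has no "events" column, so merge() stops.
failed <- inherits(
  try(merge(motifList, motifTable, by = "events"), silent = TRUE),
  "try-error"
)

# After the rename, both data frames share the "events" column.
names(motifTable) <- c("events", "freq")
motifTable2 <- merge(motifList, motifTable, by = "events")
motifTable2
```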

dhs



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

