Hi:

It's hard to diagnose the problem without an illustrative example. Perhaps the following might help:
(1) When writing a function to use in ddply(), make a generic data frame the input argument to the function and refer to the variables within the function either with the $ notation or inside with(dataframe, ...). This is because you want to apply the function to each sub-data frame indexed by combinations of the grouping factors.

(2) The function in (1) should return either a scalar quantity or a data frame.

(3) If you're computing groupwise scalar summaries, make sure the third argument of ddply() is summarise, as in

ddply(mydf, .(grp1, grp2), summarise, mean = mean(y, na.rm = TRUE), sd = sd(y, na.rm = TRUE))

I don't think as.data.frame.function(f) ... is going to work. Data frames and functions are two quite different types of objects. If you're trying to write a function that returns a data frame, then see point (2) above.

Here's an example with a few different versions of what is basically the same function. Observe how they are handled in ddply().

mydf <- data.frame(grp1 = rep(LETTERS[1:3], each = 20),
                   grp2 = rep(rep(letters[1:2], each = 10), 3),
                   w = rpois(60, 10),
                   x = rpois(60, 5),
                   y = rbinom(60, 1, 0.5))

# One can use either with() to temporarily attach a data frame for the
# purpose of the calculation or use the $ notation to refer to components
# of a data frame. Either works, as shown below.
f <- function(df) {
    u <- with(df, (w + x)/2 + y)
    v <- df$x + df$w * df$y
    data.frame(u = u, v = v)
}

# In this function, the reference to the data frame is never invoked.
h <- function(df) {
    u <- (w + x)/2 + y
    v <- x + w * y
    data.frame(u = u, v = v)
}

# This returns both the original and newly created variables
g <- function(df) {
    df <- transform(df, u = (w + x)/2 + y, v = x + w * y)
    df
}

# Returns only the variables u and v + grouping variables;
# the originals w, x, y are gone
ddply(mydf, .(grp1, grp2), f)

# Returns the original data frame; the new variables u and v are not added.
# In this case, ddply silently ignores the function f
ddply(mydf, .(grp1, grp2), transform, f)

# This gets it right
ddply(mydf, .(grp1, grp2), g)

# What happens when you use variable names without accessing the referent data frame
ddply(mydf, .(grp1, grp2), h)

HTH,
Dennis

On Wed, Apr 13, 2011 at 12:40 PM, 1Rnwb <sbpuro...@gmail.com> wrote:
> Hello all,
>
> I have arranged my data as per Dennis's suggestion in this post
> http://www.mail-archive.com/r-help@r-project.org/msg107156.html.
> the posted code works fine but when I try to apply it to my data, i get
> "> u2 <- ddply(xxm, .(plateid, cytokine), as.data.frame.function(f))
> Error in t.test.formula(conc ~ Self_T1D, data = df, na.rm = T) :
>   grouping factor must have exactly 2 levels".
> Self_T1D has two levels "N" and "Y"
>
> I have used the ddply function to do the mean and sd for the same dataframe
> without any issues.
> I would appreciate help to solve this.
> Thanks
> Sharad
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/error-for-ttest-tp3448056p3448056.html
> Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
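
P.S. A possible sketch for the t-test case in the quoted message, following points (1) and (2): the function takes a sub-data frame and returns a one-row data frame. This is only a guess at the cause, since the real data were not posted. It assumes the error arises because some plateid/cytokine subgroup contains only one level of Self_T1D, even though the full data frame has both. The toy data and the function name tt are made up for illustration; only the column names come from the quoted message.

```r
library(plyr)

# Hypothetical toy data; only the column names follow the quoted message.
set.seed(1)
xxm <- data.frame(
  plateid  = rep(c("p1", "p2"), each = 12),
  cytokine = rep(rep(c("IL2", "IL6"), each = 6), 2),
  Self_T1D = rep(c("N", "Y"), 12),
  conc     = rnorm(24, mean = 10)
)
# Force one subgroup to contain a single level, to mimic the assumed cause.
xxm$Self_T1D[xxm$plateid == "p2" & xxm$cytokine == "IL6"] <- "N"

# Takes a sub-data frame and returns a one-row data frame.
# Subgroups without two usable levels get NA instead of an error.
tt <- function(df) {
  df <- df[!is.na(df$conc) & !is.na(df$Self_T1D), ]
  counts <- table(df$Self_T1D)
  if (length(unique(df$Self_T1D)) != 2 || any(counts < 2)) {
    return(data.frame(t = NA_real_, p = NA_real_))
  }
  res <- t.test(conc ~ Self_T1D, data = df)
  data.frame(t = unname(res$statistic), p = res$p.value)
}

u2 <- ddply(xxm, .(plateid, cytokine), tt)
u2
```

Running table(xxm$plateid, xxm$cytokine, xxm$Self_T1D) on the real data first would show whether any subgroup is in fact missing a level; if none is, the cause lies elsewhere and this guard will not help.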