Hi:

It's hard to diagnose the problem without an illustrative example. Perhaps
the following might help:

(1) When writing a function to use in ddply(), make a generic data frame the
    input argument of the function, and refer to the variables inside the
    function either with the $ notation or through with(dataframe, ...).
    This matters because ddply() applies the function to each sub-data frame
    indexed by the combinations of the grouping factors.
(2) The function in (1) should return either a scalar quantity or a data
    frame.
(3) If you're computing groupwise scalar summaries, make sure the third
    argument of ddply() is summarise, as in
      ddply(mydf, .(grp1, grp2), summarise,
            mean = mean(y, na.rm = TRUE), sd = sd(y, na.rm = TRUE))

I don't think as.data.frame.function(f) ... is going to work. Data frames
and functions are two quite different types of objects. If you're trying to
write a function that returns a data frame, then see point (2) above.
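
As a rough sketch of point (2) applied to the t-test in the original post: the quoted error is what t.test() reports when the grouping variable in the data it is given does not have exactly two levels. One possible cause, which cannot be confirmed without the data, is that some plateid/cytokine subset contains only one of the two Self_T1D levels. The data frame xxm below is simulated, and the function name tfun and its output columns are made up for illustration; only the column names plateid, cytokine, Self_T1D and conc come from the post.

```r
library(plyr)

# Simulated stand-in for the poster's 'xxm' (the real data are not shown)
set.seed(42)
xxm <- data.frame(
  plateid  = rep(c("P1", "P2"), each = 12),
  cytokine = rep(rep(c("IL2", "IL6"), each = 6), 2),
  Self_T1D = rep(c("N", "Y"), 12),
  conc     = rnorm(24, mean = 50, sd = 10),
  stringsAsFactors = FALSE
)
# Force one subgroup to contain a single level of Self_T1D
xxm$Self_T1D[xxm$plateid == "P2" & xxm$cytokine == "IL6"] <- "N"

# Hypothetical helper: takes a sub-data frame, returns a one-row data frame
tfun <- function(d) {
  d <- d[!is.na(d$conc), ]
  n <- table(d$Self_T1D)
  n <- n[n > 0]                      # ignore levels absent from this subset
  if (length(n) != 2 || any(n < 2)) {
    # Not enough data for a two-sample test in this subset: return NAs
    return(data.frame(t = NA_real_, dfree = NA_real_, p.value = NA_real_))
  }
  tt <- t.test(conc ~ Self_T1D, data = d)
  data.frame(t = unname(tt$statistic),
             dfree = unname(tt$parameter),
             p.value = tt$p.value)
}

res <- ddply(xxm, .(plateid, cytokine), tfun)
res
```

Returning a row of NAs rather than stopping is a design choice: it keeps one row per group, so the subsets that could not be tested are easy to spot afterwards.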

Here's an example with a few different versions of what is basically the
same function. Observe how they are handled in ddply().

library(plyr)

mydf <- data.frame(grp1 = rep(LETTERS[1:3], each = 20),
                   grp2 = rep(rep(letters[1:2], each = 10), 3),
                   w = rpois(60, 10),
                   x = rpois(60, 5),
                   y = rbinom(60, 1, 0.5))

# One can use either with() to evaluate an expression inside a data frame
# for the purpose of the calculation, or the $ notation to refer to the
# components of a data frame. Either works, as shown below.
f <- function(df) {
    u <- with(df, (w + x)/2 + y)
    v <- df$x + df$w * df$y
    data.frame(u = u, v = v)
  }

# In this function, the data frame argument df is never used: w, x and y
# are referred to without df$ or with().
h <- function(df) {
    u <- (w + x)/2 + y
    v <- x + w * y
    data.frame(u = u, v = v)
  }

# This returns both the original and newly created variables
g <- function(df) {
     df <- transform(df,
                      u = (w + x)/2 + y,
                      v = x + w * y
                    )
     df
  }

# Returns only the variables u and v + grouping variables; the originals
# w, x, y are gone
ddply(mydf, .(grp1, grp2), f)
# Returns the original data frame; the new variables u and v are not added.
# In this case, ddply silently ignores the function f
ddply(mydf, .(grp1, grp2), transform, f)
# This gets it right
ddply(mydf, .(grp1, grp2), g)
# What happens when you use variable names without accessing the referent
# data frame
ddply(mydf, .(grp1, grp2), h)
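
A small self-contained check of that last case, assuming no objects named w, x or y exist in the global workspace. The data frame toy and the function h_fixed are made up for illustration. Under that assumption the call with h stops with an error, because w, x and y are looked up outside the data frame; wrapping the same expressions in with() makes them resolve inside each sub-data frame.

```r
library(plyr)

# Toy data for illustration only
toy <- data.frame(grp = rep(c("a", "b"), each = 3),
                  w = 1:6,
                  x = 6:1,
                  y = rep(0:1, 3))

# Same shape as h above: df is passed in but never used
h <- function(df) {
  u <- (w + x)/2 + y
  v <- x + w * y
  data.frame(u = u, v = v)
}

# Hypothetical fix: evaluate the same expressions inside df
h_fixed <- function(df) {
  with(df, data.frame(u = (w + x)/2 + y, v = x + w * y))
}

# Capture the error message instead of stopping the script
msg <- tryCatch(ddply(toy, .(grp), h),
                error = function(e) conditionMessage(e))
msg

res <- ddply(toy, .(grp), h_fixed)
res
```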


HTH,
Dennis

On Wed, Apr 13, 2011 at 12:40 PM, 1Rnwb <sbpuro...@gmail.com> wrote:

> Hello all,
>
> I have arranged my data as per Dennis's suggestion in this post
> http://www.mail-archive.com/r-help@r-project.org/msg107156.html.
> the posted code works fine but when I try to apply it to my data, i get ">
> u2 <- ddply(xxm, .(plateid, cytokine), as.data.frame.function(f))
> Error in t.test.formula(conc ~ Self_T1D, data = df, na.rm = T) :
>  grouping factor must have exactly 2 levels".
> Self_T1D has two levels "N" and "Y"
>
> I have used the ddply function to do the mean and sd for the same dataframe
> without any issues.
> I would appreciate help to solve this.
> Thanks
> Sharad
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/error-for-ttest-tp3448056p3448056.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
