Hi:

I'm a big fan of the reshape package, but this time I think the doBy and plyr packages may better suit your needs. Since you mentioned wanting the min/mean/max of several variables simultaneously, I dropped line54 and added three vectors of standard normal, N(0, 1), random numbers for testing:

test <- data.frame(mF[, -5], x1 = rnorm(23), x2 = rnorm(23), x3 = rnorm(23))

### doBy approach:

# Create a function for doBy to use on a specific variable:
f <- function(x) {
    c(min  = min(x, na.rm = TRUE),
      mean = mean(x, na.rm = TRUE),
      max  = max(x, na.rm = TRUE))
}

library(doBy)
> summaryBy(x1 + x2 + x3 ~ Season, data = test, FUN = f)
  Season    x1.min    x1.mean   x1.max     x2.min     x2.mean    x2.max
1      1 -1.108496 -0.2590727 1.692468 -0.8958644 -0.00485722 0.6525678
2      2 -1.686261  0.4655741 2.097220 -0.9484292  0.37197098 2.6325965
3      3 -1.093520 -0.2049273 0.390061 -0.6886613  0.49534667 2.4263802
       x3.min     x3.mean    x3.max
1 -2.07369239 -0.05164301 1.6199843
2 -0.43556155  0.31221804 1.1939009
3 -0.04847558  0.15200570 0.4355102

The LHS of the formula consists of the variables you want summarized and the RHS contains the grouping variable(s). The data supplied MUST be a data frame, and FUN is the function you want applied to each variable. In this case, the function returns a vector of the min, mean and max of the input variable. Notice that the names given in the function are appended to the variable name, separated by a dot. (A nice touch by the package author...) If you have a number of variables to summarize in this fashion, doBy is well suited to the task because the syntax is pretty straightforward.

#### plyr approach

To accomplish the same task in plyr with ddply(), you have to be a little more clever: use numcolwise() in combination with each(). numcolwise() applies the same function to each numeric variable in the input data frame; each() applies the list of functions supplied as its arguments to a single input variable.
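
Before composing the two, it may help to see each piece on its own. The snippet below is only a small sketch: the vector v and the data frame toy are made-up illustrations (they are not part of Ben's data), and it assumes the plyr package is installed.

```r
library(plyr)

# Hypothetical toy inputs, invented for illustration only
v   <- c(4, 1, 7, NA)
toy <- data.frame(g = factor(c("a", "a", "b")),
                  x = c(1, 3, 5),
                  y = c(2, 4, 6))

# each() bundles several functions into one function that returns a
# named vector; extra arguments such as na.rm are passed to every function
res <- each(min, mean, max)(v, na.rm = TRUE)
res   # named vector: min = 1, mean = 4, max = 7

# numcolwise() turns a function into one that is applied to every
# numeric column of a data frame; the factor column g is skipped
col_means <- numcolwise(mean)(toy)
col_means   # one-row data frame: x = 3, y = 4
```

Wrapping each() inside numcolwise() therefore gives a function that returns all three statistics for every numeric column, which is what ddply() then applies group by group.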
The call below is a composition of the two functions:

> ddply(test, .(Season), numcolwise(each(min, mean, max)))
  Season         x1          x2          x3
1      1 -1.1084957 -0.89586438 -2.07369239
2      1 -0.2590727 -0.00485722 -0.05164301
3      1  1.6924681  0.65256782  1.61998433
4      2 -1.6862610 -0.94842919 -0.43556155
5      2  0.4655741  0.37197098  0.31221804
6      2  2.0972202  2.63259653  1.19390094
7      3 -1.0935199 -0.68866127 -0.04847558
8      3 -0.2049273  0.49534667  0.15200570
9      3  0.3900610  2.42638021  0.43551022

To distinguish the measures in each row, add a column of stat names and then rearrange the order of columns to get something a little more presentable:

> summ <- ddply(test, .(Season), numcolwise(each(min, mean, max)))
> summ$stat <- rep(c('Min', 'Mean', 'Max'), 3)   # add vector of names
> summ <- summ[, c(1, 5, 2:4)]                   # column rearrangement
> summ
  Season stat         x1          x2          x3
1      1  Min -1.1084957 -0.89586438 -2.07369239
2      1 Mean -0.2590727 -0.00485722 -0.05164301
3      1  Max  1.6924681  0.65256782  1.61998433
4      2  Min -1.6862610 -0.94842919 -0.43556155
5      2 Mean  0.4655741  0.37197098  0.31221804
6      2  Max  2.0972202  2.63259653  1.19390094
7      3  Min -1.0935199 -0.68866127 -0.04847558
8      3 Mean -0.2049273  0.49534667  0.15200570
9      3  Max  0.3900610  2.42638021  0.43551022

The two functions give you two different ways to present the summaries; take your pick.

HTH,
Dennis

On Wed, Apr 21, 2010 at 10:16 AM, Ben Stewart <bpste...@uvic.ca> wrote:

> I've got a problem with the sparseby command (reshape library), and I have
> reached the peak of my R knowledge (it isn't really that high).
>
> I have a small data frame of 23 rows and 15 columns, here is a subset, the
> first four columns are factors and the rest are numeric (only one, line54 is
> provided).
>
>    bearID YEAR Season SEX      line54
> 5    1900    8      3   0  16.3923519
> 11   2270    5      1   0 233.7414014
> 12   2271    5      1   0 290.8207652
> 13   2271    5      2   0 244.7820844
> 15   2291    5      1   0   0.0000000
> 16   2291    5      2   0  14.5037795
> 17   2291    6      1   0   0.0000000
> 18   2293    5      2   0 144.7440752
> 19   2293    5      3   0   0.0000000
> 20   2293    6      1   0  16.0592270
> 21   2293    6      2   0  30.1383426
> 28   2298    5      1   0   0.9741067
> 29   2298    5      2   0   9.6641018
> 30   2298    6      2   0   8.6533828
> 31   2309    5      2   0  85.9781303
> 32   2325    6      1   0 110.8892153
> 35   2331    6      1   0  26.7335562
> 44   2390    7      2   0   7.1690620
> 45   2390    8      2   0  44.1109897
> 46   2390    8      3   0 503.9074898
> 47   2390    9      2   0   8.4393660
> 54   2416    7      3   0  48.6910907
> 58   2418    8      2   0   5.7951139
>
> Sparseby works fine when I try to calculate mean
>
> >sparseby(mF[1:5], mF$Season, mean)
>
>   mF$Season bearID YEAR Season SEX    line54
> 1         1     NA   NA     NA   0  84.90228
> 2         2     NA   NA     NA   0  54.90713
> 3         3     NA   NA     NA   0 142.24773
>
> But it goes nuts when looking for max or min
>
> > sparseby(mF[5:6], mF$Season, max)
>   mF$Season structure(c(2169.49621795108, 1885.22677689026, 2492.17544685464
> 1         1
> 2169.496
> 2         2
> 1885.227
> 3         3
> 2492.175
>
> Any ideas? All I want is to calculate create three data.frames, mean, min
> and max.
>
> Thanks,
>
> Ben Stewart
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.