On Tue, 30 Nov 2010, Adaikalavan Ramasamy wrote:

Here is a possible solution using sweep instead of ave:

  df <- data.frame(site = c("a", "a", "a", "b", "b", "b"),
                   gr = c("total", "x1", "x2", "x1", "total","x2"),
                   value1 = c(212, 56, 87, 33, 456, 213),
                   value2 = c(1546, 560, 543, 234, 654, 312) )


lm() and friends provide a simple approach:


cbind( df, percent =
+   df[,-(1:2)] /
+     predict( lm( cbind(value1,value2) ~ gr*site, df),
+              new=data.frame(site=df$site,gr='total' ))
+ )
  site    gr value1 value2 percent.value1 percent.value2
1    a total    212   1546     1.00000000      1.0000000
2    a    x1     56    560     0.26415094      0.3622251
3    a    x2     87    543     0.41037736      0.3512290
4    b    x1     33    234     0.07236842      0.3577982
5    b total    456    654     1.00000000      1.0000000
6    b    x2    213    312     0.46710526      0.4770642


HTH,

Chuck

 sdf <- split(df, df$site)

 out <- lapply( sdf, function(mat){

    small.mat <- mat[ , -c(1,2)]
    totals    <- mat[ which( mat[ , "gr"] == "total" ), -c(1,2) ]
    totals    <- as.numeric(totals)

    percent=sweep( small.mat, MARGIN=2, STATS=totals, FUN="/" )
    colnames(percent) <- paste("percent_", colnames(percent), sep="")
    return( cbind(mat, percent) )
  } )

 do.call("rbind", out)

      site    gr value1 value2 percent_value1 percent_value2
  a.1    a total    212   1546     1.00000000      1.0000000
  a.2    a    x1     56    560     0.26415094      0.3622251
  a.3    a    x2     87    543     0.41037736      0.3512290
  b.4    b    x1     33    234     0.07236842      0.3577982
  b.5    b total    456    654     1.00000000      1.0000000
  b.6    b    x2    213    312     0.46710526      0.4770642

Also I think it might be more efficient to replace your "gr" variable with a binary 0,1 where 1 indicates the total. That way you don't have to generate x1, x2, x3, ....

Regards, Adai


On 30/11/2010 14:42, Patrick Hausmann wrote:
 Hi all,

 I would like to calculate the percent of the total per group for this
 data.frame:

 df<- data.frame(site = c("a", "a", "a", "b", "b", "b"),
                    gr = c("total", "x1", "x2", "x1", "total","x2"),
                    value1 = c(212, 56, 87, 33, 456, 213))
 df

 calcPercent<- function(df) {

       df<- transform(df, pct_val1 = ave(df[, -c(1:2)], df$gr,
                                 FUN = function(x)
                                 x/df[df$gr == "total", "value1"]) )
}

 # This works as intended...
 w<- lapply(split(df, df$site), calcPercent)
 w<- do.call(rbind, w)
 w

 # ... but when I add a new column
 df$value2<- c(1546, 560, 543, 234, 654, 312)

 # the result is not what I want...
 w<- lapply(split(df, df$site), calcPercent)
 w<- do.call(rbind, w)
 w

 Clearly I have to change the function, (particularly "value1") - but
 how... I've also played around with "apply" but without any success.

 Thanks for any help!
 Patrick

 ______________________________________________
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Charles C. Berry                            Dept of Family/Preventive Medicine
cbe...@tajo.ucsd.edu                        UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to