On Tue, 30 Nov 2010, Adaikalavan Ramasamy wrote:
Here is a possible solution using sweep instead of ave:
df <- data.frame(site = c("a", "a", "a", "b", "b", "b"),
                 gr = c("total", "x1", "x2", "x1", "total", "x2"),
                 value1 = c(212, 56, 87, 33, 456, 213),
                 value2 = c(1546, 560, 543, 234, 654, 312) )
lm() and friends provide a simple approach:
cbind( df, percent =
         df[, -(1:2)] /
         predict( lm( cbind(value1, value2) ~ gr*site, df ),
                  newdata = data.frame( site = df$site, gr = 'total' ) )
)
  site    gr value1 value2 percent.value1 percent.value2
1    a total    212   1546     1.00000000      1.0000000
2    a    x1     56    560     0.26415094      0.3622251
3    a    x2     87    543     0.41037736      0.3512290
4    b    x1     33    234     0.07236842      0.3577982
5    b total    456    654     1.00000000      1.0000000
6    b    x2    213    312     0.46710526      0.4770642
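A note on why the lm() call yields the right denominators: the data has exactly one row per site/gr combination, so the gr*site model is saturated and its fitted values reproduce the data. Predicting at gr = 'total' therefore just returns each site's "total" row. A minimal sketch to check this on the same data (the object names fit and totals are only illustrative, not part of the original post):

```r
df <- data.frame(site = c("a", "a", "a", "b", "b", "b"),
                 gr = c("total", "x1", "x2", "x1", "total", "x2"),
                 value1 = c(212, 56, 87, 33, 456, 213),
                 value2 = c(1546, 560, 543, 234, 654, 312))

# Saturated multivariate fit: 6 rows, 6 coefficients per response
fit <- lm(cbind(value1, value2) ~ gr * site, data = df)

# For every row, predict the "total" cell of that row's site
totals <- predict(fit, newdata = data.frame(site = df$site, gr = "total"))

# Each row of 'totals' should match the total row of its site
# (up to floating-point error)
totals
```

If a site had several rows per gr level, the prediction would be the cell mean rather than a single observed value, so this shortcut presumably only fits data shaped like the example.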
HTH,
Chuck
sdf <- split(df, df$site)
out <- lapply( sdf, function(mat){
  small.mat <- mat[ , -c(1,2)]
  totals <- mat[ which( mat[ , "gr"] == "total" ), -c(1,2) ]
  totals <- as.numeric(totals)
  percent <- sweep( small.mat, MARGIN=2, STATS=totals, FUN="/" )
  colnames(percent) <- paste("percent_", colnames(percent), sep="")
  return( cbind(mat, percent) )
} )
do.call("rbind", out)
    site    gr value1 value2 percent_value1 percent_value2
a.1    a total    212   1546     1.00000000      1.0000000
a.2    a    x1     56    560     0.26415094      0.3622251
a.3    a    x2     87    543     0.41037736      0.3512290
b.4    b    x1     33    234     0.07236842      0.3577982
b.5    b total    456    654     1.00000000      1.0000000
b.6    b    x2    213    312     0.46710526      0.4770642
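For readers unfamiliar with sweep(), here is a small sketch of the step it performs inside the function above, using only the value columns of site "a". With MARGIN=2, each column is divided by the matching entry of STATS (the names m and pct are just for illustration):

```r
# Value columns for site "a"; the first row is the "total" row
m <- matrix(c(212, 56, 87,
              1546, 560, 543),
            ncol = 2,
            dimnames = list(NULL, c("value1", "value2")))

totals <- m[1, ]  # one total per column

# Divide column j of m by totals[j]
pct <- sweep(m, MARGIN = 2, STATS = totals, FUN = "/")
pct
```

The first row of pct is all ones, and the remaining rows are the shares of the column totals, matching the rows for site "a" in the output above.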
Also, I think it might be more efficient to replace your "gr" variable with a
binary 0/1 indicator, where 1 marks the total row. That way you don't have to
generate x1, x2, x3, ...
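A rough sketch of what that could look like, assuming the indicator column is called is_total (a made-up name) and that every site has exactly one row with is_total == 1:

```r
df <- data.frame(site = c("a", "a", "a", "b", "b", "b"),
                 is_total = c(1, 0, 0, 0, 1, 0),  # hypothetical replacement for "gr"
                 value1 = c(212, 56, 87, 33, 456, 213),
                 value2 = c(1546, 560, 543, 234, 654, 312))

value_cols <- c("value1", "value2")

# One row of totals per site
tot <- df[df$is_total == 1, c("site", value_cols)]

# For each row of df, the position of its site's total row in 'tot'
idx <- match(df$site, tot$site)

# Divide every value column by the matching site total
pct <- sapply(value_cols, function(v) df[[v]] / tot[[v]][idx])
colnames(pct) <- paste0("percent_", value_cols)

res <- cbind(df, pct)
res
```

This avoids split() and lapply() entirely, and adding a value3 column only requires extending value_cols.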
Regards, Adai
On 30/11/2010 14:42, Patrick Hausmann wrote:
Hi all,
I would like to calculate the percent of the total per group for this
data.frame:
df <- data.frame(site = c("a", "a", "a", "b", "b", "b"),
                 gr = c("total", "x1", "x2", "x1", "total", "x2"),
                 value1 = c(212, 56, 87, 33, 456, 213))
df
calcPercent <- function(df) {
  df <- transform(df, pct_val1 = ave(df[, -c(1:2)], df$gr,
                                     FUN = function(x)
                                       x/df[df$gr == "total", "value1"]) )
}
# This works as intended...
w<- lapply(split(df, df$site), calcPercent)
w<- do.call(rbind, w)
w
# ... but when I add a new column
df$value2<- c(1546, 560, 543, 234, 654, 312)
# the result is not what I want...
w<- lapply(split(df, df$site), calcPercent)
w<- do.call(rbind, w)
w
Clearly I have to change the function (particularly the hard-coded "value1"),
but how? I've also played around with "apply", but without any success.
Thanks for any help!
Patrick
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Charles C. Berry Dept of Family/Preventive Medicine
cbe...@tajo.ucsd.edu UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901