On 30/08/2009 6:08 PM, Noah Silverman wrote:
Hi,
I need a bit of guidance with the sapply function. I've read the help
page, but am still a bit unsure how to use it.
I have a large data frame with about 100 columns and 30,000 rows. One
of the columns is "group" of which there are about 2,000 distinct "groups".
I want to normalize (sum to 1) one of my variables per-group.
Normally, I would just write a huge "for each" loop, but I have read that
this is hugely inefficient in R.

Don't believe what you read; try it. If the for loop takes 100 times
longer than the fastest method, but it still only takes 10 seconds, is
it worth optimizing?

Duncan Murdoch

The old way would be (just an example, syntax might not be perfect):
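data$new_score <- NA_real_
for (g in unique(data$group)) {
  idx <- data$group == g  # rows belonging to this group
  # divide each score by its group's total so the group sums to 1
  data$new_score[idx] <- data$score[idx] / sum(data$score[idx])
}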
How would I simplify this with sapply?
Thanks!
--
Noah
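
One possible way to follow both the question and the "try it" advice is sketched below. It is only an illustration under assumptions: the data frame is synthetic (30,000 rows, about 2,000 groups, as described above), and the column names `group` and `score` and the result columns `loop_score`, `ave_score` and `sapply_score` are made up for the example. It times an explicit loop against base R's ave(), and also shows a variant that uses sapply() to compute one total per group.

```r
# Synthetic stand-in for the data frame described above (all names are assumptions)
set.seed(1)
n_rows <- 30000
n_groups <- 2000
data <- data.frame(
  group = sample(n_groups, n_rows, replace = TRUE),
  score = runif(n_rows)
)

# Explicit loop over the distinct groups, timed with system.time()
loop_time <- system.time({
  data$loop_score <- NA_real_
  for (g in unique(data$group)) {
    idx <- data$group == g
    data$loop_score[idx] <- data$score[idx] / sum(data$score[idx])
  }
})

# ave() applies FUN within each group and returns the results in row order
ave_time <- system.time(
  data$ave_score <- ave(data$score, data$group, FUN = function(x) x / sum(x))
)

# sapply() variant: one total per group, then look up each row's group total
group_totals <- sapply(split(data$score, data$group), sum)
data$sapply_score <- data$score / unname(group_totals[as.character(data$group)])

# Compare elapsed times on your own machine before deciding what to optimize
print(loop_time["elapsed"])
print(ave_time["elapsed"])
```

All three columns should agree up to floating-point error, which all.equal() can confirm. Whether the loop is slow enough to matter depends on the machine and the data, so the timings are worth checking rather than assuming.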
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.