On 30/08/2009 6:08 PM, Noah Silverman wrote:
Hi,

I need a bit of guidance with the sapply function. I've read the help page, but I am still a bit unsure how to use it.

I have a large data frame with about 100 columns and 30,000 rows. One of the columns is "group" of which there are about 2,000 distinct "groups".

I want to normalize (sum to 1) one of my variables per-group.

Normally, I would just write a huge "for each" loop, but I have read that this is hugely inefficient in R.

Don't believe what you read, try it. If the for loop takes 100 times longer than the fastest method, but it still only takes 10 seconds, is it worth optimizing?

Duncan Murdoch
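
One way to "try it" is to time both approaches directly. The following is only a minimal sketch: the data frame is simulated to roughly match the sizes in the question (30,000 rows, about 2,000 groups), the column names `group` and `score` are assumed from the example loop, and `ave()` is used as one possible vectorised alternative rather than the `sapply` call the question asks about. Timings will vary by machine.

```r
# Simulated data (hypothetical): 30,000 rows, up to 2,000 groups
set.seed(1)
n <- 30000
data <- data.frame(group = sample(2000, n, replace = TRUE),
                   score = runif(n))

# Explicit loop over the distinct groups
loop_time <- system.time({
  new_loop <- numeric(n)
  for (g in unique(data$group)) {
    idx <- data$group == g
    # divide each score by its group total
    new_loop[idx] <- data$score[idx] / sum(data$score[idx])
  }
})

# Vectorised alternative: ave() applies FUN within each group
ave_time <- system.time({
  new_ave <- ave(data$score, data$group, FUN = function(x) x / sum(x))
})

print(loop_time)
print(ave_time)

# Both approaches should give the same normalized scores
stopifnot(isTRUE(all.equal(new_loop, new_ave)))
```

If the loop already finishes in a time you can live with, the faster version may not be worth the extra effort.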


The old way would be (just an example, syntax might not be perfect):

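data$new_score <- NA_real_
for (group in unique(data$group)) {
     in_group <- data$group == group
     # divide each score by its group total so the group sums to 1
     data$new_score[in_group] <- data$score[in_group] / sum(data$score[in_group])
}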

How would I simplify this with sapply?

Thanks!

--
Noah

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
