On Thu, 2008-07-24 at 02:25 +0530, Arun Kumar Saha wrote: > There is a R function to calculate weighted mean : weighted.mean() under > stats package. Is there any direct R function for calculating weighted > variance as well?
Here are two ways; weighted.var() is via the usual formula and weighted.var2() uses a running sums approach. The formulae for which are both on the weighted mean entry page on wikipedia for example. The removal of NA is as per weighted.mean, but I have not included any of the sanity checks that that functions contains. weighted.var <- function(x, w, na.rm = FALSE) { if (na.rm) { w <- w[i <- !is.na(x)] x <- x[i] } sum.w <- sum(w) sum.w2 <- sum(w^2) mean.w <- sum(x * w) / sum(w) (sum.w / (sum.w^2 - sum.w2)) * sum(w * (x - mean.w)^2, na.rm = na.rm) } weighted.var2 <- function(x, w, na.rm = FALSE) { if (na.rm) { w <- w[i <- !is.na(x)] x <- x[i] } sum.w <- sum(w) (sum(w*x^2) * sum.w - sum(w*x)^2) / (sum.w^2 - sum(w^2)) } ## from example section in ?weighted.mean ## GPA from Siegel 1994 wt <- c(5, 5, 4, 1)/15 x <- c(3.7,3.3,3.5,2.8) weighted.mean(x,wt) weighted.var(x, wt) weighted.var2(x, wt) And some timings: > system.time(replicate(100000, weighted.var(x, wt))) user system elapsed 2.679 0.014 2.820 > system.time(replicate(100000, weighted.var2(x, wt))) user system elapsed 2.224 0.010 2.315 HTH G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.