On Jul 19, 2011, at 4:18 PM, Peter Lomas wrote:

Hi Richard,

As others have said, try to use the "apply" functions rather than loops. There is also an apply function for lists, see ?lapply. This is much more
efficient.

Actually the "apply" functions are not "more efficient" in the usual sense of execution time, and sometimes they are rather inefficient. Prior discussions of this topic in the archives should be easy to find. The economy is in expression, and the advantage is in code creation and maintenance.

Doubters of this proposition should consider these results:

library(rbenchmark) # help page has a more compact version of these tests

means.rep = function(n, m) {res1 <- vector(length=100, mode="numeric")
              res1 <- replicate(n, mean( rexp(m)))}
means.colMn = function(n, m) {res2 <- vector(length=100, mode="numeric")
               res2 <- colMeans(matrix( rexp(n*m), c(m, n)))}
means.tapply = function(n,m) {res3 <- vector(length=100, mode="numeric")
                 res3 <- tapply( rexp(n*m), rep(1:n, each = m), mean)}
means.apply =function(n,m) { res4 <- vector(length=100, mode="numeric")
                res4 <-apply( matrix(rexp(m*n),n,m), 1, mean) }
means.forloop =function(n, m) {res5 <- vector(length=100, mode="numeric")
                 for (i in n) {res5[i] <-mean(rexp(m))} }
benchmark(
   repl = means.rep(100, 100),
   tappl = means.tapply(100, 100),
   appl = means.apply(100, 100),
   pat = means.colMn(100, 100),
   forloop =  means.forloop(100,100),
   replications=100, columns=c("test","replications","relative","elapsed"),
   order='elapsed' )

###
Results:
     test replications relative elapsed
5 forloop          100     1.00   0.004
4     pat          100    20.25   0.081
1    repl          100    77.00   0.308
3    appl          100    89.75   0.359
2   tappl          100   264.50   1.058

I admit that I was rather surprised to see the for-loop beating colMeans by such a wide margin, and this is making me wonder if I reversed some index or coded the for-loop test wrong. So I would appreciate some auditing and improvement of this test. (But I don't see how I could have reversed the order, since n and m are both 100. And I tried adding assignments to see if only promises were being made with no calculations. The relative efficiencies stay the same.)
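
A note on the audit requested above, offered as a likely explanation rather than a confirmed one: in R, `for (i in n)` iterates over the elements of `n`, so when `n` is the single number 100 the body runs once with `i = 100`. If that is what happened here, `means.forloop` computed one mean per call instead of 100, which would account for its speed. The sketch below counts the iterations of both loop forms and shows a loop that does the full work; the names `count_iterations` and `means.forloop2` are hypothetical and not part of the original test.

```r
# Count how many times each loop form executes its body.
count_iterations <- function(n) {
  single <- 0
  for (i in n) single <- single + 1        # iterates over the elements of n
  full <- 0
  for (i in seq_len(n)) full <- full + 1   # iterates over 1, 2, ..., n
  c(single = single, full = full)
}

# Hypothetical corrected loop: fills all n slots and returns the vector.
means.forloop2 <- function(n, m) {
  res <- numeric(n)
  for (i in seq_len(n)) res[i] <- mean(rexp(m))
  res
}

counts <- count_iterations(100)
print(counts)
str(means.forloop2(100, 100))
```

Rerunning the benchmark with a loop written this way would give a fairer comparison; no timings are claimed here.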

--
David.


 I also like writing my own functions.  For example:

f <- function(x) {
  x^2
}

Which can then be used by:
f(2)
[1] 4

This is very useful if you're getting into maximum likelihood programming,
or want to use the "optim" function (for multivariate functions) or
"optimize" (for univariate functions).
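
To illustrate the maximum likelihood use case, here is a minimal sketch that estimates the rate of an exponential sample with `optimize`. The function name `negloglik`, the simulated data, and the search interval are assumptions chosen for the example, not something from this thread.

```r
set.seed(1)
x <- rexp(500, rate = 2)   # simulated data; the rate of 2 is an assumption

# Negative log-likelihood of an exponential sample as a function of the rate
negloglik <- function(rate, data) {
  -sum(dexp(data, rate = rate, log = TRUE))
}

# optimize() minimises a function of one variable over an interval;
# extra arguments (here `data`) are passed on to the function.
fit <- optimize(negloglik, interval = c(0.01, 10), data = x)

fit$minimum    # numerical estimate of the rate
1 / mean(x)    # closed-form maximum likelihood estimate, for comparison
```

The two values should agree closely, up to the tolerance of the numerical search. For a likelihood with several parameters, `optim` plays the same role and takes a vector of starting values.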

Lastly, check out the R reference card.
http://cran.r-project.org/doc/contrib/Short-refcard.pdf

Regards,
Peter

On Tue, Jul 19, 2011 at 12:43, RichardLang <l...@zedat.fu-berlin.de> wrote:

Hi everyone!

I'm trying to teach myself R in order to do some data analysis. I'm a
mathematics student and (only) familiar with matlab and latex. I'm working
through the "official" introduction to R at the moment, while simultaneously
solving some exercises I found on the web. Before I post my (probably
stupid) question, I'd like to ask you for some general advice. How do you
work with R? Is it like in matlab, where you write your functions with a lot
of loops etc. in a text file and then run it? Or do you just prepare your
data and then use the functions provided by R (plot, mean etc.) to get some
analysis? I'd be very thankful for some of your thoughts about
"approaches".

Now the question: I'm trying to build a vector with n entries, each
consisting of the mean of m random numbers (exponentially distributed, for
example). My approach was to construct an n x m random matrix and then to
somehow take the mean of each row. But the mean function has no parameter
to do this, so the intended approach in R is probably different.
Any ideas? =)

Richard
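
A minimal sketch of the matrix approach described in the question, assuming small example values for `n` and `m`. Base R provides `rowMeans` for this; `apply` over rows gives the same answer.

```r
n <- 5     # number of means wanted (example value)
m <- 100   # random numbers per mean (example value)

# n x m matrix of exponential draws: each row holds one sample of size m
x <- matrix(rexp(n * m), nrow = n, ncol = m)

means_rows  <- rowMeans(x)         # dedicated function for row means
means_apply <- apply(x, 1, mean)   # same computation via apply over rows

all.equal(means_rows, means_apply)
```

Both calls return a numeric vector of length `n`.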



David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
