On 02.07.2011 20:42, ivo welch wrote:
hi uwe--I did not know what snow was. from my 1 minute reading, it
seems like a much more involved setup that is much more flexible after
the setup cost has been incurred (specifically, allowing use of many
machines).
the attractiveness of the doMC/foreach framework is its simplicity of
installation and use.
but if I understand what you are telling me, you are using a different
parallelization framework, and it shows that my example is completed a
lot faster using this different parallelization framework. correct?
if so, the problem is my use of the doMC framework, not the inherent
cost of dealing with multiple processes. is this interpretation
correct?
Indeed.
Uwe
regards,
/iaw
----
Ivo Welch (ivo.we...@gmail.com)
http://www.ivo-welch.info/
2011/7/2 Uwe Ligges<lig...@statistik.tu-dortmund.de>:
On 02.07.2011 20:04, ivo welch wrote:
thank you, uwe. this is a little disappointing. parallel processing
for embarrassingly simple parallel operations--those needing no
communication---should be feasible if the thread is not always created
and released, but held. is there light-weight parallel processing
that could facilitate this?
Hmmm, now that you asked I checked it myself using snow:
On a 2-core AMD64 machine that is a few years old, with R-2.13.0 and snow
(using SOCK clusters, i.e. slow communication), I get:
system.time(parSapply(cl, 1:A, function(i) uniroot(minfn, c(1e-20,9e20), i)))
user system elapsed
3.10 0.19 51.43
while on a single core without parallelization framework:
system.time(sapply(1:A, function(i) uniroot(minfn, c(1e-20,9e20), i)))
user system elapsed
93.74 0.09 94.24
Hence (although my prior assumption was that the overhead would also be large
for frameworks other than foreach) it scales perfectly well with snow.
Perhaps you have to use foreach in a different way?
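A minimal sketch of the setup the timing above presumably relied on; the
thread does not show how cl was created. It assumes the parallel package
(bundled with R since 2.14.0, providing snow-style socket clusters) in place
of snow itself, a two-worker local cluster, and a much smaller A so that it
finishes quickly. The names solve_one and n_workers are illustrative and do
not come from the thread.

```r
library(parallel)  # assumption: stands in for snow (makeCluster, parSapply, ...)

A <- 200L                      # the thread uses 100000; kept small here
set.seed(42)                   # seed added only to make the sketch reproducible
randvalues <- abs(rnorm(A))
minfn <- function(x, i) log(abs(x)) + x^3 + i / A + randvalues[i]

# Hypothetical helper: root of minfn(., i) on the interval used in the thread.
solve_one <- function(i) uniroot(function(x) minfn(x, i), c(1e-20, 9e20))$root

n_workers <- 2L
cl <- makeCluster(n_workers)   # socket (PSOCK) cluster on the local machine

# Socket workers are fresh R sessions, so they need the objects solve_one uses.
clusterExport(cl, c("minfn", "randvalues", "A"), envir = environment())

roots <- parSapply(cl, seq_len(A), solve_one)
stopCluster(cl)
```

The cluster is created once and reused for all A tasks, which is the
"held, not re-created" behaviour asked about earlier in the thread.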
Best,
Uwe Ligges
regards,
/iaw
2011/7/2 Uwe Ligges<lig...@statistik.tu-dortmund.de>:
On 02.07.2011 19:32, ivo welch wrote:
dear R experts---
I am experimenting with multicore processing, so far with pretty
disappointing results. Here is my simple example:
A<- 100000
randvalues<- abs(rnorm(A))
minfn<- function( x, i ) { log(abs(x))+x^3+i/A+randvalues[i] } ## an arbitrary function
ARGV<- commandArgs(trailingOnly=TRUE)
## each "else" stays on the line of the closing brace so the script parses at top level
if (ARGV[1] == "do-onecore") {
  library(foreach)
  discard<- foreach(i = 1:A) %do% uniroot( minfn, c(1e-20,9e20), i )
} else if (ARGV[1] == "do-multicore") {
  library(doMC)
  registerDoMC()
  cat("You have", getDoParWorkers(), "cores\n")
  discard<- foreach(i = 1:A) %dopar% uniroot( minfn, c(1e-20,9e20), i )
} else if (ARGV[1] == "plain") {
  for (i in 1:A) discard<- uniroot( minfn, c(1e-20,9e20), i )
} else {
  cat("sorry, but argument", ARGV[1], "is not plain|do-onecore|do-multicore\n")
}
on my Mac Pro 3,1 (2 quad-cores), R 2.12.0, which reports 8 cores,
"plain" takes about 68 seconds (real and user, using the unix timing
function).
"do-onecore" takes about 300 seconds.
"do-multicore" takes about 210 seconds real, (300 seconds user).
this seems pretty disappointing. the cores are not used for the most
part, either. feedback appreciated.
Feedback is that a single computation within your foreach loop is so quick
that the overhead of communicating data and results between processes costs
more time than the actual evaluation, hence you are faster with a single
process.
What you should do is:
write code that does, e.g., 10000 iterations within 10 other iterations and
just do a foreach loop around the outer 10. Then you will probably be much
faster (without testing). But this is essentially the example I am using for
teaching to show when not to do parallel processing.....
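The chunking advice above could look roughly like the following. This is a
sketch, not the teaching example itself: it swaps foreach/doMC for mclapply
and splitIndices from the base parallel package so that it runs without
extra packages, uses a small A, and the names solve_one and n_workers are
made up here. mclapply forks, so the sketch falls back to one worker on
Windows.

```r
library(parallel)  # assumption: used here instead of foreach/doMC

A <- 1000L                     # the thread uses 100000; kept small here
set.seed(1)                    # seed added only to make the sketch reproducible
randvalues <- abs(rnorm(A))
minfn <- function(x, i) log(abs(x)) + x^3 + i / A + randvalues[i]

# Hypothetical helper: root of minfn(., i) on the interval used in the thread.
solve_one <- function(i) uniroot(function(x) minfn(x, i), c(1e-20, 9e20))$root

n_workers <- if (.Platform$OS.type == "windows") 1L else 2L

# One contiguous block of indices per worker instead of one task per index.
chunks <- splitIndices(A, n_workers)

# Each worker loops serially over its whole chunk, so data and results
# cross the process boundary once per chunk rather than once per iteration.
roots_by_chunk <- mclapply(
  chunks,
  function(idx) vapply(idx, solve_one, numeric(1)),
  mc.cores = n_workers
)
roots <- unlist(roots_by_chunk)   # chunks are in index order, so roots is too
```

With doMC registered, the same idea would be a foreach over chunks rather
than over 1:A, e.g. foreach(idx = chunks) %dopar% vapply(idx, solve_one,
numeric(1)).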
Best,
Uwe Ligges
/iaw
----
Ivo Welch (ivo.we...@gmail.com)
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.