pnmath currently uses up to 8 threads (i.e. 1, 2, 4, or 8).
getNumPnmathThreads() should tell you the maximum number used on your
system, which should be 8 if the number of processors is being
identified correctly. With the size of m this calculation should be
using 8 threads, but the exp calculation is fairly fast, so the
overhead is noticeable. On a Linux box with 4 dual-core AMD processors
I get:

    > m <- matrix(0, 10000, 1000)
    > mean(replicate(10, system.time(exp(m), gcFirst=TRUE))["elapsed",])
    [1] 0.3859
    > library(pnmath)
    > mean(replicate(10, system.time(exp(m), gcFirst=TRUE))["elapsed",])
    [1] 0.0775

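The timing idiom above can be wrapped in a small function. This is only a sketch: mean_elapsed is a hypothetical helper name, not something pnmath or base R provides, and the matrix here is deliberately smaller than the one above so that it runs quickly.

```r
# Hypothetical helper (not part of pnmath or base R): mean elapsed time of
# an expression over `n` runs, using the same system.time() idiom as above.
mean_elapsed <- function(expr, n = 10) {
  e   <- substitute(expr)   # capture the unevaluated expression
  env <- parent.frame()     # evaluate it where the caller's data lives
  times <- replicate(n, system.time(eval(e, env), gcFirst = TRUE)[["elapsed"]])
  mean(times)
}

m <- matrix(0, 1000, 100)   # smaller than the 10000 x 1000 matrix above
t_exp <- mean_elapsed(exp(m))
print(t_exp)
```

Calling it once before and once after library(pnmath) should give the same kind of comparison as the two means shown above.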
A similar example using qbeta, a slower function, gives

    > p <- matrix(0.5,1000,1000)
    > setNumPnmathThreads(1)
    [1] 1
    > mean(replicate(10, system.time(qbeta(p,2,3), gcFirst=TRUE))["elapsed",])
    [1] 7.334
    > setNumPnmathThreads(8)
    [1] 8
    > mean(replicate(10, system.time(qbeta(p,2,3), gcFirst=TRUE))["elapsed",])
    [1] 0.9576

On an 8-core Intel/OS X box the improvement for exp is much less, but
is similar for qbeta.
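
To put the Linux numbers above on a common scale, here is the speedup and per-thread efficiency computed from the reported timings. It assumes all 8 threads were in use in both pnmath runs; the data frame and its column names are just illustrative.

```r
# Timings in seconds, copied from the runs reported above.
timings <- data.frame(
  fun      = c("exp", "qbeta"),
  serial   = c(0.3859, 7.334),   # exp: pnmath not loaded; qbeta: 1 thread
  parallel = c(0.0775, 0.9576),  # pnmath, assumed to be using 8 threads
  threads  = c(8, 8)
)

# Speedup is serial time over parallel time; efficiency is speedup per thread.
timings$speedup    <- timings$serial / timings$parallel
timings$efficiency <- timings$speedup / timings$threads
print(timings, digits = 3)
```

That works out to roughly a 5-fold speedup for exp and about 7.7-fold for qbeta, which is consistent with the threading overhead mattering more for the cheaper function.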
luke
On Thu, 10 Jul 2008, Martin Morgan wrote:
"Juan Pablo Romero Méndez" <[EMAIL PROTECTED]> writes:
Just out of curiosity, what system do you have?
These are the results on my machine:

    > system.time(exp(m), gcFirst=TRUE)
       user  system elapsed
       0.52    0.04    0.56
    > library(pnmath)
    > system.time(exp(m), gcFirst=TRUE)
       user  system elapsed
      0.660   0.016   0.175

From cat /proc/cpuinfo, the original results were from a 32-bit
dual-core system:
model name : Intel(R) Core(TM)2 CPU T7600 @ 2.33GHz
Here's a second set of results on a 64-bit system with 16 cores (4 cores
on each of 4 physical processors, I think):

    > mean(replicate(10, system.time(exp(m), gcFirst=TRUE))["elapsed",])
    [1] 0.165
    > mean(replicate(10, system.time(exp(m), gcFirst=TRUE))["elapsed",])
    [1] 0.0397

model name : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz
One thing is that for me in single-thread mode the faster processor
actually evaluates slower. This could be because of 64-bit issues,
other hardware design aspects, the way I've compiled R on the two
platforms, or other system activities on the larger machine.
A second thing is that it appears that the larger machine only
accelerates 4-fold, rather than a naive 16-fold; I think this is from
decisions in the pnmath code about the number of processors to use,
although I'm not sure.
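
As a rough check of the 4-fold figure, here is the arithmetic from the two timings above, assuming the first was taken without pnmath and the second with it loaded. The 8-thread cap is the one stated at the top of this message; the variable names are illustrative.

```r
serial   <- 0.165    # seconds, assumed to be the run without pnmath
parallel <- 0.0397   # seconds, assumed to be the run with pnmath loaded

speedup <- serial / parallel
eff_16  <- speedup / 16   # efficiency if all 16 cores were in use
eff_8   <- speedup / 8    # efficiency if pnmath caps at 8 threads

print(round(c(speedup = speedup, eff_16 = eff_16, eff_8 = eff_8), 2))
```

The speedup comes out a little above 4, so a cap of 8 threads would account for part of the gap to 16-fold but not all of it.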
A final thing is that running intensive tests on my laptop generates
enough extra heat to increase the fan speed and laptop temperature. I
sort of wonder whether consumer laptops / desktops are engineered for
sustained use of their multiple cores (although I guess the gaming
community makes heavy use of multiple cores).
Martin
Juan Pablo

    > system.time(exp(m), gcFirst=TRUE)
       user  system elapsed
      0.108   0.000   0.106
    > library(pnmath)
    > system.time(exp(m), gcFirst=TRUE)
       user  system elapsed
      0.096   0.004   0.052

(elapsed time about 2x faster). Both BLAS and pnmath make much better
use of resources, since they do not require multiple R instances.
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa Phone: 319-335-3386
Department of Statistics and Fax: 319-335-3017
Actuarial Science
241 Schaeffer Hall email: [EMAIL PROTECTED]
Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu