On Aug 5, 6:09 am, Rich Hickey <richhic...@gmail.com> wrote:
> On Wed, Aug 5, 2009 at 8:29 AM, Johann Kraus<johann.kr...@gmail.com> wrote:
>
> >> Could it be that your CPU has a single floating-point unit shared by 4
> >> cores on a single die, and thus only 2 floating-point units total for
> >> all 8 of your cores?  If so, then that fact, plus the fact that each
> >> core has its own separate ALU for integer operations, would seem to
> >> explain the results you are seeing.
>
> > Exactly, this would explain the behaviour. But unfortunately it is not
> > the case. I implemented a small example using Java (Java Threads) and
> > C (PThreads) and both times I get a linear speedup. See the attached
> > code below. The cores only share 12 MB cache, but this should be
> > enough memory for my micro-benchmark. Seeing the linear speedup in
> > Java and C, I would negate a hardware limitation.
>
> > _
> > Johann
>
>
> I looked briefly at your problem and don't see anything right off the
> bat. Do you have a profiler and could you try that out? I'm
> interested.
> Rich

I ran these tests on my iMac with a 2.16 GHz Intel Core 2 Duo (2 cores)
using the latest Clojure and clojure-contrib from git as of some time on
Aug 4, 2009.  The Java implementation is from Apple, version 1.6.0_13.

----------------------------------------------------------------------
For int, there are 64 "jobs" run, each of which consists of doing
(inc 0) 1,000,000,000 times.  See pmap-batch.sh and pmap-testing.clj
for details.

http://github.com/jafingerhut/clojure-benchmarks/blob/398688c71525964ba4c2d55d0e487b7618efdc8b/misc/pmap-batch.sh

http://github.com/jafingerhut/clojure-benchmarks/blob/398688c71525964ba4c2d55d0e487b7618efdc8b/misc/pmap-testing.clj

Yes, yes, I know.  I should really use a library for command line
argument parsing to avoid so much repetitive code.  I may do that some
day.


Results for int 1 thread - jobs run sequentially

"Elapsed time: 267547.789 msecs"
real       269.22
user       268.61
sys          1.79

int 2 threads - jobs run in 2 threads using modified-pmap, which
limits the number of futures (and so the number of threads running
jobs) to at most 2 at a time.

"Elapsed time: 177428.626 msecs"
real       179.14
user       330.30
sys         15.46

Comment: Elapsed time with 2 threads is about 2/3 of elapsed time with
1 thread.  Not as good as the 1/2 we'd like with a 2 core machine,
but better than not being faster at all.
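
For comparison outside Clojure, the "at most 2 jobs at a time" idea can
be sketched in plain Java with a fixed-size thread pool.  This is not
the modified-pmap code from pmap-testing.clj; the class and method names
(BoundedJobs, incJob, runJobs) are made up for illustration, and the job
is a primitive increment loop that the JIT may optimize heavily, so
treat any timings as rough.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: run a batch of identical jobs with a bounded
// number of worker threads, similar in spirit to a bounded pmap.
final class BoundedJobs {
    // One "job": increment a primitive counter `iterations` times.
    static long incJob(long iterations) {
        long x = 0;
        for (long i = 0; i < iterations; i++) {
            x = x + 1;
        }
        return x;
    }

    // Run `jobs` copies of incJob with at most `threads` running at once.
    static List<Long> runJobs(int jobs, int threads, long iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int j = 0; j < jobs; j++) {
                tasks.add(() -> incJob(iterations));
            }
            List<Long> results = new ArrayList<>();
            // invokeAll blocks until every task has completed.
            for (Future<Long> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (int threads = 1; threads <= 2; threads++) {
            long start = System.nanoTime();
            runJobs(8, threads, 10_000_000L);
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println(threads + " thread(s): " + ms + " msecs");
        }
    }
}
```

The pool size plays the role of the thread limit: with 1 thread the
jobs run one after another, with 2 they run two at a time.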

----------------------------------------------------------------------
For double, there are 16 "jobs" run, each of which consists of doing
(inc 0.1) 1,000,000,000 times.

double 1 thread

"Elapsed time: 258659.424 msecs"
real       263.28
user       247.29
sys         12.17

double 2 threads

"Elapsed time: 229382.68 msecs"
Dumping CPU usage by sampling running threads ... done.
real       231.05
user       380.79
sys         11.49

Comment: Elapsed time with 2 threads is about 7/8 of elapsed time with
1 thread.  Hardly any improvement at all for something that should be
"embarrassingly parallel", and the user time reported by Mac OS X's
/usr/bin/time increased by a factor of about 1.5.  That seems like way
too much overhead for thread coordination.
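
The fractions quoted in the comments above can be checked with a few
lines of arithmetic.  The sketch below only divides the elapsed and
user times reported in this message; the class and method names
(SpeedupMath, speedup, efficiency, cpuRatio) are illustrative.

```java
// Hypothetical helper: derive speedup figures from the timings above.
final class SpeedupMath {
    // Speedup = 1-thread elapsed time / N-thread elapsed time.
    static double speedup(double t1, double tN) {
        return t1 / tN;
    }

    // Parallel efficiency = speedup / thread count (1.0 would be ideal).
    static double efficiency(double t1, double tN, int threads) {
        return speedup(t1, tN) / threads;
    }

    // Ratio of user CPU seconds with N threads to user CPU seconds with 1.
    static double cpuRatio(double user1, double userN) {
        return userN / user1;
    }

    public static void main(String[] args) {
        // Elapsed msecs and user seconds copied from the results above.
        System.out.printf("int:    speedup %.2f, efficiency %.2f, user ratio %.2f%n",
                speedup(267547.789, 177428.626),
                efficiency(267547.789, 177428.626, 2),
                cpuRatio(268.61, 330.30));
        System.out.printf("double: speedup %.2f, efficiency %.2f, user ratio %.2f%n",
                speedup(258659.424, 229382.68),
                efficiency(258659.424, 229382.68, 2),
                cpuRatio(247.29, 380.79));
    }
}
```

That works out to a speedup of roughly 1.5 for int and roughly 1.13
for double, against an ideal of 2 on this machine.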


Here are hprof output files for the "double 1 thread" and "double 2
threads" tests:

http://github.com/jafingerhut/clojure-benchmarks/blob/51d499c2679c2d5ed42a65adcef6d9e8bc3e1aad/misc/pmap-batch-output/pmap-double-1-hprof.txt

http://github.com/jafingerhut/clojure-benchmarks/blob/51d499c2679c2d5ed42a65adcef6d9e8bc3e1aad/misc/pmap-batch-output/pmap-double-2-hprof.txt

In both cases, over 98% of the time is spent in
java.lang.Double.valueOf(double d).  See the files for the full stack
backtraces if you are curious.

I don't see any reason why that method should have any kind of
contention or worse performance when running on 2 cores vs. 1 core,
but I don't know the guts of how it is implemented.  At least in
OpenJDK all it does is "return new Double(d)", where d is the double
arg to valueOf().  Is there any reason why "new" might exhibit
contention between parallel threads?
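
One way to probe that question without Clojure in the picture is a
small Java test that does nothing but box doubles in 1 or 2 threads.
The sketch below is an assumption-laden micro-benchmark, not the code
that produced the numbers above: BoxingBench, boxLoop and timeMillis
are made-up names, and the JIT may remove some of the allocations
through escape analysis, so the results are only a hint.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical micro-benchmark: time Double.valueOf in N threads.
final class BoxingBench {
    // Box x + 1.0 `iterations` times and sum the boxed values so the
    // result depends on every boxed object.
    static double boxLoop(double x, long iterations) {
        double sum = 0.0;
        for (long i = 0; i < iterations; i++) {
            Double boxed = Double.valueOf(x + 1.0);
            sum += boxed; // auto-unboxes
        }
        return sum;
    }

    // Start `threads` threads that each run boxLoop once; return the
    // wall-clock time in milliseconds until all of them have finished.
    static long timeMillis(int threads, long iterations) {
        List<Thread> workers = new ArrayList<>();
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            Thread th = new Thread(() -> boxLoop(0.1, iterations));
            workers.add(th);
            th.start();
        }
        for (Thread th : workers) {
            try {
                th.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long iterations = 5_000_000L;
        // Each thread does the same amount of work, so with no
        // contention both lines should show about the same time.
        System.out.println("1 thread:  " + timeMillis(1, iterations) + " msecs");
        System.out.println("2 threads: " + timeMillis(2, iterations) + " msecs");
    }
}
```

If the 2-thread time comes out well above the 1-thread time on a
2 core machine, that would point at something shared between the
threads (allocation, garbage collection, or memory bandwidth) rather
than at Clojure itself.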

Andy

