Keeping the discussion here would make sense, esp. in light of meetup.com's 
horrible "discussion board".

I don't have a lot to offer on the JVM/Clojure-specific problem beyond what I 
wrote in that meetup thread, but Lee's challenge(s) were too hard to resist:

> "Would your conclusion be something like 'Intensive list processing can't 
> currently be done in Java (or Clojure/JVM) in a way that takes reasonable 
> advantage of multiple cores.'?"


> "I see that "rewrite without lists" might be the only way out, but that'd be 
> a bummer. Clojure is a multicore Lisp; if you can't use it for multicore list 
> processing then... sigh."

The nature of the `burn` program is such that I'm skeptical of the ability of 
any garbage-collected runtime (lispy or not) to scale its operation across 
multiple threads.  It generates a huge amount of garbage that is held for a 
very long time (in GC terms), thus producing a great deal of contention on the 
heap between the active job threads and the GC implementation constantly having 
to clean up after them, compact the results, etc.  The workload is not 
CPU-bound, so I don't see it being parallelized effectively.

I'd be very interested in seeing an implementation for a runtime that proves me 
wrong, i.e. can attain significant performance benefits by running the e.g. 8 
`burn` "jobs" across multiple threads.

In an attempt to do just that (prove myself wrong), I thought I'd golf the 
`burn` program (again; onlookers can go look at the Java implementation I 
knocked out a few days ago here: https://gist.github.com/4682554) on Racket, 
presumably as well-suited to "multicore list processing" as any other.  The 
results:

        https://gist.github.com/4682453

Please forgive my perhaps pitiful Racket-fu.  I'm sure there's threadpool and 
CAS libraries available, but it was entertaining to hack away with the 
threading and semaphore primitives.  If I've goofed something up badly (it's 
been some time since I've schemed), please speak up.

The Racket and Java implementations are effectively equivalent:

`java -server -Xmx2g cemerick.burn $THREAD_COUNT`  (oracle java 1.7.0_09)
1 thread: 61s
2 threads: 126s

`java -server -Xmx2g cemerick.burn $THREAD_COUNT`  (oracle java 1.7.0_09)
1 thread: 47s
2 threads: 92s

`time ./racket -t ~/burn.scm -m $THREAD_COUNT`  (racket 5.3.1)
1 thread: 45s
2 threads: 126s

The above was run on OS X 10.7.5 with 4GB of physical memory.  If someone knows 
of ways to get better perf on racket, let me know.  I tried running from the 
results of `raco make` and such, but saw no improvements; I guess it is 
strictly producing bytecode that is later JIT'ed, and not doing any kind of 
additional AOT optimizations...

It'd be interesting to see timings from different operating systems, and very 
interesting to see timings from running burn.scm on other schemes, or a Common 
Lisp implementation and timings.

Cheers,

- Chas

On Jan 30, 2013, at 9:20 PM, Lee Spector wrote:

> 
> FYI we had a bit of a discussion about this at a meetup in Amherst MA 
> yesterday, and while I'm not sufficiently on top of the JVM or system issues 
> to have briefed everyone on all of the details there has been a little of 
> followup since the discussion, including results of some different 
> experiments by Chas Emerick, at: 
> http://www.meetup.com/Functional-Programming-Connoisseurs/messages/boards/thread/30946382
> 
> -Lee
> 
> On Jan 30, 2013, at 8:39 PM, Marshall Bockrath-Vandegrift wrote:
>> 
>> Apologies for my very-slow reply here.  I keep thinking that I’ll have
>> more time to look into this issue, and keep having other things
>> requiring my attention.  And on top of that, I’ve temporarily lost the
>> many-way AMD system I was using as a test-bed.
>> 
>> I very much want to see if I can get my hands on an Intel system to
>> compare to.  My AMD system is in theory 32-way – two physical CPUs, each
>> with 16 cores.  However, Linux reports (via /proc/cpuinfo) the cores in
>> groups of 8 (“cpu cores : 8” etc).  And something very strange happens
>> when extending parallelism beyond 8-way...  I ran several experiments
>> using a version of your whole-application benchmark I modified to
>> control the level of parallelism.  At parallelism 9+, the real time it
>> takes to complete the benchmark hardly budges, but the user/CPU time
>> increases linearly with the level of parallelism!  As far as I can tell,
>> multi-processor AMD *is* a NUMA architecture, which might potentially
>> explain things.  But enabling the JVM NUMA options doesn’t seem to
>> affect the benchmark.
>> 
>> I think next steps are two-fold: (1) examine parallelism vs real & CPU
>> time on an Intel system, and (2) attempt to reproduce the observed
>> behavior in pure Java.  I’m keeping my fingers crossed that I’ll have
>> some time to look at this more soon, but I’m honestly not very hopeful.
>> 
>> In the mean time, I hope you’ve managed to exploit multi-process
>> parallelism to run more efficiently?
>> 
>> -Marshall
> 
> -- 
> -- 
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with your 
> first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> --- 
> You received this message because you are subscribed to the Google Groups 
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
> 
> 

-- 
-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Reply via email to