I have some new data that suggests there are issues inherent to pmap, and possibly to other forms of parallelism in Clojure, on older Intel machines with four or more cores.
I added a no-op loop to the benchmark. It looks like this:

    (defn noops [n]
      (when (> n 0)
        (recur (- n 1))))

Running those in parallel on the four-core Xeon 5150 box is also no faster with four cores than with two. It has been suggested that memory contention is the problem with this machine. I suspect Clojure's overhead relative to Java is the reason that parallel Java benchmarks get more out of the four cores on this machine, but don't quote me on that.

I had someone run the benchmarks on an 8-core Nehalem Mac Pro. Those results are quite a bit different from mine. On the true factorial benchmark, four threads are twice as fast as two. Eight threads are 50% faster than four, but 16 threads are about twice as fast as four. Intermediate numbers are a bit variable, but it seems like hyperthreading actually speeds things up quite a bit on this benchmark.

ka's version of fac, which I've renamed spin-mult, scales linearly with the number of physical cores, but slows down between 9 and 15 threads. 16 threads is about equal to 8.

I've put the benchmarks up on GitHub: http://github.com/zakwilson/npmap

I'm going to try changing spin-mult to use dotimes and see how that runs on several machines. Initial results on the Xeon 5150 box suggest that using dotimes instead of recur solves the problem, and I'll probably be changing the benchmarks to further explore the issue.
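
For anyone who wants to try the recur vs. dotimes comparison on their own hardware, here is a rough sketch of how it could be set up. This is not the code in the npmap repository: `noops-dotimes` and `time-parallel` are names made up for illustration, and it uses plain futures rather than pmap so that the thread count is explicit.

```clojure
;; Sketch only. noops-dotimes and time-parallel are hypothetical names,
;; not taken from the npmap repository.

(defn noops
  "The recur-based no-op loop from the post."
  [n]
  (when (> n 0)
    (recur (- n 1))))

(defn noops-dotimes
  "The same number of iterations, written with dotimes."
  [n]
  (dotimes [_ n]))

(defn time-parallel
  "Runs (f n) once on each of `threads` futures, waits for all of them,
  and returns the elapsed wall-clock time in milliseconds."
  [f threads n]
  (let [start (System/nanoTime)
        ;; doall forces every future to be launched before we start waiting
        tasks (doall (repeatedly threads #(future (f n))))]
    (doseq [t tasks] @t)
    (/ (- (System/nanoTime) start) 1e6)))

(comment
  ;; Example use at a REPL; the iteration count is arbitrary.
  (doseq [threads [1 2 4]]
    (println threads "threads:"
             "recur" (time-parallel noops threads 100000000) "ms,"
             "dotimes" (time-parallel noops-dotimes threads 100000000) "ms")))
```

If per-thread work is constant, the elapsed time should stay roughly flat as threads are added up to the number of cores. A rise between two and four threads would match what I see on the Xeon 5150 box. The JIT may optimize empty loops differently in the two versions, so treat the numbers as indicative only, and call (shutdown-agents) at the end if you run this as a script.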