Please forgive the lengthy post... New to Clojure and trying to get a friend excited about it by demonstrating how Clojure's functional/STM heritage makes it easy to write concurrent/parallel code, but I can't get Clojure to
I started with Java, where 256 threads are vying to increment a single counter 10,000 times each: public class Concurrent { static final int NUM_THREADS = 256; static final int NUM_CALLS = 10000; volatile boolean started = false; volatile int counter = 0; synchronized void incrementCounter() { counter += 1; } public static void main(final String[] args) throws Exception { final Concurrent c = new Concurrent(); final Thread[] threads = new Thread[NUM_THREADS]; for (int i = 0; i < NUM_THREADS; i++) { threads[i] = new Thread(new Runnable() { public void run() { while (!c.started); for (int i = 0; i < NUM_CALLS; i++) c.incrementCounter(); } }); threads[i].start(); } final long start = System.currentTimeMillis(); c.started = true; for (int i = 0; i < NUM_THREADS; i++) threads[i].join(); System.out.println("counter = "+c.counter+" and took "+(System.currentTimeMillis() - start)+" ms."); System.exit(0); } } You may wonder why such a high thread/increment count. I needed numbers this substantial because Java and my W510 are simply that efficient at handling it: $ time /opt/jdk1.6/bin/java Concurrent counter = 2560000 and took 481 ms. real 0m12.782s user 1m36.286s sys 0m0.251s (Most of the "real" time is spent creating and starting the threads. In fact, the purpose of "started" is because it takes less time to count 10,000 times than to create a new thread, so I wanted to force all 256 threads to "hold" until I could "start" them all simultaneously.) This is a trivial example, but illustrates the practice of combining volatile fields with synchronized methods... I wish that looked "harder", because I'm trying to sell Clojure here! In any case, the following is the first Clojure program I've ever written to "demonstrate" how easily STM tackles this: (def NUM_THREADS 256) (def NUM_CALLS 10000) (def started (ref false)) (def counter (ref 0)) (defn increment-counter [] (dosync (alter counter inc))) (def threads (map #(Thread. %) (repeat NUM_THREADS #(do (while (not @started)) (dotimes [_ NUM_CALLS] (increment-counter)))))) (map #(.start %) threads) (let [start (System/currentTimeMillis)] (dosync (ref-set started true)) (map #(.join %) threads) (println (format "counter = %d and took %d ms." @counter (- (System/currentTimeMillis) start)))) The only problem is that this DOESN'T WORK and I don't know why (I'll save a humorous tangent about the effects of forgetting the "@" in "(while (not @started))"). This is best illustrated by running it: $ time /opt/jdk1.6/bin/java -jar ~/dev/lang/clojure-1.2.0/clojure.jar java_like.clj counter = 0 and took 1 ms. real 0m1.059s user 0m1.280s sys 0m0.058s As you can see, it very quickly did nothing -- ignoring my "(map #(.join %) threads)". Running the same code in the REPL produces interesting results. The REPL, too, will print instantly and prompt me for more work. However, it's clearly executing the threads, as repeated "@counter" invocations show an incrementing counter. And it is absolutely dog slow: my 1/2 seconds of spinning in native Java takes an eternal 2 1/2 minutes in Clojure. I suspect the culprit is my "let": getting rid of it causes "(map #(.join %) threads)" to behave as expected. But, I don't get it. Maybe "let" is executing in another thread I don't know about and it is _that_ thread that is waiting to join. Thoughts? Meanwhile, this probably isn't what I want, anyway. I should be using Clojure's own mechanisms for parallelism: pmap. My first attempt failed miserably: (def NUM_THREADS 256) (def NUM_CALLS 10000) (def TOTAL (* NUM_THREADS NUM_CALLS)) (def counter (ref 0)) (defn increment-counter [_] (dosync (alter counter inc))) (let [start (System/currentTimeMillis)] (pmap increment-counter (range TOTAL)) (println (format "counter = %d and took %d ms." @counter (- (System/currentTimeMillis) start)))) This suffers the same problem as my first attempt: it prints before it completes. The problem is that it once it completes "@counter" returns 32 (I believe that's the size of Clojure's thread pool). I've not figured out how to force Clojure to actually get beyond this number (I thought maybe some element of laziness was coming into play here, but, if so, I've not figured out how to de-lazify it)... Interestingly, my second attempt fared better: (def NUM_THREADS 256) (def NUM_CALLS 10000) (def TOTAL (* NUM_THREADS NUM_CALLS)) (def counter (ref 0)) (defn increment-counter [] (dosync (alter counter inc))) (let [start (System/currentTimeMillis)] (apply pcalls (repeat TOTAL increment-counter)) (println (format "counter = %d and took %d ms." @counter (- (System/currentTimeMillis) start)))) Still the same problem with the printing. However, this is the only Clojure program I can come up with that'll actually run to completion (in and outside of the REPL): $ time /opt/jdk1.6/bin/java -jar clojure.jar ../../play/clojure_threading/clj_like.clj counter = 2 and took 8 ms. real 1m0.965s user 0m1.184s sys 0m0.069s Still slow, though, but not as bad. Not quite apples-to-apples, though. (Something I'll need to ponder: if everything were functional and transactional, couldn't the language automatically handle parallelism?) What am i missing? Why is this printing prior to join/pmap/pcalls? Why won't pmap keep going until I reach TOTAL? -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en