Sorry. *Something* is apparently messing with my outbound messages. I'm not sure what, why, or how. Characters are moved or substituted at random times, sometimes with unfortunate results.
Meanwhile, I had tried using pcalls as well but was getting spurious behavior. I wound up with: (defmacro pdoseq "Bindings as for for, but parallel execution as per pmap, pcalls, pvalues; returns nil." [seq-exprs & body] `(let [procs# (+ 2 (.availableProcessors (Runtime/getRuntime))) calls# (for ~seq-exprs (fn [] ~@body)) threads# (for [i# (range procs#)] (Thread. #(doall (map (fn [x#] (x#)) (take-nth procs# (drop i# calls#))))))] (doseq [t# threads#] (.start t#)) (doseq [t# threads#] (.join t#)))) The body is wrapped in a function so realizing an element of the seq doesn't immediately execute it. Threads are created that will skip to every nth element and invoke the function there, each starting offset from the others. The threads are then all started, and then all joined so the pdoseq call doesn't return until the job's done, just as plain doseq has a synchronous return. One potential issue is that if one thread gets well ahead of the others a big chunk of the for seq is held onto while the others catch up. So it's suitable for up to thousands of items that are individually expensive. If you had millions of individually cheap items you'd need another strategy. A nested iteration like (doseq [x s1 y s2] foo) could be split into (pdoseq [x s1] (doseq [y s2] foo)) so that whole inner loops are the granularity of the parallel jobs (diluting each anonymous function call overhead over a larger portion of the total work) and the size of the potentially-held-onto seq is only the size of the outer loop (e.g. the maximum size of held-onto seq would be 1024 instead of 786432 if you are looping over a 1024x768 array, easily possible with some image manipulation jobs, and there wouldn't be an added function call overhead once every pixel but only once every row.) An interesting question is why something like this isn't already in the standard library. The supplied parallel functions (pmap, pcalls, pvalues) don't have for bindings and don't seem to be adaptable to use them. Another thing that I thought could be handy is a lazy vector -- each element is realized only when retrieved. Backed by a vector of delays, of course. And super-lazy seqs and vectors built on a "weak delay": something like delay, except that it uses a WeakReference to cache its value. If it goes away the code that computed it will run again (so it better not have side effects) if it's needed again. Lazy vectors could even be backed by a function with a single integer argument -- it seems unlikely you could define one in another way than as some sort of mapping from index to value-to-produce anyway. There'd need to be some kind of tree to hold the cache as well, perhaps one suited to represent sparse vectors for good performance and low memory use when few elements were realized. The usual 32^n trees, but with some branches omitted with nils, might do the job; as it filled in when elements were realized it would turn into a normal vector as far as memory use was concerned. In the super-lazy case a (mutable!) map with weak values (there's one in the Apache Commons) might be preferable to a tree holding individual weak delays. How are lazy vectors and super-lazy things relevant to this thread? Well, the above already implements something like a super-lazy seq in that a compact closure stands in for each realized item and must be called repeatedly to produce the true seq element. The only thing missing is the weak caching. And a super-lazy vector would be especially easy to use in parallelized manner because it's random access. The above code that produces offsets i# and stepsizes procs# would, instead of being fed to (take-nth procs# (drop i# aseq)) to get functions to invoke, would be fed to (map lvec (range i# (count lvec) procs#)) to get the actual elements. -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en