Other posts in this thread indicate that longer-range patterns in the
inputs could cause problems. If you know you'll be consuming the full
sequence, try this:

(defn eager-pmap [f & colls]
  ;; (fn [& args] ...) passes f one argument per collection, as map does
  (map deref (doall (apply map (fn [& args] (future (apply f args))) colls))))

This creates all of the futures right away (due to the doall) and
leaves it up to the future implementation to distribute the work
across some thread pool. On my system, each run over a seq of 100000
things creates 80 or so threads, which persist for some time
afterward, so there's a bit of a problem there. Another one that seems
like it ought to be more efficient:

(defn eager-pmap [f & colls]
  (let [cores (.. Runtime getRuntime availableProcessors)
        agents (cycle (for [_ (range cores)] (agent nil)))
        promises (apply map (fn [& _] (promise)) colls)]
    (doall
      (apply map (fn [a p & args]
                   (send a
                     (fn [_] (deliver p (apply f args)))))
        agents promises colls))
    (map deref promises)))

This one uses the agent "send" thread pool to divide up the work.
Oddly, it's actually 2-3x slower than the previous one and only uses
75% CPU on a dual-core machine, though it doesn't leak threads. It
looks like agents, or promise/deliver, or both have lots of overhead
compared to futures. That is odd, since the obvious implementation of
future is:

(defmacro future [& body]
  `(let [p# (promise)]
     (.start (Thread. (fn [] (deliver p# (do ~@body)))))
     p#))

So maybe it's agents that are causing the inefficiency? On the other
hand, the result of evaluating (future foo) doesn't appear to be a
naked promise, though it could still be a wrapped one. Also, that
implementation would not limit itself to 80-odd threads; maybe future
instead uses the agent "send-off" pool?!
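One way to probe the question above is to look at what (future ...)
returns and which thread runs its body. The following is only a rough
sketch: thread-count is a helper name I made up, Thread/activeCount is
just an estimate for the current thread group, and the thread name that
gets printed depends on the Clojure version. As far as I know, current
releases submit futures to the same executor that send-off uses, but
treat that as something to verify with the output rather than a given.

```clojure
;; Sketch for poking at what (future ...) returns and where it runs.
(def fut (future (.getName (Thread/currentThread))))

;; The returned object can be deref'd and is also a java.util.concurrent.Future.
(println "class:        " (class fut))
(println "j.u.c.Future? " (instance? java.util.concurrent.Future fut))
(println "worker thread:" @fut)

;; Hypothetical helper: rough thread census. Thread/activeCount only
;; estimates the live threads in the current thread group.
(defn thread-count [] (Thread/activeCount))

(let [before (thread-count)
      futs   (doall (map #(future (* % %)) (range 1000)))
      total  (reduce + (map deref futs))
      after  (thread-count)]
  (println "threads before:" before "after:" after "sum:" total))
```

Calling (thread-count) again at intervals afterward gives a crude picture
of how long the extra threads linger.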

I'd recommend just using the first version of eager-pmap in this email
if you need an eager-pmap. The extra threads it creates are not too
numerous (it doesn't create 100,000 threads for an input seq of
100,000 items) and they *do* seem to disappear again *eventually*: it
takes more than 30 seconds and much less than 1 hour, perhaps several
minutes. I can't be more precise with the data I've gathered thus far.
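
For anyone who wants to try the recommended version, here is a minimal
usage sketch. It restates the first eager-pmap in a varargs form (my
assumption, so that f can take one argument per collection), and
slow-inc is a hypothetical stand-in for whatever expensive function you
actually have. The timings will vary by machine, so none are shown.

```clojure
;; Futures-based eager-pmap; (fn [& args] ...) lets f take one
;; argument per collection, like map does.
(defn eager-pmap [f & colls]
  (map deref (doall (apply map (fn [& args] (future (apply f args))) colls))))

;; slow-inc is a hypothetical stand-in for an expensive function.
(defn slow-inc [x]
  (Thread/sleep 10)
  (inc x))

;; Same results as map, in the same order.
(println (= (map slow-inc (range 50)) (eager-pmap slow-inc (range 50))))

;; Works across several collections too.
(println (eager-pmap + [1 2 3] [10 20 30]))

;; Wall-clock comparison; doall forces the lazy results inside `time`.
(time (doall (map slow-inc (range 50))))
(time (doall (eager-pmap slow-inc (range 50))))
```

Since every future is created up front, this only makes sense when the
whole input fits in memory and you intend to consume all of the results.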
