I guess this is more of a JVM question than a Clojure question, unless 
Clojure exerts any special magic here. I'm open to a more idiomatic Clojure 
approach than what I have now. 

Someone suggested I use Executors/newScheduledThreadPool for some recurring 
work, so I set it up like this: 

(def scheduler-aggregate
  (Executors/newScheduledThreadPool 32))

At startup I call:

  (.scheduleAtFixedRate scheduler-aggregate
                        ^Runnable (cycle-aggregate to-database-queue)
                        1 30 TimeUnit/MINUTES)

Aside from a try/catch block (which I just removed to simplify this 
example) the inner function looks like this:

(defn- cycle-aggregate
  [to-database-queue]
  (fn []
    (let [transcripts (query @global-database-connection
                             {:item-type :transcript
                              :processed {operators/$exists false}})]
      (doseq [x transcripts]
        (aggregate-words x)
        (set-transcript-processed @global-database-connection x)))))

The function (aggregate-words) counts up a bunch of words, doing some prep 
work for a later NLP engine, and then there is this line: 

    (log "The end of aggregate-words.")))

The whole process takes about 5 minutes to run, about 300 seconds. I watch 
the database and I see the number of new records increase. About every 10 
seconds I see these words appear in the logs: 

"The end of aggregate-words."

At the end of 5 minutes, these words have appeared 30 times, once for each 
of the transcripts I'm importing. 

This suggests I've done something wrong. Since the words "The end of 
aggregate-words." appear at roughly equal intervals, and the transcripts 
are all about the 
same size, it seems that all of the transcripts are being handled on one 
thread. After all, if the 30 transcripts were handled on 30 threads, I'd 
expect the 30 calls to aggregate-words would all end at roughly the same 
time, instead of sequentially. 
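
To check that reasoning in isolation, here is a minimal sketch of what I 
think is happening. The names thread-names-seen, pool-size and items are 
made up for this example and are not part of my real code. It hands a single 
task to a scheduled pool, loops with doseq inside that task, and records 
which thread handled each item:

```clojure
(import '(java.util.concurrent Executors TimeUnit))

(defn thread-names-seen
  "Hypothetical helper: runs ONE task on a scheduled pool of `pool-size`
  threads. The task walks `items` with doseq and records the name of the
  thread that handled each item. Returns the set of thread names seen."
  [pool-size items]
  (let [pool (Executors/newScheduledThreadPool pool-size)
        seen (atom #{})
        task (fn []
               (doseq [_ items]
                 (swap! seen conj (.getName (Thread/currentThread)))))]
    (try
      ;; .schedule returns a ScheduledFuture; .get blocks until the task ends.
      (.get (.schedule pool ^Runnable task 0 TimeUnit/MILLISECONDS))
      @seen
      (finally
        (.shutdown pool)))))

;; 32 threads in the pool, 30 items, but only one task was submitted:
(count (thread-names-seen 32 (range 30)))
;; => 1
```

If that is right, the pool size only limits how many separate tasks can run 
at once. A single scheduled task, and the doseq inside it, stays on one 
thread.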

What else do I need to do to parallelize this work? If I call (future) 
inside of aggregate-words, would the new thread come from the pool? Is 
there a way I can call aggregate-words and make sure it runs on its own 
thread from the pool? 
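
For reference, one shape I could imagine trying is to submit each transcript 
to the pool as its own task instead of looping with doseq. This is only a 
sketch under my own assumptions: process-in-parallel, f and items are 
hypothetical names, and I have not tried it against the real database code. 
As far as I can tell, (future) would not help with my question as asked, 
because it runs on Clojure's own agent send-off pool rather than on an 
executor I created.

```clojure
(import '(java.util.concurrent Executors ExecutorService Future))

(defn process-in-parallel
  "Hypothetical helper: submits (f item) to `pool` once per item, then
  blocks until every task has finished. Returns the results in item order."
  [^ExecutorService pool f items]
  (let [futures (mapv (fn [item]
                        ;; Hint the local so .submit resolves to the
                        ;; Callable overload and the result is kept.
                        (let [^Callable task (fn [] (f item))]
                          (.submit pool task)))
                      items)]
    ;; .get rethrows any exception from a task, wrapped in ExecutionException.
    (mapv (fn [^Future fut] (.get fut)) futures)))

;; Usage with a throwaway pool:
(let [pool (Executors/newFixedThreadPool 4)]
  (try
    (process-in-parallel pool inc [1 2 3])
    (finally
      (.shutdown pool))))
;; => [2 3 4]
```

In my case f would be a function that calls aggregate-words and then 
set-transcript-processed for one transcript, and pool could be 
scheduler-aggregate itself, since a ScheduledExecutorService is also an 
ExecutorService.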











