If you haven't seen it before, you can use the excellent Claypoole library <https://github.com/TheClimateCorporation/claypoole> for many parallel scheduling tasks.

Alan
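P.S. On the question itself: each firing of scheduleAtFixedRate runs the Runnable on a single thread from the scheduled pool, and doseq walks the transcripts one after another on that thread, so the other 31 threads stay idle. Calling (future) would not draw from that pool either; Clojure futures run on Clojure's own internal executor. Below is a minimal sketch of one way to fan the work out, using plain java.util.concurrent rather than Claypoole. The names process-all and the stand-in aggregate-words are hypothetical, and the pool size of 8 is an assumption, not a recommendation.

```clojure
(import '(java.util.concurrent Executors ExecutorService Future))

;; Hypothetical stand-in for the real aggregate-words: it sleeps to simulate
;; work and returns the name of the thread it ran on.
(defn aggregate-words [transcript]
  (Thread/sleep 100)
  (.getName (Thread/currentThread)))

(defn process-all
  "Submits one task per transcript to pool and blocks until all have finished.
  Returns the results in input order."
  [^ExecutorService pool transcripts]
  ;; Clojure functions implement Callable, so a vector of no-arg fns can be
  ;; handed straight to invokeAll.
  (let [tasks (mapv (fn [t] (fn [] (aggregate-words t))) transcripts)]
    (mapv (fn [^Future f] (.get f)) (.invokeAll pool tasks))))

;; Usage: a separate fixed-size worker pool (the size 8 is an assumption).
(let [pool (Executors/newFixedThreadPool 8)]
  (try
    (println (count (distinct (process-all pool (range 8))))
             "distinct worker threads")
    (finally
      (.shutdown pool))))
```

In the real code, the body of each task would presumably call both aggregate-words and set-transcript-processed, and the scheduled Runnable would call something like process-all in place of the doseq. Keeping the worker pool separate from the scheduler means the recurring job only occupies one scheduler thread while it waits. If I remember its API correctly, Claypoole's pdoseq offers much the same thing with a pool passed as the first argument.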
On Wed, Jan 2, 2019 at 1:07 PM <lawrence.krub...@gmail.com> wrote:

> I guess this is more of a JVM question than a Clojure question, unless
> Clojure exerts any special magic here. I'm open to a more Clojure approach
> than what I have now.
>
> Someone suggested I use Executors/newScheduledThreadPool for some
> recurring work, so I set it up like this:
>
> (def scheduler-aggregate
>   (Executors/newScheduledThreadPool 32))
>
> at the start I call:
>
> (.scheduleAtFixedRate scheduler-aggregate
>                       ^Runnable (cycle-aggregate to-database-queue)
>                       1 30 TimeUnit/MINUTES)
>
> Aside from a try/catch block (which I just removed to simplify this
> example) the inner function looks like this:
>
> (defn- cycle-aggregate
>   [to-database-queue]
>   (fn []
>     (let [transcripts (query @global-database-connection
>                              {:item-type :transcript
>                               :processed {operators/$exists false}})]
>       (doseq [x transcripts]
>         (aggregate-words x)
>         (set-transcript-processed @global-database-connection x)))))
>
> The function (aggregate-words) counts up a bunch of words, doing some prep
> work for a later NLP engine, and then there is this line:
>
> (log "The end of aggregate-words.")))
>
> The whole process takes about 5 minutes to run, about 300 seconds. I watch
> the database and I see the number of new records increase. About every 10
> seconds I see these words appear in the logs:
>
> "The end of aggregate-words."
>
> At the end of 5 minutes, these words have appeared 30 times, one for each
> of the transcripts I'm importing.
>
> This seems like I've done something wrong? Since the words "The end of
> aggregate-words." appear at roughly equal intervals, and the transcripts
> are all about the same size, it seems that all of the transcripts are being
> handled on one thread. After all, if the 30 transcripts were handled on 30
> threads, I'd expect the 30 calls to aggregate-words would all end at
> roughly the same time, instead of sequentially.
>
> What else do I need to do to parallelize this work?
> If I call (future) inside of aggregate-words, would the new thread come
> from the pool? Is there a way I can call aggregate-words and make sure it
> runs on its own thread from the pool?
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.