If you haven't seen it before, you can use the excellent Claypoole library
<https://github.com/TheClimateCorporation/claypoole> for many parallel
scheduling tasks.
Alan

On Wed, Jan 2, 2019 at 1:07 PM <lawrence.krub...@gmail.com> wrote:

> I guess this is more of a JVM question than a Clojure question, unless
> Clojure exerts any special magic here. I'm open to a more Clojure approach
> than what I have now.
>
> Someone suggested I use Executors/newScheduledThreadPool for some
> recurring work, so I set it up like this:
>
> (def scheduler-aggregate
>   (Executors/newScheduledThreadPool 32))
>
> at the start I call:
>
>   (.scheduleAtFixedRate  scheduler-aggregate ^Runnable (cycle-aggregate
> to-database-queue) 1 30 TimeUnit/MINUTES)
>
> Aside from a try/catch block (which I just removed to simplify this
> example) the inner function looks like this:
>
> (defn- cycle-aggregate
>   [to-database-queue]
>   (fn []
>      (let [
>            transcripts (query @global-database-connection {:item-type
> :transcript :processed { operators/$exists false }})
>            ]
>        (doseq [x transcripts]
>          (aggregate-words x)
>          (set-transcript-processed  @global-database-connection x)))
>
> The function (aggregate-words) counts up a bunch of words, doing some prep
> work for a later NLP engine, and then there is this line:
>
>     (log "The end of aggregate-words.")))
>
> The whole process takes about 5 minutes to run, about 300 seconds. I watch
> the database and I see the number of new records increase. About every 10
> seconds I see these words appear in the logs:
>
> "The end of aggregate-words."
>
> At the end of 5 minutes, these words have appeared 30 times, one for each
> of the transcripts I'm importing.
>
> This seems like I've done something wrong? Since the words "The end of
> aggregate-words."
> appear at roughly equal intervals, and the transcripts are all about the
> same size, it seems that all of the transcripts are being handled on one
> thread. After all, if the 30 transcripts were handled on 30 threads, I'd
> expect the 30 calls to aggregate-words would all end at roughly the same
> time, instead of sequentially.
>
> What else do I need to do to parallelize this work? If I call (future)
> inside of aggregate-words, would the new thread come from the pool? Is
> there a way I can call aggregate-words and make sure it runs on its own
> thread from the pool?
>
>
>
>
>
>
>
>
>
>
>
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to