I nearly suggested that, but it sounded so counter-intuitive and I
didn't have time to construct a test bed for it... glad you figured it
out. That means that your float-seq has to be fully realized in memory
though, right?
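
For anyone reading along, here is a minimal sketch of the doall fix described in the quoted message below. The bodies of `encode` and `get-ids` are hypothetical stand-ins (the thread never shows the real ones), and `encode-chunked` and the chunk size of 1000 are names and values I made up for illustration. I also swapped `flatten` for `apply concat`, which only removes the one level of nesting that partitioning adds.

```clojure
;; Hypothetical stand-in for the slow, pure step (the real encode is not shown).
(defn encode [x]
  (Math/sqrt (double x)))

;; Hypothetical stand-in for the order-sensitive, side-effecting step:
;; hands out increasing ids, the way a db sequence might.
(def counter (atom 0))

(defn get-ids [encoded]
  [(swap! counter inc) encoded])

(defn encode-chunked
  "Encodes xs in parallel, chunk-size items per pmap task, preserving order."
  [chunk-size xs]
  (->> (partition-all chunk-size xs)
       ;; doall forces each chunk inside the worker thread. Without it the
       ;; worker only returns an unrealized lazy seq, and the encoding runs
       ;; later, sequentially, on whichever thread consumes the result.
       (pmap #(doall (map encode %)))
       ;; Splice the chunks back together in their original order.
       (apply concat)))

;; The side-effecting step still runs sequentially and in order.
(def result
  (doall (map get-ids (encode-chunked 1000 (range 10000)))))
```

On the memory question: the docstring for pmap describes it as semi-lazy, staying ahead of consumption without realizing the entire result unless required. So, as far as I can tell, only the chunks currently in flight need to be realized at once, provided nothing holds on to the head of the seq.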

On Mon, Oct 14, 2013 at 9:58 PM, Brian Craft <craft.br...@gmail.com> wrote:
> Answering my own question: my grouping example failed because the
> expressions computed in the pmap were lazy. The worker threads did not
> evaluate them; they were only evaluated (sequentially) when the main thread
> consumed the results. Adding a doall gives the result I was hoping for.
>
>
> On Sunday, October 13, 2013 4:10:16 PM UTC-7, Brian Craft wrote:
>>
>> I'm walking a seq of many millions of floats, encoding them for the
>> persistence layer, and getting sequence ids from the db. So, conceptually
>> there are two parts: the slow part, and the side-effecting part. Vaguely
>> like
>>
>> (map get-ids (map encode float-seq))
>>
>> which is later reduced while writing to disk. In the get-ids step the
>> order matters. My first attempt to make the slow part parallel was to use
>> pmap,
>>
>> (map get-ids (pmap encode float-seq))
>>
>> However that's actually slower. I expect this is because even though
>> "encode" is the bottleneck, it's still faster than the overhead of pmap. I
>> next tried pmap over groups of floats, a bit like
>>
>> (map get-ids
>>      (flatten (pmap #(map encode %)
>>                     (partition-all 20000 float-seq))))
>>
>> (Sorry for any typos, I'm just pseudo-coding here.) This was still slower,
>> which surprised me. I understand the first pmap result, but this one is
>> puzzling to me. Even if I partition at half the length of the seq (so in
>> theory it can run two threads, each of which will run five or six seconds),
>> it's no faster than map. Part of this seems to be the overhead of creating
>> more intermediate seqs. Perhaps I'm misunderstanding what's happening during
>> partition-all.
>>
>> Is there some obvious way to approach this scenario? I looked briefly at
>> the reducers library, however it was unclear to me how to deal with the
>> side-effecting portion of the operation. The second (fast) map operation
>> needs to be done in order.
>
> --
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with your
> first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.



-- 
Sean A Corfield -- (904) 302-SEAN
An Architect's View -- http://corfield.org/
World Singles, LLC. -- http://worldsingles.com/

"Perfection is the enemy of the good."
-- Gustave Flaubert, French realist novelist (1821-1880)

