Re: partition & fold

Renzo Borgatti Tue, 29 Jan 2019 10:08:42 -0800

Hi Brian,

I think what you’re searching for is:


(require '[clojure.core.reducers :as r])
(r/fold concat
        ((comp (partition-all 5) (map #(apply + %))) conj)
        (into [] (range 1000)))

However, the result of the above is nondeterministic (it can differ on every 
call). This is because there is a stateful transducer in the mix, which cannot 
be parallelized with fold. One option is to use this lib 
https://github.com/reborg/parallel#pfold-pxrf-and-pfolder (I’m the author) and 
write:

(require '[parallel.core :as p])
(p/fold concat
        (p/xrf conj (partition-all 5) (map #(apply + %)))
        (into [] (range 1000)))

This does what you expect, in parallel and consistently. More info about your 
attempt below.
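
A sequential baseline is handy for checking whatever a parallel version 
returns. The sketch below uses only clojure.core; the names xform and expected 
are illustrative and not part of either library.

```clojure
;; Sequential baseline: `into` runs the transducer on a single thread and
;; calls its completion step, so a trailing partial partition is kept.
(def xform (comp (partition-all 5) (map #(apply + %))))

(def expected (into [] xform (range 1000)))

(count expected)  ;; => 200
(take 3 expected) ;; => (10 35 60)
```

Comparing a parallel result against expected over several runs is a quick way 
to spot the nondeterminism described above.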


> On 26 Jan 2019, at 23:03, Brian Craft <craft.br...@gmail.com> wrote:
> 
> Still trying to understand reducers & transducers. Is the difference between 
> r/folder and r/reducer the parallelization?
> 
> Anyone know what this error is about?
> 
> (r/foldcat (r/map #(apply + %) (r/folder (into [] (range 10000000)) 
> (partition-all 5))))
> 
> ArrayIndexOutOfBoundsException 47427  java.util.ArrayList.add 
> (ArrayList.java:459)
> 
> Is this error to do with partition-all being stateful?

Correct. fold calls the same transducer instance from several threads, so at 
some point its internal state (the ArrayList where the 5 elements are 
accumulated) gets out of sync. ArrayList is not thread safe.
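
The hidden state can be seen even on a single thread. In this minimal sketch 
(the names shared-rf, first-run and second-run are made up for illustration), 
one reducing function instance is reused for two unrelated reductions, and the 
items buffered by the first one leak into the second. With fold the same kind 
of sharing happens across threads, only without any ordering guarantee.

```clojure
;; One reducing function instance owns one hidden ArrayList buffer.
(def shared-rf ((partition-all 5) conj))

;; First reduction: 0, 1 and 2 are buffered, nothing is emitted yet.
(def first-run (reduce shared-rf [] (range 3)))

;; Second, unrelated reduction through the SAME instance: the items
;; buffered by the first run end up in its first partition.
(def second-run (reduce shared-rf [] (range 3 7)))

first-run  ;; => []
second-run ;; => [[0 1 2 3 4]]
```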

> This works as expected, with reducer instead of folder:
> 
> (r/foldcat (r/map #(apply + %) (r/reducer (into [] (range 10000000)) 
> (partition-all 5))))  ; XXX large output

Because it’s not parallel. r/reducer creates a collection that is reducible 
(the transformation is applied while reducing) but not foldable, so r/foldcat 
falls back to a sequential reduce.

> Is there some way to express "map over partitions, in parallel, without 
> creating an intermediate collection" with reducers and transducers?

No intermediate collections are created inside each parallel chunk. However, 
the per-chunk results are joined back together with concat to return the final 
result, and those per-chunk collections are “wasted”. There are ways to concat 
into a concurrent mutable collection with a good speedup. I’ve used that in 
several functions in the parallel lib.
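
If you want to stay with clojure.core.reducers only, one stateless 
possibility is to fold over the start offsets of the partitions instead of 
using partition-all. This is a sketch under assumptions, not how the parallel 
lib works: partition-sums and starts are hypothetical names, the input must be 
a vector, and it still builds a small vector of offsets plus one vector per 
chunk that gets merged with into.

```clojure
(require '[clojure.core.reducers :as r])

(defn partition-sums
  "Sums consecutive groups of n items of vector v. No shared mutable state:
  each group is read through subvec, so the fold can run in parallel."
  [v n]
  (let [starts (vec (range 0 (count v) n))]
    (r/fold (r/monoid into vector)  ; combine: merge chunk results in order
            conj                    ; reduce: append one sum to a chunk result
            (r/map (fn [start]
                     (reduce + (subvec v start (min (count v) (+ start n)))))
                   starts))))

(partition-sums (vec (range 12)) 5) ;; => [10 35 21]
```

The last group is summed even when it has fewer than n items, which matches 
what partition-all produces when run sequentially with into.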

Renzo

> 
> -- 
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with your 
> first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> --- 
> You received this message because you are subscribed to the Google Groups 
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

