Hi all,

I've switched many nested filter/map/mapcat applications in my code to transducers. That brought a moderate speedup in certain cases, and the deeper the nesting was before, the clearer the transducer code is in comparison, so yay! :-)
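To make that concrete, here is a made-up example of the kind of rewrite I mean (not taken from my actual code; `nested`, `xf`, and `transduced` are just illustrative names, and I'm assuming Clojure 1.7 or later):

```clojure
;; Nested style: read inside-out, one intermediate lazy seq per step.
(def nested
  (mapcat #(repeat 2 %)
          (filter odd?
                  (map inc (range 10)))))

;; Transducer style: the same steps composed top-to-bottom.
(def xf
  (comp (map inc)
        (filter odd?)
        (mapcat #(repeat 2 %))))

(def transduced (sequence xf (range 10)))

(println (= nested transduced)) ; prints true
```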
However, I'm still quite unsure about the difference between `sequence` and `eduction`. From the docs and experimentation, I came to the assumptions below, and I'd be grateful if someone with more knowledge could verify/falsify/add:

  - Return types differ: sequence returns a standard lazy seq, eduction an instance of Eduction.

  - Eductions are reducible/seqable/iterable, i.e., basically I can use them wherever a (lazy) seq would also do, so sequence and eduction are quite interchangeable except when poking at internals, e.g., (.contains (sequence ...) x) works whereas (.contains (eduction ...) x) doesn't.

  - Both compute their contents lazily.

  - Lazy seqs cache their already realized contents, eductions compute them over and over again on each iteration.

Because of that, I came to the conclusion that whenever I ask myself if one of my functions should return a lazy seq or an eduction, I should use these rules:

  1. If the function is likely to be used like

       (let [xs (seq-producing-fn args)]
         (or (do-stuff-with xs)
             (do-other-stuff-with xs)
             ...))

     that is, the resulting seq is likely to be bound to a variable which is then used multiple times (and thus lazy seq caching is beneficial), then use sequence.

  2. If it is a private function only used internally and never with the usage pattern of point 1, then definitely use eduction.

  3. If it's a public function which usually isn't used with a pattern as in point 1, then I'm unsure. eduction is probably more efficient, but sequence fits better in the original "almost everything returns a lazy seq" design. Also, the latter has the benefit that users of the library don't need to know anything about transducers.

Is that sensible? Or am I completely wrong with my assumptions about sequence and eduction?

On a related note, could someone please clarify the statement from the transducers docs for `sequence`?

,----[ Docs of sequence at http://clojure.org/transducers ]
| The resulting sequence elements are incrementally computed. These
| sequences will consume input incrementally as needed and fully realize
| intermediate operations. This behavior differs from the equivalent
| operations on lazy sequences.
`----

I'm curious about the "fully realize intermediate operations" part. Does it mean that in a "traditional"

  (mapcat #(range %) (range 10000))

the inner range is also evaluated lazily, but with

  (sequence (mapcat #(range %)) (range 10000))

it is not? It seems so. At least dorun-ning these two expressions shows that the "traditional" version is more than twice as fast as the transducer version. Also, the same seems to hold for

  (eduction (mapcat #(range %)) (range 10000))

which is exactly as fast (or rather slow) as the sequence version.

But wouldn't that mean that using transducers with mapcat, where the mapcatted function isn't super-cheap, is a bad idea in general, at least from a performance POV?

Bye,
Tassilo
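P.S. Here is a minimal sketch of the experiment behind my caching assumption above, assuming Clojure 1.7 or later. The names `calls`, `counting-inc`, and `calls-after-two-passes` are made up for this illustration; the sketch only counts how often the mapped function runs when the same collection is reduced twice.

```clojure
;; Counter for how often the mapped function has been invoked.
(def calls (atom 0))

(defn counting-inc
  "Like inc, but records every invocation in `calls`."
  [x]
  (swap! calls inc)
  (inc x))

(defn calls-after-two-passes
  "Resets the counter, builds a collection with `make` (either
  `sequence` or `eduction`), reduces it twice, and returns the
  number of times `counting-inc` was invoked in total."
  [make]
  (reset! calls 0)
  (let [xs (make (map counting-inc) (range 5))]
    (reduce + 0 xs)
    (reduce + 0 xs)
    @calls))

(println "sequence:" (calls-after-two-passes sequence)) ; prints sequence: 5
(println "eduction:" (calls-after-two-passes eduction)) ; prints eduction: 10
```

With the lazy seq, the second pass reuses the cached elements; with the eduction, the transformation runs again on every pass, which is what my rule 1 is based on.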