Léo, I definitely agree that you can use unsynchronized mutable stateful transducers *as long as you can guarantee they'll be used only in single-threaded contexts*. What we were discussing above is which kind of synchronization is appropriate for which context.

With core.async, if you're using a transducer on a `chan` or `pipeline` or the like, it is guaranteed that only one thread will use it at a time (thus `atom`s aren't needed), *but* a different thread might come in later and use that same stateful transducer, in which case the result of the earlier mutation needs to propagate to that thread via a `volatile`. With reducers' `fold`, stateful transducers don't necessarily uphold their contract (e.g. with `map-indexed`, as we discussed above) even if you use an `atom` or the like.

But in truly single-threaded contexts, even within a `go` block or a `thread` or the like (as long as the transducer is not reused, e.g. on a `chan`, where the need for a `volatile` applies), it's certainly fine to use unsynchronized mutable stateful transducers.
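To make the distinction concrete, here is a rough sketch using only `clojure.core` and `clojure.core.reducers`. The names `indexed` and `indexed-unsync` are made up for illustration (they are not core functions); `indexed` mimics the `volatile!` style of `map-indexed`, and `indexed-unsync` keeps its counter in a plain array cell with no visibility guarantees, which I'd only consider in the single-threaded case.

```clojure
(require '[clojure.core.reducers :as r])

;; Hypothetical helper: pairs each input with a running index,
;; keeping the counter in a volatile so updates are visible across threads.
(defn indexed []
  (fn [rf]
    (let [i (volatile! -1)]
      (fn
        ([] (rf))
        ([result] (rf result))
        ([result input]
         (rf result [(vswap! i inc) input]))))))

;; Hypothetical helper: same behavior, but the counter lives in a plain
;; long[] cell. No synchronization and no visibility guarantee.
(defn indexed-unsync []
  (fn [rf]
    (let [^longs i (long-array 1)]
      (fn
        ([] (rf))
        ([result] (rf result))
        ([result input]
         (let [n (aget i 0)]
           (aset i 0 (inc n))
           (rf result [n input])))))))

;; Single-threaded transducing processes: both variants behave the same.
(def xf (indexed))
(def first-run  (into [] xf [:a :b :c]))   ; => [[0 :a] [1 :b] [2 :c]]
(def second-run (into [] xf [:x :y]))      ; => [[0 :x] [1 :y]], fresh state per run
(def unsync-run (into [] (indexed-unsync) [:a :b :c])) ; => [[0 :a] [1 :b] [2 :c]]

;; With fold, one stateful reducing function is shared by every partition.
;; The inputs come back in order, but the indices are not guaranteed to
;; match positions, whatever box holds the counter.
(def folded
  (r/fold 512 (r/monoid into vector) ((indexed) conj) (vec (range 10000))))
```

Note that `xf` itself is safe to pass around: the mutable state is only created when `xf` is applied to a reducing function, which `into` does once per call. The sharing problem in the `fold` case comes from handing the already-instantiated reducing function to several partitions at once.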

On Monday, April 10, 2017 at 9:37:29 AM UTC-4, Léo Noel wrote:
>
> This topic is of high interest to me as it is at the core of my current works. I had a similar questioning a while ago <https://groups.google.com/forum/#!topic/clojure/2WtfyLG2Jls> and I have to say I'm even more confused with this :
>
>> While transducing processes may provide locking to cover the visibility of state updates in a stateful transducer, transducers should still use stateful constructs that ensure visibility (by using volatile, atoms, etc).
>
> I actually tried pretty hard to find a use case that would make partition-all fail because of its unsynchronized local state, and did not manage to find one that did not break any contract. I arrived at the conclusion that it is always safe to use unsynchronized constructs in stateful transducers. The reason is that you need to ensure that the result of each step is given to the next, and doing so you will necessarily set a memory barrier of some sort between each step. Each step happens-before the next, and therefore mutations performed by the thread at step n are always visible by the thread performing the step n+1. This is really brilliant : when designing a transducer, you can be confident that calls to your reducing function will be sequential and stop worrying about concurrency. You just have to ensure that mutable state stays local. True encapsulation, the broken promise of object-oriented programming.
>
> My point is that the transducer contract "always feed the result of step n as the first argument of step n+1" is strong enough to safely use local unsynchronized state. For this reason, switching partition-* transducers to volatile constructs really sounds like a step backwards to me. However, after re-reading the documentation on transducers, I found that this contract is not explicitly stated. It is just *natural* to think this way, because transducers are all about reducing processes. Is there a plan to reconsider this principle ? I would be very interested to know what Rich has in mind that could lead him to advise to overprotect local state of transducers.
>
> On Monday, April 10, 2017 at 4:44:00 AM UTC+2, Alexander Gunnarson wrote:
>>
>> Thanks so much for your input Alex! It was a very helpful confirmation of the key conclusions arrived at in this thread, and I appreciate the additional elaborations you gave, especially the insight you passed on about the stateful transducers using `ArrayList`. I'm glad that I wasn't the only one wondering about the apparent lack of parity between its unsynchronized mutability and the volatile boxes used for e.g. `map-indexed` and others.
>>
>> As an aside about the stateful `take` transducer, Tesser uses the equivalent of one but skirts the issue by not guaranteeing that the first n items of the collection will be returned, but rather, n items of the collection in no particular order and starting at no particular index. This is achievable without Tesser by simply replacing the `volatile` in the `core/take` transducer with an `atom` and using it with `fold`. But yes, `take`'s contract is broken with this and so still follows the rule of thumb you established that `fold` can't use stateful transducers (at least, not without weird things like reordering of the indices in `map-indexed` and so on).
>>
>> That's interesting that `fold` can use transducers directly! I haven't tried that yet; I've just been wrapping them in an `r/folder`.
>>
>> On Sunday, April 9, 2017 at 10:22:13 PM UTC-4, Alex Miller wrote:
>>>
>>> Hey all, just catching up on this thread after the weekend. Rich and I discussed the thread safety aspects of transducers last fall and the intention is that transducers are expected to only be used in a single thread at a time, but that thread can change throughout the life of the transducing process (for example when a go block is passed over threads in a pool in core.async). While transducing processes may provide locking to cover the visibility of state updates in a stateful transducer, transducers should still use stateful constructs that ensure visibility (by using volatile, atoms, etc).
>>>
>>> The major transducing processes provided in core are transduce, into, sequence, eduction, and core.async. All but core.async are single-threaded. core.async channel transducers may occur on many threads due to interaction with the go processing threads, but never happen on more than one thread at a time. These operations are covered by the channel lock which should guarantee visibility. Transducers used within a go block (via something like transduce or into) occur eagerly and don't incur any switch in threads so just fall back to the same old expectations of single-threaded use and visibility.
>>>
>>> Note that there are a couple of stateful transducers that use ArrayList (partition-by and partition-all). From my last conversation with Rich, he said those should really be changed to protect themselves better with volatile or something else. I thought I wrote up a ticket for this but looks like maybe I didn't, so I will take care of that.
>>>
>>> Reducer fold is interesting in that each "bucket" is reduced via its reduce function, which can actually use a transducer (since that produces a reduce function), however, it can't be a stateful transducer (something like take, etc).
>>>
>>> Hope that helps with respect to intent.