Hi Viktor

Thanks for bearing with me.

> [...] are you suggesting that we add a section in the Gatherer (and 
> Collector) in the form of "Implementor Notes" that are a bit more high-level 
> than the specification?

Yes, exactly, explicitly mentioning reusability (I actually read through the 
Javadoc at the time, but didn't connect the dots) and providing 
guidance/recommending alternatives for use cases where one might be tempted to 
write a non-reusable Gatherer (I think these use cases can be summarized as: 
combining a Stream with another Stream, where both Streams might be infinite).
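
A minimal sketch of what such guidance could look like, assuming JDK 24 or later (where `java.util.stream.Gatherer` is final; on JDK 22/23 it would need `--enable-preview`). The names `ConcatGatherers`, `concatSingleUse`, `concatReusable` and `append` are hypothetical and not JDK APIs. It contrasts a Gatherer that captures a `Stream<T>` directly (non-reusable) with the `Supplier<Stream<T>>` and `List<T>` alternatives discussed earlier in this thread:

```java
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

// Hypothetical helper class for illustration; none of these methods exist in the JDK.
final class ConcatGatherers {

    // Non-reusable: the captured Stream can be traversed only once, so the
    // finisher fails with IllegalStateException on a second use of the Gatherer.
    static <T> Gatherer<T, ?, T> concatSingleUse(Stream<T> other) {
        return Gatherer.ofSequential(
            (unused, element, downstream) -> downstream.push(element),
            (unused, downstream) -> other.sequential().allMatch(downstream::push));
    }

    // Reusable as long as the Supplier returns a fresh Stream on every call:
    // each evaluation of the Gatherer obtains its own Stream in the finisher.
    static <T> Gatherer<T, ?, T> concatReusable(Supplier<Stream<T>> others) {
        return Gatherer.ofSequential(
            (unused, element, downstream) -> downstream.push(element),
            (unused, downstream) -> {
                try (Stream<T> other = others.get()) {
                    // allMatch stops as soon as downstream rejects an element
                    other.sequential().allMatch(downstream::push);
                }
            });
    }

    // Reusable: a List is a stable source that can be streamed repeatedly.
    static <T> Gatherer<T, ?, T> append(List<T> elements) {
        return concatReusable(elements::stream);
    }

    public static void main(String[] args) {
        Gatherer<Integer, ?, Integer> reusable = concatReusable(() -> Stream.of(3, 4));
        System.out.println(Stream.of(1, 2).gather(reusable).toList()); // [1, 2, 3, 4]
        System.out.println(Stream.of(0).gather(reusable).toList());    // [0, 3, 4]

        Gatherer<Integer, ?, Integer> singleUse = concatSingleUse(Stream.of(3, 4));
        System.out.println(Stream.of(1, 2).gather(singleUse).toList()); // [1, 2, 3, 4]
        try {
            Stream.of(0).gather(singleUse).toList(); // the captured Stream is already consumed
        } catch (IllegalStateException e) {
            System.out.println("second use failed: " + e.getMessage());
        }
    }
}
```

With `concatReusable`, it is the caller who opts out of reusability by passing something like `() -> aStream`; the Gatherer itself imposes no such restriction.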

Kind regards, Anthony

September 23, 2024 at 5:19 PM, "Viktor Klang" <viktor.kl...@oracle.com> wrote:
> 
> Hi Anthony,
> 
> >The idea is to collect feedback, to see how many people report their 
> >Gatherers being broken (i.e. their Gatherers being non-compliant without 
> >realizing it), so enforcing it in `Stream::gather` is sufficient for this 
> >purpose.
> 
> Even if this is well-intentioned, my experience tells me that this feedback 
> will not materialize, and trying to provoke conformance at runtime would add 
> a noticeable performance cost that other intermediate operations do not 
> carry, especially for the vast majority of streams (which tend to contain 
> fewer than 10 elements).
> 
> >This hasn't come up before because it requires people (a) to read the 
> >Javadoc, (b) to connect the dots and conclude "thus, a Gatherer must be 
> >reusable", and (c) to be willing to invest their time in asking the 
> >question, rather than moving on since their Gatherers "just work".
> 
> Adding clarifications to the Javadoc may be the most balanced path forward; 
> in doing so, we'd be updating the documentation for both Gatherer and 
> Collector.
> 
> >Not sure I understand this argument? I'd argue that increasing those odds 
> >would be done by allowing an additional category of Gatherers, not by 
> >prohibiting it? 
> 
> No, that would be "moving the goalposts" i.e. making Gatherers specified more 
> loosely. Developers will write the code that they write, but if something 
> isn't behaving as expected, it is important to know which side to debug—the 
> library or the user code.
> 
> > I've written a `concat` Gatherer being blissfully unaware that it was not 
> >compliant, others have written non-reusable Gatherers as well: they exist 
> >and things like `concat` and `zip` are natural/intuitive use cases for 
> >Gatherers. 
> 
> The ability for developers to implement interfaces (knowingly or unknowingly) 
> in non-spec-conforming ways aside, are you suggesting that we add a section 
> in the Gatherer (and Collector) in the form of "Implementor Notes" that are a 
> bit more high-level than the specification?
> 
> Cheers,
> 
> √
> 
> **Viktor Klang**
> Software Architect, Java Platform Group
> 
> Oracle
> 
> ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
> 
> **From:** Anthony Vanelverdinghe <d...@anthonyv.be>
> **Sent:** Thursday, 19 September 2024 20:57
> **To:** Viktor Klang <viktor.kl...@oracle.com>; core-libs-dev@openjdk.org 
> <core-libs-dev@openjdk.org>
> **Subject:** [External] : Re: Stream Gatherers (JEP 473) feedback
>  
> 
> Hi Viktor
> 
> > Alas, there's no place where this could be enforced, users could have their 
> > own implementations of Stream (so cannot be enforced in Stream::gather).
> 
> The idea is to collect feedback, to see how many people report their 
> Gatherers being broken (i.e. their Gatherers being non-compliant without 
> realizing it), so enforcing it in `Stream::gather` is sufficient for this 
> purpose.
> 
> This hasn't come up before because it requires people (a) to read the 
> Javadoc, (b) to connect the dots and conclude "thus, a Gatherer must be 
> reusable", and (c) to be willing to invest their time in asking the question, 
> rather than moving on since their Gatherers "just work".
> 
> > java.util.stream.Stream does not explain the rationale for why it is 
> > single-use, and Collector does not explain why it is reusable, so why 
> > would Gatherers be held to a different standard?
> 
> For `Stream` the package Javadoc has statements like "No storage. A stream is 
> not a data structure" and "Possibly unbounded.", which is sufficient 
> rationale to me.
> 
> For `Collector`, unless I'm missing something, it does not actually specify 
> that it must be reusable, so it does not have to provide a rationale for it 
> either. Even if I did miss something and reusability is implied from the 
> specification: the question would likely never come up, because a Collector 
> will in practice always be reusable anyway (read: I can't readily think of a 
> sensible non-reusable Collector). This is unlike Gatherer, where some obvious 
> use cases such as `concat` and `zip` exist and people like me wonder why such 
> use cases are, apparently needlessly, prohibited by the Gatherer 
> specification.
> 
> > Think of it more like increasing the odds that users are given 
> > spec-conformant Gatherers.
> 
> Not sure I understand this argument? I'd argue that increasing those odds 
> would be done by allowing an additional category of Gatherers, not by 
> prohibiting it? I've written a `concat` Gatherer being blissfully unaware 
> that it was not compliant, others have written non-reusable Gatherers as 
> well: they exist and things like `concat` and `zip` are natural/intuitive use 
> cases for Gatherers. Gunnar wrote a blog post 
> [https://www.morling.dev/blog/zipping-gatherer/] about his `zip` Gatherer 
> saying "Java 22 [...] promises to improve the 
> situation here." and none of his readers pointed out that his Gatherer is not 
> compliant either (nor complained that his Gatherer is not reusable).
> 
> Kind regards, Anthony
> 
> September 19, 2024 at 11:30 AM, "Viktor Klang" <viktor.kl...@oracle.com> 
> wrote:
> 
> >
> 
> > Hi Anthony,
> 
> >
> 
> > Bear with me for a moment,
> 
> >
> 
> > in the same vein as there's nothing which *enforces* equals(…) or 
> > hashCode() to be conformant to their specs, or any interface-implementation 
> > for that matter, I don't see how we could make any stronger enforcement of 
> > Gatherers.
> 
> >
> 
> > >My belief is that the subject of reusability hasn't come up before because 
> > >non-reusable Gatherers "just work": as long as instances of such Gatherers 
> > >are not reused, they don't lead to unexpected results or observable 
> > >differences in behavior. And so people have been implementing non-reusable 
> > >Gatherers such as `concat` and `zip` without realizing they aren't 
> > >compliant. Or maybe they did realize it, but didn't see the downside of 
> > >being non-compliant.
> 
> >
> 
> > Alas, there's no place where this could be enforced, users could have their 
> > own implementations of Stream (so cannot be enforced in Stream::gather). 
> > Ultimately, it all boils down to specification—if an equals(…)-method 
> > implementation leads to surprising behavior when used with a collection, 
> > one typically needs to first ensure that the equals(…)-method conforms to 
> > its expected specification before one can presume that the collection has a 
> > bug.
> 
> >
> 
> > For the "just work"—scenario, one can only make claims about things which 
> > have been proven. So in this case, what tests have passed for the 
> > implementation in question?
> 
> >
> 
> > >Which brings me to my next point: in case of (b), the Javadoc and/or JEP 
> > >should explain the rationale. Even to me it still seems like a needless 
> > >restriction. 
> 
> > java.util.stream.Stream does not explain the rationale for why it is 
> > single-use, and Collector does not explain why it is reusable, so why 
> > would Gatherers be held to a different standard?
> 
> >
> 
> > > "protecting the users from being given non-reusable Gatherers"
> 
> > Think of it more like increasing the odds that users are given 
> > spec-conformant Gatherers.
> 
> >
> 
> > Cheers,
> 
> >
> 
> > √
> 
> >
> 
> > **Viktor Klang**
> 
> > Software Architect, Java Platform Group
> 
> >
> 
> > Oracle
> 
> >
> 
> > ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
> 
> >
> 
> > **From:** Anthony Vanelverdinghe <d...@anthonyv.be>
> 
> > **Sent:** Wednesday, 18 September 2024 18:27
> 
> > **To:** Viktor Klang <viktor.kl...@oracle.com>; core-libs-dev@openjdk.org 
> > <core-libs-dev@openjdk.org>
> 
> > **Subject:** [External] : Re: Stream Gatherers (JEP 473) feedback
> 
> >  
> 
> >
> 
> > Hi Viktor
> 
> >
> 
> > Let me start with a question: is the requirement (a) "a Gatherer SHOULD be 
> > reusable", or (b) "a Gatherer MUST be reusable"?
> 
> >
> 
> > As of today the specification says (b), whereas the implementation matches 
> > (a).
> 
> >
> 
> > In case of (a), I propose to align the specification to allow for 
> > compliant, non-reusable Gatherers.
> 
> >
> 
> > In case of (b), I propose to align the implementation to enforce 
> > compliance. Something like:
> 
> >
> 
> > (1) invoke `initializer()` twice, giving `i1` and `i2`. Discard `i1` and 
> > invoke `i2` twice, giving `state1` and `state2`.
> 
> >
> 
> > (2) invoke `finisher()` twice, giving `f1` and `f2`. Discard `f1` and 
> > invoke `f2` twice, the first time with `state1` and a dummy Downstream, the 
> > second time with the actual final state, i.e. `state2` after all elements 
> > were integrated, and the actual Downstream.
> 
> >
> 
> > Then backport this change to JDK 23 & 22 and/or do another round of preview 
> > in JDK 24.
> 
> >
> 
> > I'm confident that enforcing compliance would result in significant amounts 
> > of feedback questioning the requirement.
> 
> >
> 
> > My belief is that the subject of reusability hasn't come up before because 
> > non-reusable Gatherers "just work": as long as instances of such Gatherers 
> > are not reused, they don't lead to unexpected results or observable 
> > differences in behavior. And so people have been implementing non-reusable 
> > Gatherers such as `concat` and `zip` without realizing they aren't 
> > compliant. Or maybe they did realize it, but didn't see the downside of 
> > being non-compliant.
> 
> >
> 
> > Which brings me to my next point: in case of (b), the Javadoc and/or JEP 
> > should explain the rationale. Even to me it still seems like a needless 
> > restriction. You say:
> 
> >
> 
> > > And I think the worst of all worlds would be a scenario where you, as a 
> > > user, are given a Gatherer<X,Y,Z> and you have no idea whether you can 
> > > re-use it or not.
> 
> >
> 
> > so I'd guess the rationale is "protecting the users from being given 
> > non-reusable Gatherers".
> 
> >
> 
> > However, I can't readily think of a situation where this would be essential.
> 
> >
> 
> > If a user creates a Gatherer by invoking a factory method, the factory 
> > method can specify whether its result is reusable.
> 
> >
> 
> > And if a user is given a Gatherer as a method argument, and they need the 
> > Gatherer to be reusable, they could change the parameter to a 
> > `Supplier<Gatherer>` instead.
> 
> >
> 
> > > >In a previous response you proposed using `Gatherer 
> > > >concat(Supplier<Stream<T>>)` instead, but then I'd just pass `() -> 
> > > >aStream`, wonder why the parameter isn't just a `Stream<T>`, and the 
> > > >Gatherer would still not be reusable.
> 
> >
> 
> > >
> 
> >
> 
> > > There's a very important, to me, difference between the two. In the 
> > > Stream-case, there exists 0 reusable usages. For the 
> > > Supplier<Stream>-case the implementation does not restrict re-usability, 
> > > but rather it is up to the caller to actively opt-out of reusability 
> > > (which could of course also be declared to be undefined behavior of the 
> > > implementor of said Gatherer). Local non-reusability decided by the 
> > > caller > Global non-reusability decided by the callee.
> 
> >
> 
> > We agree, just that I'd provide 2 factory methods, `concat(Stream<T>)` 
> > (non-reusable) and `append(List<T>)` (reusable), whereas you'd provide a 
> > 2-in-1 `concat(Supplier<Stream<T>>)`.
> 
> >
> 
> > Kind regards, Anthony
> 
> >
> 
> > September 12, 2024 at 11:55 PM, "Viktor Klang" <viktor.kl...@oracle.com> 
> > wrote:
> 
> >
> 
> > >
> 
> >
> 
> > > Hi Anthony
> 
> >
> 
> > >
> 
> >
> 
> > > Great questions! I had typed up a long response when my email client 
> > > decided the email was too large, crashed, and deleted my draft, so I'll 
> > > try to recreate what I wrote from memory.
> 
> >
> 
> > >
> 
> >
> 
> > > >While I understand that most Gatherers will be reusable, and that it's a 
> > > >desirable characteristic, surely there will also be non-reusable 
> > > >Gatherers?
> 
> >
> 
> > >
> 
> >
> 
> > > To me, this is governed by the following parts of the Gatherer 
> > > specification 
> > > (https://docs.oracle.com/en/java/javase/22/docs/api/java.base/java/util/stream/Gatherer.html):
> 
> >
> 
> > >
> 
> >
> 
> > > "Each invocation of initializer(), integrator(), combiner(), and 
> > > finisher() must return a semantically identical result."
> 
> >
> 
> > >
> 
> >
> 
> > > and
> 
> >
> 
> > >
> 
> >
> 
> > > "Implementations of Gatherer must not capture, retain, or expose to other 
> > > threads, the references to the state instance, or the downstream 
> > > Gatherer.Downstream for longer than the invocation duration of the method 
> > > which they are passed to."
> 
> >
> 
> > >
> 
> >
> 
> > > And I think the worst of all worlds would be a scenario where you, as a 
> > > user, are given a Gatherer<X,Y,Z> and you have no idea whether you can 
> > > re-use it or not.
> 
> >
> 
> > >
> 
> >
> 
> > > For Stream, the assumption is that they are NOT reusable at all.
> 
> >
> 
> > > For Gatherer, I think the only reasonable assumption is that they are 
> > > reusable.
> 
> >
> 
> > >
> 
> >
> 
> > > >In particular, any Gatherer that is the result of a factory method with 
> > > >a `Stream<T>` parameter which supports infinite Streams, will be 
> > > >non-reusable, won't it?
> 
> >
> 
> > >
> 
> >
> 
> > > Not necessarily, if the factory method **consumes** the Stream and 
> > > creates a stable result which is reusable, then the resulting Gatherer is 
> > > reusable.
> 
> >
> 
> > >
> 
> >
> 
> > > >In a previous response you proposed using `Gatherer 
> > > >concat(Supplier<Stream<T>>)` instead, but then I'd just pass `() -> 
> > > >aStream`, wonder why the parameter isn't just a `Stream<T>`, and the 
> > > >Gatherer would still not be reusable.
> 
> >
> 
> > >
> 
> >
> 
> > > There's a very important, to me, difference between the two. In the 
> > > Stream-case, there exists 0 reusable usages. For the 
> > > Supplier<Stream>-case the implementation does not restrict re-usability, 
> > > but rather it is up to the caller to actively opt-out of reusability 
> > > (which could of course also be declared to be undefined behavior of the 
> > > implementor of said Gatherer). Local non-reusability decided by the 
> > > caller > Global non-reusability decided by the callee.
> 
> >
> 
> > >
> 
> >
> 
> > > >As another example, take Gunnar Morling's zip Gatherers: 
> 
> >
> 
> > >
> 
> >
> 
> > > I don't see how Gatherers like this could be made reusable, or why that 
> > > would even be desirable.
> 
> >
> 
> > >
> 
> >
> 
> > > Having been R&D-ing in the Stream-space for more than a decade, I'm 
> > > convinced that there's no universally safe way to implement `zip` for 
> > > push-style stream designs. I'm happy to be proven wrong though, as that 
> > > would open up some interesting possibilities for things like 
> > > Stream::iterator() and Stream::spliterator().
> 
> >
> 
> > >
> 
> >
> 
> > > >My use case was about a pipeline where the concatenation comes somewhere 
> > > >in the middle of the pipeline.
> 
> >
> 
> > >
> 
> >
> 
> > > My apologies, I misunderstood. To me, the operation you describe is 
> > > called `inject`.
> 
> >
> 
> > > Given a stable (reusable) source of elements you can definitely implement 
> > > Gatherers which do before, during, or after-injections of elements to a 
> > > stream.
> 
> >
> 
> > >
> 
> >
> 
> > > Thanks again for the great questions and conversation, it's valuable!
> 
> >
> 
> > > Cheers,
> 
> >
> 
> > >
> 
> >
> 
> > > √
> 
> >
> 
> > >
> 
> >
> 
> > > **Viktor Klang**
> 
> >
> 
> > > Software Architect, Java Platform Group
> 
> >
> 
> > >
> 
> >
> 
> > > Oracle
> 
> >
> 
> > >
> 
> >
>
