Hi Anthony,

Thank you for your patience and for providing feedback; it is always much appreciated.

> When writing factory methods for Gatherers, there's sometimes a degenerate case that requires returning a no-op Gatherer. So I'd like a way to mark a no-op Gatherer as such, allowing the Stream implementation to recognize and eliminate it from the pipeline. One idea is to add Gatherer.defaultIntegrator(), analogous to the other default… methods. Another is to add Gatherers.identity(), analogous to Function.identity().

I contemplated adding that, but in the end I decided I didn't want to add it for the sake of adding it, but rather to add it if it was deemed necessary. Do you have a concrete use-case (code) that you could share?

> Sometimes a factory method returns a Gatherer that only works correctly if the upstream has certain characteristics, for example Spliterator.SORTED or Spliterator.DISTINCT.

Do you have a concrete use-case (code) that you could share?

> One idea is to add methods like Gatherers.sorted() and Gatherers.distinct(), where the Stream implementation would be able to recognize and eliminate these from the pipeline if the upstream already has these characteristics. That way we'd be able to write `return Gatherers.sorted().andThen(…);`. Another idea is to provide a Gatherer with a way to inspect the upstream characteristics. If the upstream is missing the required characteristic(s), it could then throw an IllegalStateException.

For a rather long time Gatherer had characteristics. However, what I noticed is that, given composition of Gatherers, what almost always ended up happening was that the combination of characteristics added overhead and devolved into the empty set real fast.

Also, when it comes to things like sorted() and distinct(), they (by necessity) have to get processed in full before emitting anything downstream, which creates a lot of extra memory allocation, and they don't lend themselves all that well to any depth-first streaming.
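
To make the buffering point concrete, here is a minimal sketch of what a sorting Gatherer would have to look like. The names `BufferingSortSketch` and `sortedSketch()` are hypothetical and are not part of `java.util.stream.Gatherers`; the sketch assumes JDK 24 or later (or JDK 22/23 with `--enable-preview`). The integrator can only collect elements, and nothing reaches downstream until the finisher runs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

class BufferingSortSketch {
    // Hypothetical helper, not a JDK method: sorts by natural order.
    static <T extends Comparable<? super T>> Gatherer<T, ?, T> sortedSketch() {
        return Gatherer.<T, ArrayList<T>, T>ofSequential(
            // State: a buffer that must hold every upstream element.
            ArrayList::new,
            // Integrator: only buffers, never pushes downstream.
            Gatherer.Integrator.<ArrayList<T>, T, T>ofGreedy(
                (buffer, element, downstream) -> {
                    buffer.add(element);
                    return true;
                }),
            // Finisher: the first point at which output can be emitted.
            (buffer, downstream) -> {
                Collections.sort(buffer);
                for (T element : buffer) {
                    if (!downstream.push(element)) {
                        break; // downstream wants no more elements
                    }
                }
            });
    }

    public static void main(String[] args) {
        List<Integer> result = Stream.of(3, 1, 2).gather(sortedSketch()).toList();
        System.out.println(result); // [1, 2, 3]
    }
}
```

The whole input sits in the buffer before the first `push`, which is the extra allocation mentioned above and the reason such a stage cannot stream depth-first.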

> The returns clause of Gatherer.Integrator::integrate just states "true if subsequent integration is desired, false if not". In particular, it doesn't document the behavior I'm observing, that returning false also causes downstream to reject any further output elements.

Do you have a test case? (There was a bug fixed in this area after 22 was released, so you may want to test it on a 23-ea.)

Cheers,
√

Viktor Klang
Software Architect, Java Platform Group
Oracle

________________________________
From: core-libs-dev <core-libs-dev-r...@openjdk.org> on behalf of Anthony Vanelverdinghe <d...@anthonyv.be>
Sent: Saturday, 27 July 2024 08:57
To: core-libs-dev@openjdk.org <core-libs-dev@openjdk.org>
Subject: Stream Gatherers (JEP 473) feedback

When writing factory methods for Gatherers, there's sometimes a degenerate case that requires returning a no-op Gatherer. So I'd like a way to mark a no-op Gatherer as such, allowing the Stream implementation to recognize and eliminate it from the pipeline. One idea is to add Gatherer.defaultIntegrator(), analogous to the other default… methods. Another is to add Gatherers.identity(), analogous to Function.identity().

Sometimes a factory method returns a Gatherer that only works correctly if the upstream has certain characteristics, for example Spliterator.SORTED or Spliterator.DISTINCT. One idea is to add methods like Gatherers.sorted() and Gatherers.distinct(), where the Stream implementation would be able to recognize and eliminate these from the pipeline if the upstream already has these characteristics. That way we'd be able to write `return Gatherers.sorted().andThen(…);`. Another idea is to provide a Gatherer with a way to inspect the upstream characteristics. If the upstream is missing the required characteristic(s), it could then throw an IllegalStateException.

The returns clause of Gatherer.Integrator::integrate just states "true if subsequent integration is desired, false if not". In particular, it doesn't document the behavior I'm observing, that returning false also causes downstream to reject any further output elements.

In the Implementation Requirements section of Gatherer, rephrasing "Outputs and state later in the input sequence will be discarded if processing an earlier partition short-circuits." to something like the following would be clearer to me: "As soon as any partition short-circuits, the whole Gatherer short-circuits. The state of other partitions is discarded, i.e. there are no further invocations of the combiner. The finisher is invoked with the short-circuiting partition's state." I wouldn't mention discarding of outputs, since that's implied by the act of short-circuiting.

Anthony
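
For readers following the discussion of the `integrate` return value, here is a minimal sketch of a short-circuiting integrator. `ShortCircuitSketch`, `Counter` and `takeFirst` are hypothetical names, not JDK methods, and the sketch assumes JDK 24 or later (or JDK 22/23 with `--enable-preview`). It only shows the documented contract, that returning false requests no further integration; it makes no claim about what a finisher can still push afterwards, which is the open question in this thread.

```java
import java.util.List;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

class ShortCircuitSketch {
    // Mutable per-evaluation state: how many elements were forwarded.
    static final class Counter {
        int forwarded;
    }

    // Hypothetical helper, not a JDK method: forwards at most n elements.
    static <T> Gatherer<T, ?, T> takeFirst(int n) {
        return Gatherer.<T, Counter, T>ofSequential(
            Counter::new,
            (counter, element, downstream) -> {
                if (counter.forwarded >= n) {
                    return false; // no subsequent integration desired
                }
                counter.forwarded++;
                // Stop early if downstream refuses more, or the quota is used up.
                return downstream.push(element) && counter.forwarded < n;
            });
    }

    public static void main(String[] args) {
        List<Integer> result = Stream.of(1, 2, 3, 4, 5).gather(takeFirst(3)).toList();
        System.out.println(result); // [1, 2, 3]
    }
}
```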