On Fri, 10 Nov 2023 09:38:51 GMT, Tagir F. Valeev <tval...@openjdk.org> wrote:

>> Viktor Klang has updated the pull request incrementally with two additional 
>> commits since the last revision:
>> 
>>  - Addressing review feedback
>>  - Make Gatherer.andThen take a wildcard for the rhs Gatherer state type
>
> src/java.base/share/classes/java/util/stream/GathererOp.java line 301:
> 
>> 299:      * the output.  This is highly beneficial in the parallel case as 
>> stateful
>> 300:      * operations cannot be pipelined in the ReferencePipeline 
>> implementation.
>> 301:      * Overriding collect-operations overcomes this limitation.
> 
> Does this mean that .parallel().gather(myGatherer).map(smth).collect(..) will 
> be slower than 
> .parallel().gather(myGatherer.andThen(mappingGatherer(smth))).collect(..)? 
> Also, what about other terminals (e.g. reduce())? Will they require 
> processing all the upstream before reduction?

That depends on many factors, including the implementation of Stream
(parallel is advisory only), the nature of the source Spliterator, and the
implementations of `map` and `collect`, as well as environmental factors such
as the availability of memory, CPU, etc.

Currently, in this PR, GathererOp is a "stateful" operation (in the reference
implementation parlance), although consecutive `gather()` operations are
fused, and a `collect` after a `gather` is fused as well.
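
For concreteness, the two pipeline shapes from the question can be sketched
roughly as below. This is only an illustration of the shapes, not a benchmark
and not code from the PR: `mappingGatherer` is a hypothetical helper standing
in for the one named in the question, `SUM` and the class name are made up for
the example, and the sketch assumes a JDK where `java.util.stream.Gatherer` is
available without preview flags (JDK 24 or later).

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Gatherer;
import java.util.stream.Gatherers;

public class GatherFusionSketch {

    // Hypothetical helper (not part of the PR): a stateless, parallelizable
    // Gatherer that applies f to each element and pushes the result downstream.
    public static <T, R> Gatherer<T, ?, R> mappingGatherer(
            Function<? super T, ? extends R> f) {
        return Gatherer.<T, R>of(
                (state, element, downstream) -> downstream.push(f.apply(element)));
    }

    // Illustrative mapping function: sums the integers of one window.
    static final Function<List<Integer>, Integer> SUM =
            window -> window.stream().mapToInt(Integer::intValue).sum();

    // Shape 1: gather(...) followed by a separate map(...) stage.
    public static List<Integer> viaMap(List<Integer> input) {
        Gatherer<Integer, ?, List<Integer>> windows = Gatherers.windowFixed(2);
        return input.parallelStream()
                .gather(windows)
                .map(SUM)
                .collect(Collectors.toList());
    }

    // Shape 2: the mapping is composed into the Gatherer via andThen(...),
    // so the pipeline is a single gather(...) followed by collect(...).
    public static List<Integer> viaAndThen(List<Integer> input) {
        Gatherer<Integer, ?, List<Integer>> windows = Gatherers.windowFixed(2);
        Gatherer<List<Integer>, ?, Integer> summing = mappingGatherer(SUM);
        Gatherer<Integer, ?, Integer> composed = windows.andThen(summing);
        return input.parallelStream()
                .gather(composed)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = List.of(1, 2, 3, 4, 5, 6, 7, 8);
        System.out.println(viaMap(input));     // [3, 7, 11, 15]
        System.out.println(viaAndThen(input)); // [3, 7, 11, 15]
    }
}
```

Both shapes produce the same result; which one is faster is exactly the part
that depends on the factors listed above, so the sketch makes no claim about
relative performance.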

There is a combination of Gatherer characteristics under which a Gatherer
could be encoded as a Spliterator, and in that case it could conceivably be
treated as stateless. This is something that can be explored during the
Preview.

There are multiple microbenchmarks in this PR for those who are curious about 
current performance.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16420#discussion_r1389216765
