> I could see that being useful for properties such as non-nullness,
> which would allow collections such as ImmutableList to skip the null
> check in the end.

I'm thinking of things like ordered/unordered, whether the stream is parallel
or not (one might want a different representation for a sequential stream), etc.
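
To make the idea concrete, here is a minimal sketch of what such a metadata
type could look like. Everything in it is hypothetical: `StreamInfo`, its
components, and `initialCapacity` are illustrative names, not existing or
proposed JDK API, and the capacity cap is an arbitrary choice.

```java
import java.util.OptionalLong;

// Hypothetical sketch: StreamInfo is not a JDK type; all names are illustrative.
public class StreamInfoSketch {

    /** Metadata a pipeline could hand to a Collector's supplier before any element arrives. */
    public record StreamInfo(OptionalLong exactSize, boolean ordered, boolean parallel) {}

    private static final int DEFAULT_CAPACITY = 10;
    private static final int MAX_PRESIZE = 1 << 20; // arbitrary cap for this sketch

    /** Picks an initial capacity: the known size (capped), or a default when the size is unknown. */
    public static int initialCapacity(StreamInfo info) {
        long size = info.exactSize().orElse(DEFAULT_CAPACITY);
        return (int) Math.max(0, Math.min(size, MAX_PRESIZE));
    }

    public static void main(String[] args) {
        StreamInfo sized = new StreamInfo(OptionalLong.of(1_000), true, false);
        StreamInfo unsized = new StreamInfo(OptionalLong.empty(), false, true);
        System.out.println(initialCapacity(sized));   // 1000
        System.out.println(initialCapacity(unsized)); // 10
    }
}
```

A record like this could grow new components (non-nullness, for instance)
without changing the shape of the supplier that receives it, which a plain
long cannot do.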

> Do you think that there could be a need to pass stream information to
> anything other than the Gatherer's state initializer? Based on a
> cursory glance, it looks straightforward to pass the same info to it
> as to the Collector. If that's true and we go with a more extensible
> design than a plain long, Gatherers could be opted in in follow-up
> work.

It's more involved than that: because Gatherers produce output, it would be
necessary to devise a scheme which allows Gatherers to communicate upper and
lower bounds on their output. This information would then need to be threaded
through the chain of gatherers and emerge on the other side. That is harder
than just communicating characteristics, since the bounds derive from the
stream and not merely from the operation itself.
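
As a rough illustration of what threading bounds through a chain might
involve, the sketch below models each stage as a function from input bounds to
output bounds. `SizeBounds` and the stage factories are hypothetical names
invented for this example; nothing like them exists on `Gatherer` today. The
fixed-window stage assumes a gatherer that emits one window per `windowSize`
inputs, plus one for any remainder.

```java
import java.util.List;
import java.util.OptionalLong;
import java.util.function.UnaryOperator;

// Hypothetical sketch: none of these types are part of java.util.stream.
public class BoundsSketch {

    /** Lower and (optional) upper bound on the number of elements flowing past a point. */
    public record SizeBounds(long lower, OptionalLong upper) {
        public static SizeBounds exact(long n) {
            return new SizeBounds(n, OptionalLong.of(n));
        }
    }

    /** A filter-like stage may drop everything, so only the upper bound survives. */
    public static UnaryOperator<SizeBounds> filtering() {
        return in -> new SizeBounds(0, in.upper());
    }

    /** A stage that groups inputs into windows of a fixed size shrinks both bounds. */
    public static UnaryOperator<SizeBounds> fixedWindows(int windowSize) {
        return in -> new SizeBounds(
                ceilDiv(in.lower(), windowSize),
                in.upper().isPresent()
                        ? OptionalLong.of(ceilDiv(in.upper().getAsLong(), windowSize))
                        : OptionalLong.empty());
    }

    private static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    /** Threads the source bounds through every stage, in pipeline order. */
    public static SizeBounds propagate(SizeBounds source, List<UnaryOperator<SizeBounds>> stages) {
        SizeBounds current = source;
        for (UnaryOperator<SizeBounds> stage : stages) {
            current = stage.apply(current);
        }
        return current;
    }
}
```

Note that the result depends on the source's size, not only on the stages:
the same two stages applied to a source of 10 elements and a source of
unknown size yield different bounds, which is what separates this from
static characteristics.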

Cheers,
√


Viktor Klang
Software Architect, Java Platform Group
Oracle

________________________________
From: Fabian Meumertzheim <fab...@buildbuddy.io>
Sent: Thursday, 13 February 2025 17:11
To: Viktor Klang <viktor.kl...@oracle.com>
Cc: core-libs-dev@openjdk.org <core-libs-dev@openjdk.org>
Subject: [External] : Re: JDK-8072840: Presizing for Stream Collectors

On Thu, Feb 13, 2025 at 3:06 PM Viktor Klang <viktor.kl...@oracle.com> wrote:
> While it may look enticing to merely propagate expected element count as an 
> input parameter to the supplier function,
> I think it deserves some extra thought, specifically if it may make more 
> sense to pass some sort of StreamInfo type which can provide more metadata in 
> the future.

I could see that being useful for properties such as non-nullness,
which would allow collections such as ImmutableList to skip the null
check in the end.

> Another open question is how to propagate this information through Gatherers 
> (i.e. a bigger scope than Collector-augmentation) to enable more 
> sophisticated optimizations—because ultimately the availability of the 
> information throughout the pipeline is going to be important for Collector.

Do you think that there could be a need to pass stream information to
anything other than the Gatherer's state initializer? Based on a
cursory glance, it looks straightforward to pass the same info to it
as to the Collector. If that's true and we go with a more extensible
design than a plain long, Gatherers could be opted in in follow-up
work.

Best,
Fabian

>
>
> Cheers,
> √
>
>
> Viktor Klang
> Software Architect, Java Platform Group
> Oracle
> ________________________________
> From: core-libs-dev <core-libs-dev-r...@openjdk.org> on behalf of Fabian 
> Meumertzheim <fab...@buildbuddy.io>
> Sent: Wednesday, 12 February 2025 11:09
> To: core-libs-dev@openjdk.org <core-libs-dev@openjdk.org>
> Subject: JDK-8072840: Presizing for Stream Collectors
>
> As an avid user of Guava's ImmutableCollections, I have been
> interested in ways to close the efficiency gap between the built-in
> `Stream#toList()` and third-party `Collector` implementations such as
> `ImmutableList#toImmutableList()`. I've found the biggest problem to
> be the lack of sizing information in `Collector`s, which led me to
> draft a solution to JDK-8072840:
> https://github.com/openjdk/jdk/pull/23461
>
> The benchmark shows pretty significant gains for sized streams that
> mostly reshape data (e.g. slice records or turn a list into a map by
> associating keys), which I've found to be a pretty common use case.
>
> Before I formally send out the PR for review, I would like to gather
> feedback on the design aspects of it (rather than the exact
> implementation). I will thus leave it in draft mode for now, but
> invite anyone to comment on it or on this thread.
>
> Fabian
