>I could see that being useful for properties such as non-nullness, which would allow collections such as ImmutableList to skip the null check in the end.
I'm thinking things like ordered/unordered, whether the stream is parallel or not (might want to use a different representation for a sequential stream), etc.

>Do you think that there could be a need to pass stream information to anything other than the Gatherer's state initializer? Based on a cursory glance, it looks straightforward to pass the same info to it as to the Collector. If that's true and we go with a more extensible design than a plain long, Gatherers could be opted in in follow-up work.

It's more involved than that: as Gatherers produce output, it would be necessary to devise a scheme which allows Gatherers to communicate upper and lower bounds on the output. This information would then need to be threaded through the chain of gatherers and emerge on the other side. This is slightly more involved than just communicating characteristics, since it is information based off of the stream and not merely the operation itself.

Cheers,
√

Viktor Klang
Software Architect, Java Platform Group
Oracle
________________________________
From: Fabian Meumertzheim <fab...@buildbuddy.io>
Sent: Thursday, 13 February 2025 17:11
To: Viktor Klang <viktor.kl...@oracle.com>
Cc: core-libs-dev@openjdk.org <core-libs-dev@openjdk.org>
Subject: [External] : Re: JDK-8072840: Presizing for Stream Collectors

On Thu, Feb 13, 2025 at 3:06 PM Viktor Klang <viktor.kl...@oracle.com> wrote:
> While it may look enticing to merely propagate expected element count as an
> input parameter to the supplier function,
> I think it deserves some extra thought, specifically if it may make more
> sense to pass some sort of StreamInfo type which can provide more metadata in
> the future.

I could see that being useful for properties such as non-nullness, which would allow collections such as ImmutableList to skip the null check in the end.

> Another open question is how to propagate this information through Gatherers
> (i.e. a bigger scope than Collector-augmentation) to enable more
> sophisticated optimizations, because ultimately the availability of the
> information throughout the pipeline is going to be important for Collector.

Do you think that there could be a need to pass stream information to anything other than the Gatherer's state initializer? Based on a cursory glance, it looks straightforward to pass the same info to it as to the Collector. If that's true and we go with a more extensible design than a plain long, Gatherers could be opted in in follow-up work.

Best,
Fabian

>
>
> Cheers,
> √
>
>
> Viktor Klang
> Software Architect, Java Platform Group
> Oracle
> ________________________________
> From: core-libs-dev <core-libs-dev-r...@openjdk.org> on behalf of Fabian
> Meumertzheim <fab...@buildbuddy.io>
> Sent: Wednesday, 12 February 2025 11:09
> To: core-libs-dev@openjdk.org <core-libs-dev@openjdk.org>
> Subject: JDK-8072840: Presizing for Stream Collectors
>
> As an avid user of Guava's ImmutableCollections, I have been
> interested in ways to close the efficiency gap between the built-in
> `Stream#toList()` and third-party `Collector` implementations such as
> `ImmutableList#toImmutableList()`. I've found the biggest problem to
> be the lack of sizing information in `Collector`s, which led me to
> draft a solution to JDK-8072840:
> https://urldefense.com/v3/__https://github.com/openjdk/jdk/pull/23461__;!!ACWV5N9M2RV99hQ!N-RbriJ93dED1WYLFxFZ4dD5oTx5wqPCPTmv4Oivm3IFJTHNwZ1v3d228Ifs8SdFJwcc7YZnCuNZXG9LmQ3ZCA4$
>
> The benchmark shows pretty significant gains for sized streams that
> mostly reshape data (e.g. slice records or turn a list into a map by
> associating keys), which I've found to be a pretty common use case.
>
> Before I formally send out the PR for review, I would like to gather
> feedback on the design aspects of it (rather than the exact
> implementation). I will thus leave it in draft mode for now, but
> invite anyone to comment on it or on this thread.
>
> Fabian
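
To make the point about threading bounds through a chain of operations more concrete, here is a rough Java sketch. Everything in it is hypothetical: `StreamInfoSketch`, `Bounds`, `StreamInfo`, `propagate` and `newList` are illustrative names only, and none of them exist in the JDK or in the draft PR. It only shows how each stage could adjust lower and upper bounds on its output, and how a supplier-like function might presize a container when the resulting size is exact.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical sketch: none of these types exist in the JDK.
final class StreamInfoSketch {

    /** Assumed bounds on the number of elements; upper == UNKNOWN means no known upper bound. */
    record Bounds(long lower, long upper) {
        static final long UNKNOWN = Long.MAX_VALUE;

        static Bounds exact(long n) { return new Bounds(n, n); }

        boolean isExact() { return upper != UNKNOWN && lower == upper; }
    }

    /** Assumed metadata object that could be passed instead of a plain long. */
    record StreamInfo(Bounds bounds, boolean ordered, boolean parallel) {}

    // Each stage describes how it transforms the bounds it receives.
    static UnaryOperator<Bounds> map() {
        return b -> b; // one output per input, bounds unchanged
    }

    static UnaryOperator<Bounds> filter() {
        return b -> new Bounds(0, b.upper()); // may drop any number of elements
    }

    static UnaryOperator<Bounds> limit(long n) {
        return b -> new Bounds(Math.min(b.lower(), n), Math.min(b.upper(), n));
    }

    /** Threads the source bounds through the stages in order. */
    @SafeVarargs
    static Bounds propagate(Bounds source, UnaryOperator<Bounds>... stages) {
        Bounds current = source;
        for (UnaryOperator<Bounds> stage : stages) {
            current = stage.apply(current);
        }
        return current;
    }

    /** Presizes only when the element count is exact and fits in an int. */
    static <T> List<T> newList(StreamInfo info) {
        Bounds b = info.bounds();
        if (b.isExact() && b.upper() <= Integer.MAX_VALUE - 8) {
            return new ArrayList<>((int) b.upper());
        }
        return new ArrayList<>();
    }
}
```

In this sketch, a source of exactly 100 elements followed by `map()` and `limit(10)` still yields exact bounds, while inserting `filter()` drops the lower bound to 0, so `newList` falls back to the default capacity. Whether to presize from an inexact upper bound is a separate trade-off that the sketch deliberately leaves out.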