Hi Fabian,

Thanks for your patience. It took me a while to swap my thoughts on the matter 
back in, as I had been considering this JBS issue while working on Gatherers 
(JEP 461, JEP 473, JEP 485).

While it may look enticing to merely propagate the expected element count as an 
input parameter to the supplier function, I think it deserves some extra 
thought, specifically whether it would make more sense to pass some sort of 
StreamInfo type which can provide more metadata in the future.
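
To make the second option concrete, here is a rough sketch, purely as an 
illustration. StreamInfo, expectedSize() and presizedListSupplier() are 
hypothetical names I am assuming for the example; none of them is an existing 
or proposed JDK API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalLong;
import java.util.function.Function;

class StreamInfoSketch {
    // Hypothetical metadata carrier (not a JDK type). More accessors could be
    // added later without changing the supplier's signature again.
    interface StreamInfo {
        OptionalLong expectedSize();
    }

    // Hypothetical "informed" supplier: builds the container from stream metadata.
    static <T> Function<StreamInfo, List<T>> presizedListSupplier() {
        return info -> {
            long n = info.expectedSize().orElse(-1L);
            // Fall back to the default capacity when the size is unknown or too large.
            return (n >= 0 && n <= Integer.MAX_VALUE - 8)
                    ? new ArrayList<T>((int) n)
                    : new ArrayList<T>();
        };
    }

    public static void main(String[] args) {
        Function<StreamInfo, List<String>> supplier = presizedListSupplier();
        List<String> sized = supplier.apply(() -> OptionalLong.of(1_000));
        List<String> unsized = supplier.apply(OptionalLong::empty);
        sized.add("a");
        unsized.add("b");
        System.out.println(sized + " " + unsized);
    }
}
```

Compared with a plain int or long parameter, a carrier type like this could 
grow new accessors without another signature change, at the cost of one more 
type in the API surface.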

Another open question is how to propagate this information through Gatherers 
(i.e. a bigger scope than Collector-augmentation) to enable more sophisticated 
optimizations, because ultimately the availability of the information 
throughout the pipeline is going to be important for Collector.


Cheers,
√


Viktor Klang
Software Architect, Java Platform Group
Oracle
________________________________
From: core-libs-dev <core-libs-dev-r...@openjdk.org> on behalf of Fabian 
Meumertzheim <fab...@buildbuddy.io>
Sent: Wednesday, 12 February 2025 11:09
To: core-libs-dev@openjdk.org <core-libs-dev@openjdk.org>
Subject: JDK-8072840: Presizing for Stream Collectors

As an avid user of Guava's ImmutableCollections, I have been
interested in ways to close the efficiency gap between the built-in
`Stream#toList()` and third-party `Collector` implementations such as
`ImmutableList#toImmutableList()`. I've found the biggest problem to
be the lack of sizing information in `Collector`s, which led me to
draft a solution to JDK-8072840:
https://github.com/openjdk/jdk/pull/23461

The benchmark shows pretty significant gains for sized streams that
mostly reshape data (e.g. slice records or turn a list into a map by
associating keys), which I've found to be a pretty common use case.
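
For illustration only (this is not code from the PR, and `PresizeSketch` and
`lengthsByWord` are made-up names), here is a hand-written version of what
presizing means for the list-to-map case. It reads the exact size from the
stream's spliterator, which a `Collector` currently cannot do:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Spliterator;

class PresizeSketch {
    // Turns a list into a map, sizing the HashMap up front when the size is known.
    static Map<String, Integer> lengthsByWord(List<String> words) {
        Spliterator<String> sp = words.stream().spliterator();
        long n = sp.getExactSizeIfKnown(); // -1 if the source is not SIZED
        // Capacity chosen so that n entries fit under the default 0.75 load factor.
        Map<String, Integer> result = (n >= 0 && n < (1 << 30))
                ? new HashMap<String, Integer>((int) (n / 0.75f) + 1)
                : new HashMap<String, Integer>();
        sp.forEachRemaining(w -> result.put(w, w.length()));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(lengthsByWord(List.of("a", "bb", "ccc")));
    }
}
```

With a known size the map never has to rehash while it is being filled, which
is the kind of saving the benchmark is measuring.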

Before I formally send out the PR for review, I would like to gather
feedback on the design aspects of it (rather than the exact
implementation). I will thus leave it in draft mode for now, but
invite anyone to comment on it or on this thread.

Fabian
