On Thu, Feb 13, 2025 at 3:06 PM Viktor Klang <viktor.kl...@oracle.com> wrote:
> While it may look enticing to merely propagate expected element count as an 
> input parameter to the supplier function,
> I think it deserves some extra thought, specifically if it may make more 
> sense to pass some sort of StreamInfo type which can provide more metadata in 
> the future.

I could see that being useful for properties such as non-nullness,
which would allow collections such as ImmutableList to skip the null
check at the end.
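
To make this concrete, here is a minimal sketch of what an info-aware
supplier and finisher could look like. Everything in it is an
assumption for illustration: the `StreamInfo` record, its
`expectedSize` and `knownNonNull` components, and the helper methods
are hypothetical names that exist neither in the JDK nor in the draft
PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

public class StreamInfoSketch {

    /** Hypothetical metadata carrier; not an existing JDK type. */
    record StreamInfo(long expectedSize, boolean knownNonNull) {
        static final long UNKNOWN_SIZE = -1;

        static StreamInfo unknown() {
            return new StreamInfo(UNKNOWN_SIZE, false);
        }
    }

    /** Hypothetical info-aware counterpart to Collector.supplier(). */
    static <T> Function<StreamInfo, ArrayList<T>> presizingSupplier() {
        return info -> {
            long n = info.expectedSize();
            // Presize only when the count is known and fits into an array;
            // otherwise fall back to ArrayList's default growth strategy.
            if (n >= 0 && n <= Integer.MAX_VALUE - 8) {
                return new ArrayList<T>((int) n);
            }
            return new ArrayList<T>();
        };
    }

    /** Finisher that skips the null scan only if the stream promised non-null elements. */
    static <T> List<T> finish(StreamInfo info, ArrayList<T> acc) {
        if (!info.knownNonNull()) {
            for (T element : acc) {
                Objects.requireNonNull(element);
            }
        }
        return Collections.unmodifiableList(acc);
    }

    public static void main(String[] args) {
        StreamInfo info = new StreamInfo(3, true);
        ArrayList<String> acc = StreamInfoSketch.<String>presizingSupplier().apply(info);
        acc.add("a");
        acc.add("b");
        acc.add("c");
        System.out.println(finish(info, acc));
    }
}
```

A record (or interface) rather than a plain long leaves room to add
further properties later without changing the supplier's signature
again.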

> Another open question is how to propagate this information through Gatherers 
> (i.e. a bigger scope than Collector-augmentation) to enable more 
> sophisticated optimizations—because ultimately the availability of the 
> information throughout the pipeline is going to be important for Collector.

Do you think that there could be a need to pass stream information to
anything other than the Gatherer's state initializer? Based on a
cursory glance, it looks straightforward to pass the same info to it
as to the Collector. If that's true and we go with a more extensible
design than a plain long, Gatherers could opt in as part of follow-up
work.

Best,
Fabian

>
>
> Cheers,
> √
>
>
> Viktor Klang
> Software Architect, Java Platform Group
> Oracle
> ________________________________
> From: core-libs-dev <core-libs-dev-r...@openjdk.org> on behalf of Fabian 
> Meumertzheim <fab...@buildbuddy.io>
> Sent: Wednesday, 12 February 2025 11:09
> To: core-libs-dev@openjdk.org <core-libs-dev@openjdk.org>
> Subject: JDK-8072840: Presizing for Stream Collectors
>
> As an avid user of Guava's ImmutableCollections, I have been
> interested in ways to close the efficiency gap between the built-in
> `Stream#toList()` and third-party `Collector` implementations such as
> `ImmutableList#toImmutableList()`. I've found the biggest problem to
> be the lack of sizing information in `Collector`s, which led me to
> draft a solution to JDK-8072840:
> https://github.com/openjdk/jdk/pull/23461
>
> The benchmark shows pretty significant gains for sized streams that
> mostly reshape data (e.g. slice records or turn a list into a map by
> associating keys), which I've found to be a pretty common use case.
>
> Before I formally send out the PR for review, I would like to gather
> feedback on the design aspects of it (rather than the exact
> implementation). I will thus leave it in draft mode for now, but
> invite anyone to comment on it or on this thread.
>
> Fabian
