> From: "Viktor Klang" <viktor.kl...@oracle.com> > To: "Remi Forax" <fo...@univ-mlv.fr> > Cc: "core-libs-dev" <core-libs-...@openjdk.java.net>, "Paul Sandoz" > <psan...@openjdk.java.net> > Sent: Wednesday, January 24, 2024 2:34:11 PM > Subject: Re: Gatherer: spliterator characteristics are not propagated
> Presuming that you mean mutating the Gatherer such that its behavior isn't > stable, the difference (at least to me) is that creating such a mutable > Gatherer would violate the specification of Gatherer: [ > https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/stream/Gatherer.java#L63 > | > https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/stream/Gatherer.java#L63 > ] If there is a method characteristics, its spec will also indicates that a flag like STATELESS is only allowed but be used if the state is null. Not using STATELESS correctly will be also a violation of the spec. > Cheers, > √ regards, Rémi > Viktor Klang > Software Architect, Java Platform Group > Oracle > From: fo...@univ-mlv.fr <fo...@univ-mlv.fr> > Sent: Tuesday, 23 January 2024 21:04 > To: Viktor Klang <viktor.kl...@oracle.com> > Cc: core-libs-dev <core-libs-...@openjdk.java.net>; Paul Sandoz > <psan...@openjdk.java.net> > Subject: [External] : Re: Gatherer: spliterator characteristics are not > propagated >> From: "Viktor Klang" <viktor.kl...@oracle.com> >> To: "Remi Forax" <fo...@univ-mlv.fr> >> Cc: "core-libs-dev" <core-libs-...@openjdk.java.net>, "Paul Sandoz" >> <psan...@openjdk.java.net> >> Sent: Monday, January 22, 2024 10:06:27 PM >> Subject: Re: Gatherer: spliterator characteristics are not propagated >>> The flags are in sync with the implementation because the only way to >>> create a >>> Gatherer if through the factory methods and those factory methods (and only >>> them) compute the proper combination of SEQUENTIAL | STATELESS | GREEDY. A >>> user >>> should not be able to set those flags. Only the flags KEEP_* are settable >>> by a >>> user. >> But I presume this also requires to have a `int characteristics()`-method on >> the >> Gatherer interfacewhich means that users who are not using the factory >> methods >> will have full possibility of not only returning the flags, but returning any >> int. > The current implementation suffers the same kind of issue, it's easy to write > a > mutable Gatherer and change the functions after creation, worst, right in the > middle of a call to stream.gather(...). > Perhaps the Gatherer interface should be sealed ? We did not have that option > during the 1.8 timeframe, when the Collector API was created. >>> The stream implementation has a whole mechanism in place to >>> propagate/preverse >>> flags like SIZED, DISTINCT or SORTED. For me, discussing about the merit of >>> this mechanism seems a little off topic. I would prefer the Gatherer to be a >>> good citizen and works seemlessly with the other intermediary operations. >> I can see where you're coming from here, but to me, adding API surface needs >> to >> pull its weight. >> In this case I wasn't convinced that it did, hence we're having this >> conversation. \uD83D\uDE42 >> Cheers, >> √ > regards, > Rémi >> Viktor Klang >> Software Architect, Java Platform Group >> Oracle >> From: fo...@univ-mlv.fr <fo...@univ-mlv.fr> >> Sent: Monday, 22 January 2024 19:56 >> To: Viktor Klang <viktor.kl...@oracle.com> >> Cc: core-libs-dev <core-libs-...@openjdk.java.net>; Paul Sandoz >> <psan...@openjdk.java.net> >> Subject: [External] : Re: Gatherer: spliterator characteristics are not >> propagated >>> From: "Viktor Klang" <viktor.kl...@oracle.com> >>> To: "Remi Forax" <fo...@univ-mlv.fr> >>> Cc: "core-libs-dev" <core-libs-...@openjdk.java.net>, "Paul Sandoz" >>> <psan...@openjdk.java.net> >>> Sent: Monday, January 22, 2024 4:22:11 PM >>> Subject: Re: Gatherer: spliterator characteristics are not propagated >>> Hi Rémi, >> Hello, >>> For instance, stateless is neither recessive nor dominant, since the >>> composition >>> of two stateless operations is only ever stateless if they both are greedy >>> as >>> well: [ >>> https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/stream/Gatherers.java*L588__;Iw!!ACWV5N9M2RV99hQ!Lm52jd6kovd5t-cmrqqSLiRcIajBGXLxh85LO3eeiL6UxbKZuNPcUnO6z2i0FzMEoNr7U-cOBuWPCjo57FVW$ >>> | >>> https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/stream/Gatherers.java#L588 >>> ] >> Okay, so choosing SEQUENTIAL vs PARALELLIZABLE is not that important given >> that >> the combination is ad-hoc, reflecting the characterristics is enough. >>> So even if making it represented as ints (more like Spliterator, rather than >>> Collector) makes things faster, it's still both work to track, propagate, >>> and >>> also becomes a side-channel that needs to remain in sync with the actual >>> implementation of the logic. >> The flags are in sync with the implementation because the only way to create >> a >> Gatherer if through the factory methods and those factory methods (and only >> them) compute the proper combination of SEQUENTIAL | STATELESS | GREEDY. A >> user >> should not be able to set those flags. Only the flags KEEP_* are settable by >> a >> user. >>> One could argue that logic such as: someCollection.stream().map(…).count() >>> is a >>> performance bug/inefficiency in an of itself as it would be faster to do >>> someCollection.size(). >> The stream implementation has a whole mechanism in place to >> propagate/preverse >> flags like SIZED, DISTINCT or SORTED. For me, discussing about the merit of >> this mechanism seems a little off topic. I would prefer the Gatherer to be a >> good citizen and works seemlessly with the other intermediary operations. >> Cheers, >> √ >> Viktor Klang >> Software Architect, Java Platform Group >> Oracle >> From: fo...@univ-mlv.fr <fo...@univ-mlv.fr> >> Sent: Monday, 22 January 2024 19:56 >> To: Viktor Klang <viktor.kl...@oracle.com> >> Cc: core-libs-dev <core-libs-...@openjdk.java.net>; Paul Sandoz >> <psan...@openjdk.java.net> >> Subject: [External] : Re: Gatherer: spliterator characteristics are not >> propagated >>> From: "Viktor Klang" <viktor.kl...@oracle.com> >>> To: "Remi Forax" <fo...@univ-mlv.fr> >>> Cc: "core-libs-dev" <core-libs-...@openjdk.java.net>, "Paul Sandoz" >>> <psan...@openjdk.java.net> >>> Sent: Monday, January 22, 2024 4:22:11 PM >>> Subject: Re: Gatherer: spliterator characteristics are not propagated >>> Hi Rémi, >> Hello, >>> For instance, stateless is neither recessive nor dominant, since the >>> composition >>> of two stateless operations is only ever stateless if they both are greedy >>> as >>> well: [ >>> https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/stream/Gatherers.java*L588__;Iw!!ACWV5N9M2RV99hQ!Lm52jd6kovd5t-cmrqqSLiRcIajBGXLxh85LO3eeiL6UxbKZuNPcUnO6z2i0FzMEoNr7U-cOBuWPCjo57FVW$ >>> | >>> https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/stream/Gatherers.java#L588 >>> ] >> Okay, so choosing SEQUENTIAL vs PARALELLIZABLE is not that important given >> that >> the combination is ad-hoc, reflecting the characterristics is enough. >>> So even if making it represented as ints (more like Spliterator, rather than >>> Collector) makes things faster, it's still both work to track, propagate, >>> and >>> also becomes a side-channel that needs to remain in sync with the actual >>> implementation of the logic. >> The flags are in sync with the implementation because the only way to create >> a >> Gatherer if through the factory methods and those factory methods (and only >> them) compute the proper combination of SEQUENTIAL | STATELESS | GREEDY. A >> user >> should not be able to set those flags. Only the flags KEEP_* are settable by >> a >> user. >>> One could argue that logic such as: someCollection.stream().map(…).count() >>> is a >>> performance bug/inefficiency in an of itself as it would be faster to do >>> someCollection.size(). >> The stream implementation has a whole mechanism in place to >> propagate/preverse >> flags like SIZED, DISTINCT or SORTED. For me, discussing about the merit of >> this mechanism seems a little off topic. I would prefer the Gatherer to be a >> good citizen and works seemlessly with the other intermediary operations. >>> Cheers, >>> √ >> regards, >> Rémi >>> Viktor Klang >>> Software Architect, Java Platform Group >>> Oracle >>> From: fo...@univ-mlv.fr <fo...@univ-mlv.fr> >>> Sent: Saturday, 20 January 2024 17:40 >>> To: Viktor Klang <viktor.kl...@oracle.com> >>> Cc: core-libs-dev <core-libs-...@openjdk.java.net>; Paul Sandoz >>> <psan...@openjdk.java.net> >>> Subject: [External] : Re: Gatherer: spliterator characteristics are not >>> propagated >>>> From: "Viktor Klang" <viktor.kl...@oracle.com> >>>> To: "Remi Forax" <fo...@univ-mlv.fr> >>>> Cc: "core-libs-dev" <core-libs-...@openjdk.java.net>, "Paul Sandoz" >>>> <psan...@openjdk.java.net> >>>> Sent: Thursday, January 18, 2024 5:14:38 PM >>>> Subject: Re: Gatherer: spliterator characteristics are not propagated >>>>> And for A.andThen(B), A.flags & B.flags should work, the stream is sorted >>>>> if >>>>> both gatherers keep it sorted. >>>> That is unfortunately not the case. That would presume that you can >>>> implement >>>> the composition such that it can preserve all the common flags. Some flags >>>> could be "dominant" and some "recessive" to use genetics nomenclature. >>> Some flags of the stream pipeline are "recessive", mainly SHORT_CIRCUIT, >>> but not >>> the characteristics of the Gatherer which can have the corresponding >>> "dominant" >>> flag, GREEDY, in this case. >>> And the same for sequential, the flag should be PARALELIZABLE and not >>> SEQUENTIAL. >>> The idea is that the Gatherer characteristics can have the same bit set at >>> the >>> same position as the stream op flags (as defined by StreamOpFlag). >>> So KEEP_DISTINCT is in position 0, KEEP_SORTED in in position 2 and >>> KEEP_SIZED >>> is in position 3. >>> For GREEDY, we use the same position as SHORT_CIRCUIT and we will flip the >>> bit >>> (using XOR) when we want to extract the stream op flags from the >>> characteristics >>> All the factory methods call the generic of() with a combination of >>> PARALELIZABLE and STATELESS and the user can adds the characteristics >>> GREEDY, >>> KEEP_DISTINCT, KEEP_SORTED and KEEP_SIZED (otherwise an exception should be >>> raised). >>> In StreamOpFlag, there are two unused positions (14 and 15), that's perfect >>> for >>> our two new states PARALELIZABLE and STATELESS, so no problem here >>> (technically >>> we can also reuse positions of the Spliterator characteristic given that >>> those >>> flags are masked before being sent to the GathererOp super constructor). >>> The way to transform a Gatherer characteristics op to a stream flags op is >>> to >>> flip the bits corresponding to SHORT_CIRCUIT, add the highter bit of all >>> flags >>> but SHORT-CIRCUIT (because stream op flags are encoded using 2 bits) and >>> mask >>> to only retain SHORT_CIRCUIT, KEEP_DISTINCT, KEEP_SORTED and KEEP_SIZED. >>> public static int toOpFlags ( int characteristics ) { >>> return (( characteristics ^ SHORT_CIRCUIT_MASK ) | HIGHER_BITS ) & >>> STREAM_OP_MASK ; >>> } >>> see below for a full script. >>>>> I suppose that if those flags already exist, it's because they have a >>>>> purpose >>>>> and i do not understand how it can make the other operations slower. >>>> Extra invocations, extra storage, and extra composition overhead is not >>>> free. >>>> Since Stream is one-shot you need to include the construction cost with the >>>> execution cost. For something like an empty Stream construction cost scan >>>> be >>>> 90+% of the total costs. >>> If you create a Gatherer, the characteristics is a constant (so the validity >>> check is removed, it's just a mask and a test) so the result of calling >>> toOpFlags() is a constant too. >>> If the factory method is not inlined, the cost is 3 bitwise operations >>> which is >>> I believe faster than the "instanceof Greedy" used in the current code. >>>> Cheers, >>>> √ >>> regards, >>> Rémi >>> --- >>> public class GathererFlag { >>> // cut and paste from StreamOpFlag >>> /** >>> * The bit pattern for setting/injecting a flag. >>> */ >>> private static final int SET_BITS = 0b01 ; >>> /** >>> * The bit pattern for clearing a flag. >>> */ >>> private static final int CLEAR_BITS = 0b10 ; >>> /** >>> * The bit pattern for preserving a flag. >>> */ >>> private static final int PRESERVE_BITS = 0b11 ; >>> private static int position ( int opFlagSet ) { >>> return Integer . numberOfTrailingZeros ( opFlagSet ) >> 1 ; >>> } >>> private static final int DISTINCT_POSITION = position ( StreamOpFlag . >>> IS_DISTINCT ); >>> private static final int SORTED_POSITION = position ( StreamOpFlag . >>> IS_SORTED >>> ); >>> private static final int SIZED_POSITION = position ( StreamOpFlag . >>> IS_SIZED ); >>> private static final int SHORT_CIRCUIT_POSITION = position ( StreamOpFlag . >>> IS_SHORT_CIRCUIT ); >>> private static final int STATELESS_POSITION = 14 ; >>> private static final int PARELLIZABLE_POSITION = 15 ; >>> public static final int PARELLIZABLE = SET_BITS << ( PARELLIZABLE_POSITION >>> << 1 >>> ); >>> public static final int STATELESS = SET_BITS << ( STATELESS_POSITION << 1 ); >>> public static final int GREEDY = SET_BITS << ( SHORT_CIRCUIT_POSITION << 1 >>> ); >>> public static final int KEEP_DISTINCT = SET_BITS << ( DISTINCT_POSITION << >>> 1 ); >>> public static final int KEEP_SORTED = SET_BITS << ( SORTED_POSITION << 1 ); >>> public static final int KEEP_SIZED = SET_BITS << ( SIZED_POSITION << 1 ); >>> private static final int SHORT_CIRCUIT_MASK = SET_BITS << ( >>> SHORT_CIRCUIT_POSITION << 1 ); >>> // no GREEDY here >>> private static final int HIGHER_BITS = ( PARELLIZABLE | STATELESS | >>> KEEP_DISTINCT | KEEP_SORTED | KEEP_SIZED ) << 1 ; >>> private static final int STREAM_OP_MASK = >>> (( GREEDY | KEEP_DISTINCT | KEEP_SORTED | KEEP_SIZED ) << 1 ) | >>> GREEDY | KEEP_DISTINCT | KEEP_SORTED | KEEP_SIZED ; >>> public static String toString ( int characteristics ) { >>> return Stream . of ( characteristics ) >>> .< String >mapMulti(( f , consumer ) -> { >>> if (( f & PARELLIZABLE ) == PARELLIZABLE ) { >>> consumer .accept( "PARELLIZABLE" ); >>> } >>> if (( f & STATELESS ) == STATELESS ) { >>> consumer .accept( "STATELESS" ); >>> } >>> if (( f & GREEDY ) == GREEDY ) { >>> consumer .accept( "GREEDY" ); >>> } >>> if (( f & KEEP_DISTINCT ) == KEEP_DISTINCT ) { >>> consumer .accept( "KEEP_DISTINCT" ); >>> } >>> if (( f & KEEP_SORTED ) == KEEP_SORTED ) { >>> consumer .accept( "KEEP_SORTED" ); >>> } >>> if (( f & KEEP_SIZED ) == KEEP_SIZED ) { >>> consumer .accept( "KEEP_SIZED" ); >>> } >>> }) >>> .collect( Collectors . joining ( ", " )); >>> } >>> public static int toOpFlags ( int characteristics ) { >>> return (( characteristics ^ SHORT_CIRCUIT_MASK ) | HIGHER_BITS ) & >>> STREAM_OP_MASK ; >>> } >>> public static String toOpFlagsString ( int opFlags ) { >>> return Arrays . stream ( StreamOpFlag . values ()) >>> .map( op -> { >>> if ( op .isPreserved( opFlags )) { >>> return "preserved " + op ; >>> } >>> if ( op .isCleared( opFlags )) { >>> return "cleared " + op ; >>> } >>> if ( op .isKnown( opFlags )) { >>> return "set " + op ; >>> } >>> return "?? " + op ; >>> }) >>> .collect( Collectors . joining ( ", " )); >>> } >>> void main () { >>> var characteristics = PARELLIZABLE | STATELESS | GREEDY | KEEP_DISTINCT | >>> KEEP_SORTED | KEEP_SIZED ; >>> System . out .println( toOpFlagsString ( toOpFlags ( characteristics ))); >>> var characteristics2 = STATELESS | KEEP_DISTINCT | KEEP_SIZED ; >>> System . out .println( toOpFlagsString ( toOpFlags ( characteristics2 ))); >>> } >>> } >>>> Viktor Klang >>>> Software Architect, Java Platform Group >>>> Oracle >>>> From: fo...@univ-mlv.fr <fo...@univ-mlv.fr> >>>> Sent: Thursday, 18 January 2024 16:17 >>>> To: Viktor Klang <viktor.kl...@oracle.com> >>>> Cc: core-libs-dev <core-libs-...@openjdk.java.net>; Paul Sandoz >>>> <psan...@openjdk.java.net> >>>> Subject: [External] : Re: Gatherer: spliterator characteristics are not >>>> propagated >>>>> From: "Viktor Klang" <viktor.kl...@oracle.com> >>>>> To: "Remi Forax" <fo...@univ-mlv.fr> >>>>> Cc: "core-libs-dev" <core-libs-...@openjdk.java.net>, "Paul Sandoz" >>>>> <psan...@openjdk.java.net> >>>>> Sent: Thursday, January 18, 2024 3:36:07 PM >>>>> Subject: Re: Gatherer: spliterator characteristics are not propagated >>>>> I suspect that it is a rather slippery slope, once KEEP-flags are added, >>>>> then >>>>> others will want to be able to have INJECT-flags, and then people might >>>>> have >>>>> different opinions w.r.t. the default should be to clear all flags etc. >>>>> And that's even before one looks at the composition-part of it, what are >>>>> the >>>>> flags for A.andThen(B)? (then extrapolate to N compositions and the >>>>> available >>>>> set of flags always approaches 0) >>>>> I spent quite a bit of time on this and in the end tracking all this >>>>> info, and >>>>> making sure that the flags of implementations correspond to the actual >>>>> behavior, just ended up costing performance for most streams and >>>>> introduced an >>>>> extra dimension to creation and maintenance which I had a hard time >>>>> justifying. >>>> It can be a slippery slope if we were designing from the ground up but the >>>> stream implementation already exists and SORTED, DISTINCT and SIZED are the >>>> flags that are already tracked by the current implementation. >>>> Currently only SHORT_CIRCUIT is set (if not greedy), >>>> see [ >>>> https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/stream/GathererOp.java*L209__;Iw!!ACWV5N9M2RV99hQ!PhMxqlDzLWPRuwYc7ECRKNPVs0BtnoE-RdT-Jdkng7S-iFuERAHYcWvJ-OMKGLrkPdSrUl3xj1R9ypyeqeWI$ >>>> | >>>> https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/stream/GathererOp.java#L209 >>>> ] >>>> And for A.andThen(B), A.flags & B.flags should work, the stream is sorted >>>> if >>>> both gatherers keep it sorted. >>>>> Making specific, rare, combinations of operations faster at the expense of >>>>> making 99% of all others slower is a hard pill for most to swallow. >>>> I suppose that if those flags already exist, it's because they have a >>>> purpose >>>> and i do not understand how it can make the other operations slower. >>>>> Cheers, >>>>> √ >>>> regards, >>>> Rémi >>>>> Viktor Klang >>>>> Software Architect, Java Platform Group >>>>> Oracle >>>>> From: fo...@univ-mlv.fr <fo...@univ-mlv.fr> >>>>> Sent: Thursday, 18 January 2024 10:28 >>>>> To: Viktor Klang <viktor.kl...@oracle.com> >>>>> Cc: core-libs-dev <core-libs-...@openjdk.java.net>; Paul Sandoz >>>>> <psan...@openjdk.java.net> >>>>> Subject: [External] : Re: Gatherer: spliterator characteristics are not >>>>> propagated >>>>>> From: "Viktor Klang" <viktor.kl...@oracle.com> >>>>>> To: "Remi Forax" <fo...@univ-mlv.fr>, "core-libs-dev" >>>>>> <core-libs-...@openjdk.java.net> >>>>>> Sent: Wednesday, January 17, 2024 8:48:07 PM >>>>>> Subject: Re: Gatherer: spliterator characteristics are not propagated >>>>>> Hi Rémi, >>>>>> You can find some of my benches here: [ >>>>>> https://urldefense.com/v3/__https://github.com/openjdk/jdk/tree/master/test/micro/org/openjdk/bench/java/util/stream/ops/ref__;!!ACWV5N9M2RV99hQ!JJy6F9NoL6wKZQK5158up_fTRvH8X7F6JK8T7Euuf8vzbSQbr23eWa9S_yb61ksONVrLrdesCF_au5zyje2l$ >>>>>> | >>>>>> https://github.com/openjdk/jdk/tree/master/test/micro/org/openjdk/bench/java/util/stream/ops/ref >>>>>> ] >>>>>> Initially I had Characteristics such as ORDERED etc on Gatherer but it >>>>>> just >>>>>> didn't end up worth it when looking at the bench results over a wide >>>>>> array of >>>>>> stream sizes and number of operations. >>>>> I think there are 3 gatherer characteristics that make sense: KEEP_SORTED, >>>>> KEEP_DISTINCT and KEEP_SIZED, >>>>> all of them say that if the stream was sorted/distinct/sized then the >>>>> stream >>>>> returned by a call to gather() is still sorted (with the same comparator), >>>>> distinct or sized. >>>>> As examples, map() is KEEP_SIZED, filter() is KEEP_SORTED | KEEP_DISTINCT >>>>> and >>>>> windowFixed is KEEP_DISTINCT. >>>>> [CC Paul, so he can correct me if i'm saying something stupid] >>>>> Now for the benchmarks, it depends what you want to measure, benchmarking >>>>> streams is tricky. This is what i know about benchmarking streams. >>>>> First, the JIT has two ways to profile types at runtime, >>>>> Either a method takes a function as parameter >>>>> void map(Function function) { >>>>> function.apply(...) >>>>> } >>>>> and when map is called with a subtype of Function, the JIT will propagate >>>>> the >>>>> exact type when map is inlined, >>>>> Or a method use a field >>>>> class Op { >>>>> Function function; >>>>> void map() { >>>>> function.apply(...) >>>>> } >>>>> } >>>>> in that case, the VM records the profile of function.apply() and if there >>>>> are >>>>> more than two different profiles, the VM declare profile poluttion and do >>>>> not >>>>> try to optimize. >>>>> The Stream implementation tries very hard to use only parameters instead >>>>> of >>>>> fields, that's why it does not use classical Iterator that are pull >>>>> iterator (a >>>>> filter iterator requires a field) but a Spliterator which is a push >>>>> iterator, >>>>> the element is sent as parameter of the consumer.That's also why collect >>>>> does >>>>> not use the builder pattern (that accumulate values in fields) but a >>>>> Collector >>>>> that publish the functions to be called as parameter. >>>>> Obvisously, this is more complex than that, a Collector stores the >>>>> functions in >>>>> fields so it should not work well but the implementation uses a record >>>>> that >>>>> plays well with escape analysis. Escape analysis see the fields of an >>>>> instance >>>>> as parameters so the functions of a Collector are correctly propagated >>>>> (if the >>>>> escape analysis works). And lambdas are using invokedynamic, and the VM >>>>> tries >>>>> really hard to inline invokedynamic, so lambdas (that captures value or >>>>> not) >>>>> are routinely fully inlined with the intermediate operation of a stream. >>>>> In your tests, i've not seen comparaisons between an existing method like >>>>> map() >>>>> or filter() followed by a sorted()/distinct()/count()/toList(), i.e. >>>>> operations >>>>> where the characteristics KEEP_* have an impact and their equivalent >>>>> using a >>>>> Gatherer. >>>>>> Cheers, >>>>>> √ >>>>> regards, >>>>> Rémi >>>>>> Viktor Klang >>>>>> Software Architect, Java Platform Group >>>>>> Oracle >>>>>> From: core-libs-dev <core-libs-dev-r...@openjdk.org> on behalf of Remi >>>>>> Forax >>>>>> <fo...@univ-mlv.fr> >>>>>> Sent: Wednesday, 17 January 2024 16:48 >>>>>> To: core-libs-dev <core-libs-...@openjdk.java.net> >>>>>> Subject: Gatherer: spliterator characteristics are not propagated >>>>>> While doing some benchmarking of the Gatherer API, i've found that the >>>>>> characteristics of the spliterator was not propagated by the method >>>>>> Stream.gather() making the stream slower than it should. >>>>>> As an example, there is no way when reimplementing map() using a >>>>>> Gatherer to say >>>>>> that this intermediate operation keep the size, which is important if the >>>>>> terminal operation is toList() because if the size is known, toList() >>>>>> will >>>>>> presize the List and avoid the creation of an intermediary ArrayList. >>>>>> See [ >>>>>> https://urldefense.com/v3/__https://github.com/forax/we_are_all_to_gather/blob/master/src/main/java/com/gihtub/forax/wearealltogather/bench/MapGathererBenchmark.java__;!!ACWV5N9M2RV99hQ!JJy6F9NoL6wKZQK5158up_fTRvH8X7F6JK8T7Euuf8vzbSQbr23eWa9S_yb61ksONVrLrdesCF_auzwTY8aB$ >>>>>> | >>>>>> https://github.com/forax/we_are_all_to_gather/blob/master/src/main/java/com/gihtub/forax/wearealltogather/bench/MapGathererBenchmark.java >>>>>> ] >>>>>> I think that adding a way to propagate the spliterator characteristics >>>>>> through a >>>>>> Gatherer would greatly improve the performance of commons streams (at >>>>>> least all >>>>>> the ones that end with a call to toList). >>>>>> I have some idea of how to do that, but I prefer first to hear if i've >>>>>> overlook >>>>>> something and if improving the performance is something worth changing >>>>>> the API. >>>>>> regards, >>>>>> Rémi