Makes sense. What do you guys think of the generate(supplierThatEventuallyReturnsNull) + takeWhile() idiom? Should it be avoided?
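
For concreteness, here is a minimal sketch of the two shapes being compared. `Node`, `viaGenerate`, and `viaIterate` are hypothetical names made up for illustration and are not taken from any real codebase. The `generate()` variant assumes a plain sequential pipeline with nothing buffering between the source and `takeWhile()`.

```java
import java.util.List;
import java.util.Objects;
import java.util.function.Supplier;
import java.util.stream.Stream;

public class NullTerminatedStreams {
    // Hypothetical linked node, similar in spirit to the record E quoted below.
    record Node(String name, Node next) {}

    // The idiom in question: a stateful supplier that eventually returns null.
    // It relies on takeWhile() seeing each element before generate() is asked
    // for another one. That holds for this simple sequential pipeline, but it
    // is an assumption about the implementation, not a documented guarantee.
    static List<String> viaGenerate(Node head) {
        Node[] cursor = {head};
        Supplier<Node> nextOrNull = () -> {
            Node current = cursor[0];
            if (current != null) {
                cursor[0] = current.next();
            }
            return current; // null once the chain is exhausted
        };
        return Stream.generate(nextOrNull)
                .takeWhile(Objects::nonNull)
                .map(Node::name)
                .toList();
    }

    // The 3-arg iterate: the termination test is part of the source itself,
    // so no downstream operation has to filter out a null.
    static List<String> viaIterate(Node head) {
        return Stream.iterate(head, Objects::nonNull, Node::next)
                .map(Node::name)
                .toList();
    }

    public static void main(String[] args) {
        Node head = new Node("a", new Node("b", new Node("c", null)));
        System.out.println(viaGenerate(head)); // [a, b, c]
        System.out.println(viaIterate(head));  // [a, b, c]
    }
}
```

Both variants print the same list here. The difference is that the second one does not depend on how elements flow between the source and `takeWhile()`. It only applies when the next value can be derived from the previous one, which may not be true of every supplier.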
On Wed, Mar 4, 2026 at 6:59 AM Viktor Klang <[email protected]> wrote:

> >In our codebase, I see some developers using iterate() + takeWhile() and
> others using generate() + takeWhile(). I am debating whether to raise a
> concern about this pattern. Most likely, people won't insert intermediary
> operations between them, and I worry I might be overthinking it.
>
> In this specific case I'd argue that it's more correct (and more
> performant, and less code) to just use the 3-arg iterate.
>
> >or should I reconsider my warnings about side effects being rearranged in
> sequential streams?
>
> Personally I prefer my Streams correct regardless of underlying
> implementation and regardless of whether the stream isParallel() or not.
> On 2026-03-03 20:29, Jige Yu wrote:
>
> Hi Viktor,
>
> Thanks for the explanation!
>
> I also experimented with adding parallel() in the middle, and it indeed
> threw a NullPointerException even without distinct().
>
> In our codebase, I see some developers using iterate() + takeWhile() and
> others using generate() + takeWhile(). I am debating whether to raise a
> concern about this pattern. Most likely, people won't insert intermediary
> operations between them, and I worry I might be overthinking it.
>
> However, generate(supplierThatMayReturnNull).takeWhile() seems even more
> precarious. Since generate() is documented as unordered, could it
> potentially return elements out of encounter order, such as swapping a
> later null with an earlier non-null return?
>
> This brings me back to the rationale I’ve used to discourage side effects
> in map() and filter(). In a sequential stream, I’ve argued that relying on
> side effects from an earlier map() to be visible in a subsequent map() is
> unsafe because the stream is theoretically free to process multiple
> elements through the first map() before starting the second.
>
> Is that view too pedantic? If we can safely assume iterate() + takeWhile()
> is stable in non-parallel streams, should the same logic apply to
> subsequent map() calls with side effects (style issues aside)?
>
> I’m trying to find a consistent theory. Should I advise my colleagues that
> iterate() + takeWhile() and generate() + takeWhile() are unsafe, or should
> I reconsider my warnings about side effects being rearranged in sequential
> streams?
>
> I hope that clarifies the root of my confusion.
>
> Best,
> Jige Yu
>
> On Mon, Mar 2, 2026 at 6:08 AM Viktor Klang <[email protected]>
> wrote:
>
>> Hi Jige,
>>
>> I think I understand what you mean. In this case you're trying to prevent
>> a `null`-return from `nextOrNull()` from being fed into the next iteration
>> and thus throwing a NullPointerException.
>>
>> Now the answer is going to be a bit more nuanced than you might want to
>> hear, but in the spirit of providing clarity, the code which you provided
>> will "work" under the assumption that there is no "buffer" in between
>> iterate(…) and takeWhile(…).
>>
>> TL;DR: use Stream.iterate(seed, e -> e != null, e -> e.nextOrNull())
>> Long version:
>> Imagine we have the following:
>> ```java
>> record E(E e) {}
>> Stream.iterate(new E(new E(new E(null))), e -> e.e())
>>     .takeWhile(Objects::nonNull)
>>     .forEach(IO::println)
>> ```
>> We get:
>> ```java
>> E[e=E[e=E[e=null]]]
>> E[e=E[e=null]]
>> E[e=null]
>> ```
>> However, if we do:
>> ```java
>> Stream.iterate(new E(new E(new E(null))), e -> e.e())
>>     .gather(
>>         Gatherer.<E,ArrayList<E>,E>ofSequential(
>>             ArrayList::new,
>>             (l, e, _) -> l.add(e),
>>             (l, d) -> l.forEach(d::push)
>>         )
>>     )
>>     .takeWhile(Objects::nonNull)
>>     .forEach(IO::println)
>> ```
>> We get:
>> ```java
>> Exception java.lang.NullPointerException: Cannot invoke
>> "REPL.$JShell$16$E.e()" because "<parameter1>" is null
>>     at lambda$do_it$$0 (#5:1)
>>     at Stream$1.tryAdvance (Stream.java:1515)
>>     at ReferencePipeline.forEachWithCancel (ReferencePipeline.java:147)
>>     at AbstractPipeline.copyIntoWithCancel (AbstractPipeline.java:588)
>>     at AbstractPipeline.copyInto (AbstractPipeline.java:574)
>>     at AbstractPipeline.wrapAndCopyInto (AbstractPipeline.java:560)
>>     at ForEachOps$ForEachOp.evaluateSequential (ForEachOps.java:153)
>>     at ForEachOps$ForEachOp$OfRef.evaluateSequential (ForEachOps.java:176)
>>     at AbstractPipeline.evaluate (AbstractPipeline.java:265)
>>     at ReferencePipeline.forEach (ReferencePipeline.java:632)
>>     at (#5:9)
>> ```
>> But if we introduce something like `distinct()` in between, it will
>> "work" under sequential processing,
>> but under parallel processing it might not, as the distinct operation
>> will have to buffer *separately* from takeWhile:
>> ```java
>> Stream.iterate(new E(new E(new E(null))), e -> e.e())
>>     .distinct()
>>     .takeWhile(Objects::nonNull)
>>     .forEach(IO::println)
>> ```
>> ```java
>> E[e=E[e=E[e=null]]]
>> E[e=E[e=null]]
>> E[e=null]
>> ```
>> Parallel:
>> ```java
>> Stream.iterate(new E(new E(new E(null))), e -> e.e())
>>     .parallel()
>>     .distinct()
>>     .takeWhile(Objects::nonNull)
>>     .forEach(IO::println)
>> ```
>> ```java
>> Exception java.lang.NullPointerException: Cannot invoke
>> "REPL.$JShell$16$E.e()" because "<parameter1>" is null
>>     at lambda$do_it$$0 (#7:1)
>>     at Stream$1.tryAdvance (Stream.java:1515)
>>     at Spliterators$AbstractSpliterator.trySplit (Spliterators.java:1447)
>>     at AbstractTask.compute (AbstractTask.java:308)
>>     at CountedCompleter.exec (CountedCompleter.java:759)
>>     at ForkJoinTask.doExec (ForkJoinTask.java:511)
>>     at ForkJoinTask.invoke (ForkJoinTask.java:683)
>>     at ReduceOps$ReduceOp.evaluateParallel (ReduceOps.java:927)
>>     at DistinctOps$1.reduce (DistinctOps.java:64)
>>     at DistinctOps$1.opEvaluateParallelLazy (DistinctOps.java:110)
>>     at AbstractPipeline.sourceSpliterator (AbstractPipeline.java:495)
>>     at AbstractPipeline.evaluate (AbstractPipeline.java:264)
>>     at ReferencePipeline.forEach (ReferencePipeline.java:632)
>>     at (#7:4)
>> ```
>>
>> On 2026-03-01 06:29, Jige Yu wrote:
>>
>> Hi @core-libs-dev,
>> I am looking to validate the following idiom:
>> Stream.iterate(seed, e -> e.nextOrNull())
>>     .takeWhile(Objects::nonNull);
>> The intent is for the stream to call nextOrNull() repeatedly until it
>> returns null. However, I am concerned about where the Stream specification
>> guarantees the correctness of this approach regarding happens-before
>> relationships.
>> The iterate() Javadoc defines happens-before for the function passed to
>> it, stating that the action of applying f for one element happens-before
>> the action of applying it for subsequent elements. However, it seems silent
>> on the happens-before relationship with downstream operations like
>> takeWhile().
>> My concern stems from the general discouragement of side effects in
>> stream operations. For example, relying on side effects between subsequent
>> map() calls is considered brittle because a stream might invoke the first
>> map() on multiple elements before the second map() processes the first
>> element.
>> If this theory holds, is there anything theoretically preventing
>> iterate() from generating multiple elements before takeWhile() evaluates
>> the first one? I may be overthinking this, but I would appreciate your
>> insights into why side effects are discouraged even in ordered, sequential
>> streams and whether this specific idiom is safe.
>> Appreciate your help!
>> Best regards,
>> Jige Yu
>>
>> --
>> Cheers,
>> √
>>
>>
>> Viktor Klang
>> Software Architect, Java Platform Group
>> Oracle
>>
>> --
> Cheers,
> √
>
>
> Viktor Klang
> Software Architect, Java Platform Group
> Oracle
>
>
