This is an automated email from the ASF dual-hosted git repository.
He-Pin pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/pekko.git
The following commit(s) were added to refs/heads/main by this push:
new e72dd27236 fix: stabilise JDK 21+ / JDK 25 nightly test runs (#2889)
e72dd27236 is described below
commit e72dd27236cf7ec98cc4736a6b36e1f7e6eda649
Author: He-Pin(kerr) <[email protected]>
AuthorDate: Thu Apr 23 17:09:14 2026 +0800
fix: stabilise JDK 21+ / JDK 25 nightly test runs (#2889)
* fix: stabilise JDK 21+ / JDK 25 nightly test runs
Motivation:
The JDK 21+ ForkJoinPool compensation-thread regression (JDK-8300995 /
JDK-8321335) starves Pekko's actor and remote dispatchers during heavy
ask/await workloads in tests, producing intermittent timeouts on the
nightly matrix (see #2573, #2870). Several individual tests also rely
on hardcoded timeouts that never scale with `pekko.test.timefactor`,
so they flake even when the dispatcher itself is healthy.
Modification:
- nightly-builds.yml: when the JDK is 21 or newer, raise
`fork-join-executor.minimum-runnable` to 4 for both
`pekko.actor.default-dispatcher` and
`pekko.remote.default-remote-dispatcher`. This pushes the pool to
spawn compensation threads earlier and avoids long stalls under the
new compensation policy without changing scheduling semantics
(FIFO unchanged, no fairness regressions).
- TlsSpec: dilate the previously hardcoded 15s/17s timeouts in the
ServerInitiatesViaTcp / CancellingRHSIgnoresBoth scenario so the
timefactor actually applies on slower CI workers.
- EventSourcedStashOverflowSpec: pass an explicit 30s budget to
`receiveMessages(stashCapacity)` (the default 12s on JDK 25 is not
enough to drain 20k messages on slow runners).
- SteppingInmemJournal.step: bump the per-step ask timeout from 3s
(dilated) to 10s (dilated) so PersistentActorRecoveryTimeoutSpec and
similar tests retain headroom on JDK 17 with timefactor=2.
Result:
JDK 21+ nightly runs stop hitting compensation-thread starvation in the
artery / remoting suites, and the four targeted timing fixes remove
specific flakes observed in the latest nightly run on JDK 17 / 25.
Production defaults are unchanged - the FJP override is applied only
in the workflow.
* test: scale FlowMapAsyncPartitionedSpec patience with test timefactor
Motivation:
The "resume after multiple failures if resume supervision is in place"
case times out after the default 6-second ScalaFutures patience on the
JDK 25 nightly matrix. The sibling suite
stream-typed-tests MapAsyncPartitionedSpec received the same dilation
treatment in #2884; this spec was missed.
Modification:
Override the spec's patienceConfig to use a dilated 30-second timeout
so the whole suite tracks pekko.test.timefactor instead of a fixed 6s.
Result:
Supervised-resume and bulk-throughput cases no longer flake on JDK 25
CI when the environment is under contention.
---
.github/workflows/nightly-builds.yml | 15 +++++++--------
.../typed/scaladsl/EventSourcedStashOverflowSpec.scala | 2 +-
.../pekko/persistence/journal/SteppingInmemJournal.scala | 2 +-
.../test/scala/org/apache/pekko/stream/io/TlsSpec.scala | 5 +++--
.../stream/scaladsl/FlowMapAsyncPartitionedSpec.scala | 9 +++++++++
5 files changed, 21 insertions(+), 12 deletions(-)
diff --git a/.github/workflows/nightly-builds.yml
b/.github/workflows/nightly-builds.yml
index fd941e7b99..5d4ff1bbe6 100644
--- a/.github/workflows/nightly-builds.yml
+++ b/.github/workflows/nightly-builds.yml
@@ -148,20 +148,18 @@ jobs:
- name: Compile and Test
# note that this is not running any multi-jvm tests because
multi-in-test=false
- # JDK 25 ForkJoinPool scheduling changes need a higher timefactor (see
#2573, #2870)
- # JDK 21+ uses virtual threads to bypass ForkJoinPool
compensation-thread starvation (JDK-8300995)
- # Virtual threads provide equivalent or better performance by
eliminating unmount-remount penalties
- # when blocking on I/O or reply futures, so TIMEFACTOR remains 2 for
JDK 21-24.
- # If timeouts occur in nightly runs, increase TIMEFACTOR to 3
immediately.
- env:
- # Enable virtual threads only on JDK 21+ (nightly build only)
- PEKKO_VIRTUALIZE_DISPATCHER: ${{ matrix.javaVersion >= 21 && 'on' ||
'off' }}
+ # JDK 21+ ForkJoinPool scheduling changes need a higher timefactor and
a larger
+ # minimum-runnable to avoid compensation-thread starvation (see #2573,
#2870).
run: |-
+ EXTRA_JVM_OPTS=""
if [ "${{ matrix.javaVersion }}" -ge 25 ]; then
TIMEFACTOR=4
else
TIMEFACTOR=2
fi
+ if [ "${{ matrix.javaVersion }}" -ge 21 ]; then
+
EXTRA_JVM_OPTS="-Dpekko.actor.default-dispatcher.fork-join-executor.minimum-runnable=4
-Dpekko.remote.default-remote-dispatcher.fork-join-executor.minimum-runnable=4"
+ fi
sbt \
-Dpekko.cluster.assert=on \
-Dpekko.log.timestamps=true \
@@ -170,6 +168,7 @@ jobs:
-Dpekko.test.tags.exclude=gh-exclude,timing \
-Dpekko.test.multi-in-test=false \
-Dio.netty.leakDetection.level=PARANOID \
+ $EXTRA_JVM_OPTS \
clean "++ ${{ matrix.scalaVersion }} test" checkTestsHaveRun
- name: Docs
diff --git
a/persistence-typed-tests/src/test/scala/org/apache/pekko/persistence/typed/scaladsl/EventSourcedStashOverflowSpec.scala
b/persistence-typed-tests/src/test/scala/org/apache/pekko/persistence/typed/scaladsl/EventSourcedStashOverflowSpec.scala
index 0040ce6556..636f15d45a 100644
---
a/persistence-typed-tests/src/test/scala/org/apache/pekko/persistence/typed/scaladsl/EventSourcedStashOverflowSpec.scala
+++
b/persistence-typed-tests/src/test/scala/org/apache/pekko/persistence/typed/scaladsl/EventSourcedStashOverflowSpec.scala
@@ -98,7 +98,7 @@ class EventSourcedStashOverflowSpec
SteppingInmemJournal.step(journal)
// exactly how many is racy but at least the first stash buffer full
should complete
- probe.receiveMessages(stashCapacity)
+ probe.receiveMessages(stashCapacity, 30.seconds)
}
}
diff --git
a/persistence/src/test/scala/org/apache/pekko/persistence/journal/SteppingInmemJournal.scala
b/persistence/src/test/scala/org/apache/pekko/persistence/journal/SteppingInmemJournal.scala
index 6b2c46e335..295b8cf8fa 100644
---
a/persistence/src/test/scala/org/apache/pekko/persistence/journal/SteppingInmemJournal.scala
+++
b/persistence/src/test/scala/org/apache/pekko/persistence/journal/SteppingInmemJournal.scala
@@ -38,7 +38,7 @@ object SteppingInmemJournal {
* Allow the journal to do one operation, will block until that completes
*/
def step(journal: ActorRef)(implicit system: ActorSystem): Unit = {
- implicit val timeout: Timeout = 3.seconds.dilated
+ implicit val timeout: Timeout = 10.seconds.dilated
Await.result(journal ? SteppingInmemJournal.Token, timeout.duration)
}
diff --git
a/stream-tests/src/test/scala/org/apache/pekko/stream/io/TlsSpec.scala
b/stream-tests/src/test/scala/org/apache/pekko/stream/io/TlsSpec.scala
index c018aa48c4..10f3655f81 100644
--- a/stream-tests/src/test/scala/org/apache/pekko/stream/io/TlsSpec.scala
+++ b/stream-tests/src/test/scala/org/apache/pekko/stream/io/TlsSpec.scala
@@ -440,11 +440,12 @@ class TlsSpec extends StreamSpec(TlsSpec.configOverrides)
with WithLogCapturing
.collect { case SessionBytes(_, b) => b }
.scan(ByteString.empty)(_ ++ _)
.filter(_.nonEmpty)
- .via(new Timeout(15.seconds))
+ .via(new Timeout(15.seconds.dilated))
.dropWhile(_.size < scenario.output.size)
.runWith(Sink.headOption)
- Await.result(output,
17.seconds).getOrElse(ByteString.empty).utf8String should
be(scenario.output.utf8String)
+ Await.result(output,
17.seconds.dilated).getOrElse(ByteString.empty).utf8String should be(
+ scenario.output.utf8String)
commPattern.cleanup()
}
diff --git
a/stream-tests/src/test/scala/org/apache/pekko/stream/scaladsl/FlowMapAsyncPartitionedSpec.scala
b/stream-tests/src/test/scala/org/apache/pekko/stream/scaladsl/FlowMapAsyncPartitionedSpec.scala
index 61bb9f3801..af8ba74235 100644
---
a/stream-tests/src/test/scala/org/apache/pekko/stream/scaladsl/FlowMapAsyncPartitionedSpec.scala
+++
b/stream-tests/src/test/scala/org/apache/pekko/stream/scaladsl/FlowMapAsyncPartitionedSpec.scala
@@ -30,11 +30,14 @@ import pekko.stream.Supervision
import pekko.stream.testkit._
import pekko.stream.testkit.scaladsl.TestSink
import pekko.stream.testkit.scaladsl.TestSource
+import pekko.testkit.TestDuration
import pekko.testkit.TestLatch
import pekko.testkit.TestProbe
import pekko.testkit.WithLogCapturing
import org.scalatest.compatible.Assertion
+import org.scalatest.time.Millis
+import org.scalatest.time.Span
// Tests ported from akka/akka-core#31582 and akka/akka-core#31882,
// adapted for Pekko's mapAsyncPartitioned API (no perPartition parameter).
@@ -42,6 +45,12 @@ import org.scalatest.compatible.Assertion
class FlowMapAsyncPartitionedSpec extends StreamSpec with WithLogCapturing {
import Utils.TE
+ // The default ScalaFutures patience of 6 seconds is too tight for the
supervised-resume
+ // and bulk-throughput cases in this suite under JDK 25 nightly contention.
Scale by the
+ // configured pekko.test.timefactor so CI dilation is honoured.
+ override implicit def patienceConfig: PatienceConfig =
+ PatienceConfig(timeout = 30.seconds.dilated, interval = Span(15, Millis))
+
"A Flow with mapAsyncPartitioned" must {
"produce future elements" in {
implicit val ec: ExecutionContext = system.dispatcher
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]