[ https://issues.apache.org/jira/browse/FLINK-36295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882473#comment-17882473 ]
Matthias Pohl commented on FLINK-36295:
---------------------------------------

There is a difference between the slow and the fast successful test runs. In both cases, the DefaultStateTransitionManager initially transitions to the Idling state when the AdaptiveScheduler reaches the Executing state. In the fast run, onChange is triggered before the checkpoint is triggered, which moves the DefaultStateTransitionManager into the Stabilizing phase. In the slow run, no onChange event is received, so the DefaultStateTransitionManager stays in the Idling state.

> AdaptiveSchedulerClusterITCase.testCheckpointStatsPersistedAcrossRescale failed with
> --------------------------------------------------------------------------------------
>
>                 Key: FLINK-36295
>                 URL: https://issues.apache.org/jira/browse/FLINK-36295
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 2.0-preview
>            Reporter: Matthias Pohl
>            Priority: Critical
>              Labels: test-stability
>         Attachments: FLINK-36295.failure.62156.20240916.1.logs-cron_jdk17-test_cron_jdk17_core-1726454552.log
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=62156&view=logs&j=675bf62c-8558-587e-2555-dcad13acefb5&t=5878eed3-cc1e-5b12-1ed0-9e7139ce0992&l=10234
> {code}
> Sep 16 03:06:30 03:06:30.168 [ERROR] Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 5.275 s <<< FAILURE! -- in org.apache.flink.runtime.scheduler.adaptive.AdaptiveSchedulerClusterITCase
> Sep 16 03:06:30 03:06:30.168 [ERROR] org.apache.flink.runtime.scheduler.adaptive.AdaptiveSchedulerClusterITCase.testCheckpointStatsPersistedAcrossRescale -- Time elapsed: 0.676 s <<< ERROR!
> Sep 16 03:06:30 java.lang.IndexOutOfBoundsException: Index: -1
> Sep 16 03:06:30 	at java.base/java.util.Collections$EmptyList.get(Collections.java:4586)
> Sep 16 03:06:30 	at org.apache.flink.runtime.scheduler.adaptive.AdaptiveSchedulerClusterITCase.testCheckpointStatsPersistedAcrossRescale(AdaptiveSchedulerClusterITCase.java:214)
> Sep 16 03:06:30 	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
> Sep 16 03:06:30 	at java.base/java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:194)
> Sep 16 03:06:30 	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373)
> Sep 16 03:06:30 	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182)
> Sep 16 03:06:30 	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655)
> Sep 16 03:06:30 	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622)
> Sep 16 03:06:30 	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165)
> Sep 16 03:06:30
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
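For readers of the stack trace: an `IndexOutOfBoundsException: Index: -1` thrown from `Collections$EmptyList.get` is what one would see when code asks for the last element of an empty list via `get(size() - 1)`. Whether line 214 of the test does exactly this is an assumption, since the test source is not quoted here. The class and method names below (`LastElementSketch`, `lastUnsafe`, `lastSafe`) are hypothetical and are not part of Flink; this is only a minimal sketch of the pattern and of a guarded alternative.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration, not Flink code.
public class LastElementSketch {

    // Unsafe: for an empty list, size() - 1 is -1, so get(-1) throws
    // IndexOutOfBoundsException.
    static <T> T lastUnsafe(List<T> items) {
        return items.get(items.size() - 1);
    }

    // Guarded: returns Optional.empty() instead of throwing when the list is empty.
    // Assumes the list contains no null elements (Optional.of rejects null).
    static <T> Optional<T> lastSafe(List<T> items) {
        return items.isEmpty()
                ? Optional.empty()
                : Optional.of(items.get(items.size() - 1));
    }

    public static void main(String[] args) {
        List<String> none = Collections.emptyList();
        try {
            lastUnsafe(none);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("caught: " + e.getClass().getSimpleName());
        }
        System.out.println("present: " + lastSafe(none).isPresent());
    }
}
```

In a test that races against an asynchronous event, such as the checkpoint and onChange ordering described in the comment, a guard like `lastSafe` only turns the exception into an explicit "nothing there yet" result; the test would still need to wait for the expected entry to appear.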