[ https://issues.apache.org/jira/browse/FLINK-36279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17881992#comment-17881992 ]
Matthias Pohl commented on FLINK-36279:
---------------------------------------
Hm, the only reason I can come up with for why the test might succeed is that
the job eventually finishes (because the source has generated all values up to
{{Integer.MAX_VALUE}}). If the polling for the number of running tasks happens
at the right time, we might detect 3 running tasks (which is the goal of the
scale-down), and the test could proceed to detect the slots that became
available due to the job completion. The test would succeed in that case.
I wasn't able to reproduce this scenario, though (50 test runs so far). That's
why I'm hesitant to believe that we actually run into this issue in the
FLINK-36014 PR CI.
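To make the timing dependence concrete, here is a minimal standalone sketch (not Flink code). It assumes a hypothetical timeline in which 4 tasks are running and then finish one by one as the job completes; the class name, the method {{firstMatch}}, and the timeline values are all made up for illustration. It only shows that a poll for "exactly 3 running tasks" can hit a transient state or miss it, depending on when the poll fires.

```java
import java.util.List;

public class PollingRaceSketch {

    /**
     * Returns the index of the first polled snapshot whose running-task count
     * equals the target, or -1 if no polled snapshot matches. Only every
     * pollEvery-th snapshot (starting at offset) is observed by the poller.
     */
    static int firstMatch(
            List<Integer> runningTaskTimeline, int target, int pollEvery, int offset) {
        for (int i = offset; i < runningTaskTimeline.size(); i += pollEvery) {
            if (runningTaskTimeline.get(i) == target) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical timeline (assumption): 4 running tasks, then the job
        // completes and the tasks finish one after another.
        List<Integer> timeline = List.of(4, 4, 4, 3, 2, 1, 0);

        // Polling every snapshot observes the transient "3 running tasks" state.
        System.out.println(firstMatch(timeline, 3, 1, 0));

        // Polling every second snapshot (0, 2, 4, 6) skips over it.
        System.out.println(firstMatch(timeline, 3, 2, 0));
    }
}
```

Under these assumptions, whether the condition is ever observed depends only on the poll phase, which would be consistent with a scenario that is hard to reproduce locally.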
> RescaleOnCheckpointITCase.testRescaleOnCheckpoint fails
> -------------------------------------------------------
>
> Key: FLINK-36279
> URL: https://issues.apache.org/jira/browse/FLINK-36279
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 2.0-preview
> Reporter: Matthias Pohl
> Assignee: Matthias Pohl
> Priority: Major
> Labels: pull-request-available, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=62105&view=logs&j=5c8e7682-d68f-54d1-16a2-a09310218a49&t=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba&l=11287
> {code}
> Sep 13 17:16:55 "ForkJoinPool-1-worker-25" #28 daemon prio=5 os_prio=0 tid=0x00007f973f0c2800 nid=0x31a1 waiting on condition [0x00007f97089fc000]
> Sep 13 17:16:55 java.lang.Thread.State: TIMED_WAITING (sleeping)
> Sep 13 17:16:55 at java.lang.Thread.sleep(Native Method)
> Sep 13 17:16:55 at org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:152)
> Sep 13 17:16:55 at org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:145)
> Sep 13 17:16:55 at org.apache.flink.test.scheduling.UpdateJobResourceRequirementsITCase.waitForRunningTasks(UpdateJobResourceRequirementsITCase.java:219)
> Sep 13 17:16:55 at org.apache.flink.test.scheduling.RescaleOnCheckpointITCase.testRescaleOnCheckpoint(RescaleOnCheckpointITCase.java:139)
> Sep 13 17:16:55 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> Sep 13 17:16:55 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [...]
> {code}