[ 
https://issues.apache.org/jira/browse/FLINK-36279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17881992#comment-17881992
 ] 

Matthias Pohl commented on FLINK-36279:
---------------------------------------

Hm, the only explanation I can come up with for why the test might succeed is 
that the job eventually finishes (because the source has generated all values 
up to {{Integer.MAX_VALUE}}). If the polling for the number of running tasks 
happens at just the right moment while the job is completing, we might observe 
3 running tasks (which is the goal of the scale-down), and the test would then 
proceed to detect the slots that became available due to the job completion. 
The test would succeed in that case.
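
To illustrate the suspected race, here is a minimal, self-contained sketch. It 
is not the actual test code: the class {{PollingRaceSketch}}, the helpers 
{{finishingJob}} and {{stuckJob}}, and the simplified {{waitForRunningTasks}} 
are all hypothetical stand-ins, and the assumption that subtasks finish one at 
a time between polls is a simplification chosen only to make the effect 
deterministic.

```java
import java.util.function.IntSupplier;

public class PollingRaceSketch {

    /**
     * Simplified stand-in for a polling wait: returns true as soon as the
     * observed number of running tasks equals the expected value, or false
     * once maxPolls observations have been made without a match.
     */
    static boolean waitForRunningTasks(IntSupplier runningTasks, int expected, int maxPolls) {
        for (int i = 0; i < maxPolls; i++) {
            if (runningTasks.getAsInt() == expected) {
                return true;
            }
        }
        return false;
    }

    /**
     * Hypothetical job that is never rescaled but is finishing: each
     * observation returns the current count, then one more subtask finishes.
     */
    static IntSupplier finishingJob(int initialParallelism) {
        int[] running = {initialParallelism};
        return () -> {
            int observed = running[0];
            if (running[0] > 0) {
                running[0]--;
            }
            return observed;
        };
    }

    /** Hypothetical job that is neither rescaled nor finishing. */
    static IntSupplier stuckJob(int parallelism) {
        return () -> parallelism;
    }

    public static void main(String[] args) {
        // The finishing job passes through 3 running tasks on its way to 0,
        // so the wait succeeds although no scale-down ever happened.
        System.out.println(waitForRunningTasks(finishingJob(4), 3, 10)); // true

        // The stuck job stays at 4 running tasks, so the wait gives up.
        System.out.println(waitForRunningTasks(stuckJob(4), 3, 10)); // false
    }
}
```

The point of the sketch is only that an equality check on a polled task count 
cannot tell "scaled down to 3" apart from "passing through 3 while shutting 
down".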

I wasn't able to reproduce this scenario, though (50 test runs so far). That's 
why I'm hesitant to believe that we actually run into this issue in the 
FLINK-36014 PR CI.

> RescaleOnCheckpointITCase.testRescaleOnCheckpoint fails
> -------------------------------------------------------
>
>                 Key: FLINK-36279
>                 URL: https://issues.apache.org/jira/browse/FLINK-36279
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 2.0-preview
>            Reporter: Matthias Pohl
>            Assignee: Matthias Pohl
>            Priority: Major
>              Labels: pull-request-available, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=62105&view=logs&j=5c8e7682-d68f-54d1-16a2-a09310218a49&t=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba&l=11287
> {code}
> Sep 13 17:16:55 "ForkJoinPool-1-worker-25" #28 daemon prio=5 os_prio=0 tid=0x00007f973f0c2800 nid=0x31a1 waiting on condition [0x00007f97089fc000]
> Sep 13 17:16:55    java.lang.Thread.State: TIMED_WAITING (sleeping)
> Sep 13 17:16:55       at java.lang.Thread.sleep(Native Method)
> Sep 13 17:16:55       at org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:152)
> Sep 13 17:16:55       at org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:145)
> Sep 13 17:16:55       at org.apache.flink.test.scheduling.UpdateJobResourceRequirementsITCase.waitForRunningTasks(UpdateJobResourceRequirementsITCase.java:219)
> Sep 13 17:16:55       at org.apache.flink.test.scheduling.RescaleOnCheckpointITCase.testRescaleOnCheckpoint(RescaleOnCheckpointITCase.java:139)
> Sep 13 17:16:55       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> Sep 13 17:16:55       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [...]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
