[ https://issues.apache.org/jira/browse/FLINK-24162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17410707#comment-17410707 ]
Yun Gao commented on FLINK-24162: --------------------------------- I should roughly get the reason: it is due to that the test verifies for each operator, it should only received max watermark and EndOfData once for all the attempts. However, this is in fact not ensured: if a task get finished first, then get restarted due to some downstream tasks failed, it would still re-emit max watermark and EndOfData. This is required for the downstream tasks to align. This only happen with adaptive scheduler since now the test is in fact using region failover, with the default scheduler the finish task would not restart in failover. Sorry I do not fully got why with adaptive scheduler the whole job is restarted, but from the log this should indeed happen. I'll open a PR to fix the issue~ > PartiallyFinishedSourcesITCase fails due to assertion error in > DrainingValidator.validateOperatorLifecycle > ---------------------------------------------------------------------------------------------------------- > > Key: FLINK-24162 > URL: https://issues.apache.org/jira/browse/FLINK-24162 > Project: Flink > Issue Type: Bug > Components: API / DataStream > Affects Versions: 1.14.0, 1.15.0 > Reporter: Xintong Song > Priority: Blocker > Labels: test-stability > Fix For: 1.14.0, 1.15.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=23526&view=logs&j=8fd9202e-fd17-5b26-353c-ac1ff76c8f28&t=ea7cf968-e585-52cb-e0fc-f48de023a7ca&l=4639 > {code} > Sep 03 23:17:11 [ERROR] Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, > Time elapsed: 19.233 s <<< FAILURE! - in > org.apache.flink.runtime.operators.lifecycle.PartiallyFinishedSourcesITCase > Sep 03 23:17:11 [ERROR] test[simple graph SINGLE_SUBTASK, failover: true] > Time elapsed: 2.27 s <<< FAILURE! > Sep 03 23:17:11 java.lang.AssertionError > Sep 03 23:17:11 at org.junit.Assert.fail(Assert.java:87) > Sep 03 23:17:11 at org.junit.Assert.assertTrue(Assert.java:42) > Sep 03 23:17:11 at org.junit.Assert.assertFalse(Assert.java:65) > Sep 03 23:17:11 at org.junit.Assert.assertFalse(Assert.java:75) > Sep 03 23:17:11 at > org.apache.flink.runtime.operators.lifecycle.validation.DrainingValidator.validateOperatorLifecycle(DrainingValidator.java:56) > Sep 03 23:17:11 at > org.apache.flink.runtime.operators.lifecycle.validation.TestOperatorLifecycleValidator.lambda$checkOperatorsLifecycle$1(TestOperatorLifecycleValidator.java:52) > Sep 03 23:17:11 at java.util.HashMap.forEach(HashMap.java:1289) > Sep 03 23:17:11 at > org.apache.flink.runtime.operators.lifecycle.validation.TestOperatorLifecycleValidator.checkOperatorsLifecycle(TestOperatorLifecycleValidator.java:47) > Sep 03 23:17:11 at > org.apache.flink.runtime.operators.lifecycle.PartiallyFinishedSourcesITCase.test(PartiallyFinishedSourcesITCase.java:94) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)