[ 
https://issues.apache.org/jira/browse/FLINK-24162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17410725#comment-17410725
 ] 

Roman Khachatryan commented on FLINK-24162:
-------------------------------------------

Thanks for looking into it, [~gaoyunhaii].

I can confirm that the task transitions to FINISHED twice, once before and once 
after a failure:
{code:java}
23:16:57,760 INFO org.apache.flink.runtime.taskmanager.Task [] - Source: Custom 
Source -> Timestamps/Watermarks -> transform-1-forward -> Sink: Unnamed (1/4)#0 
(4a3e2cd18c9e79b42dc8d6624fcbcde8) switched from RUNNING to FINISHED.
...
23:16:57,837 [    Checkpoint Timer] INFO  
org.apache.flink.runtime.checkpoint.CheckpointCoordinator    [] - Triggering 
checkpoint 3 (type=CHECKPOINT) @ 1630711017835 for job 
3d9486075a07c60f7d6927cff31ab0db.
23:16:57,840 [jobmanager-io-thread-18] INFO  
org.apache.flink.runtime.checkpoint.CheckpointCoordinator    [] - Completed 
checkpoint 3 for job 3d9486075a07c60f7d6927cff31ab0db (0 bytes, 
checkpointDuration=5 ms, finalizationTime=0 ms).
23:16:57,849 [Source: Custom Source -> Timestamps/Watermarks -> 
transform-1-forward -> Sink: Unnamed (3/4)#0] WARN  
org.apache.flink.runtime.taskmanager.Task                    [] - Source: 
Custom Source -> Timestamps/Watermarks -> transform-1-forward -> Sink: Unnamed 
(3/4)#0 (f8498b498d21de0ce1edd1175a20e5a6) switched from RUNNING to FAILED 
with failure cause: java.lang.RuntimeException: requested to fail
     at 
org.apache.flink.runtime.operators.lifecycle.graph.TestEventSource.run(TestEventSource.java:82)
     at 
org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:116)
     at 
org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:73)
     at 
org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:323)
...
23:16:58,357 INFO  org.apache.flink.runtime.taskmanager.Task [] - Source: 
Custom Source -> Timestamps/Watermarks -> transform-1-forward -> Sink: Unnamed 
(1/4)#1 (4c131c07267e65d0365a4f2db71f41dc) switched from RUNNING to FINISHED.
{code}
Checkpoint 3 completes after subtask 1/4 finishes for the first time, and it is 
the checkpoint used for recovery.
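
To check the full log for other subtasks that finish more than once, something 
like the following could be used. It is only a minimal sketch: the class and 
method names (FinishedTransitions, countFinished) are hypothetical, and it 
assumes each log entry has been rejoined onto a single line, unlike the wrapped 
excerpt above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flink or its test suite.
public class FinishedTransitions {

    // Matches e.g. "(1/4)#1 (4c13...) switched from RUNNING to FINISHED".
    // Group 1 is the subtask index ("1/4"), group 2 the attempt number.
    private static final Pattern FINISHED = Pattern.compile(
            "\\((\\d+/\\d+)\\)#(\\d+) \\([0-9a-f]+\\) switched from RUNNING to FINISHED");

    /** Counts, per subtask index, the entries reporting RUNNING -> FINISHED. */
    public static Map<String, Integer> countFinished(List<String> logEntries) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String entry : logEntries) {
            Matcher m = FINISHED.matcher(entry);
            while (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Task name and message suffixes copied from the log excerpt above.
        String task = "Source: Custom Source -> Timestamps/Watermarks -> "
                + "transform-1-forward -> Sink: Unnamed ";
        List<String> entries = List.of(
                task + "(1/4)#0 (4a3e2cd18c9e79b42dc8d6624fcbcde8) "
                        + "switched from RUNNING to FINISHED.",
                task + "(3/4)#0 (f8498b498d21de0ce1edd1175a20e5a6) "
                        + "switched from RUNNING to FAILED with failure cause: "
                        + "java.lang.RuntimeException: requested to fail",
                task + "(1/4)#1 (4c131c07267e65d0365a4f2db71f41dc) "
                        + "switched from RUNNING to FINISHED.");
        System.out.println(countFinished(entries)); // prints {1/4=2}
    }
}
```

Any subtask with a count above 1 finished in more than one attempt, which is the 
pattern seen for subtask 1/4 here.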

You're right that the whole job is restarted. However, shouldn't that always be 
the case? TestJobBuilders#prepareEnv sets:
{code:java}
configuration.set(EXECUTION_FAILOVER_STRATEGY, "full");
{code}

> PartiallyFinishedSourcesITCase fails due to assertion error in 
> DrainingValidator.validateOperatorLifecycle
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-24162
>                 URL: https://issues.apache.org/jira/browse/FLINK-24162
>             Project: Flink
>          Issue Type: Bug
>          Components: API / DataStream
>    Affects Versions: 1.14.0, 1.15.0
>            Reporter: Xintong Song
>            Priority: Blocker
>              Labels: test-stability
>             Fix For: 1.14.0, 1.15.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=23526&view=logs&j=8fd9202e-fd17-5b26-353c-ac1ff76c8f28&t=ea7cf968-e585-52cb-e0fc-f48de023a7ca&l=4639
> {code}
> Sep 03 23:17:11 [ERROR] Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, 
> Time elapsed: 19.233 s <<< FAILURE! - in 
> org.apache.flink.runtime.operators.lifecycle.PartiallyFinishedSourcesITCase
> Sep 03 23:17:11 [ERROR] test[simple graph SINGLE_SUBTASK, failover: true]  
> Time elapsed: 2.27 s  <<< FAILURE!
> Sep 03 23:17:11 java.lang.AssertionError
> Sep 03 23:17:11       at org.junit.Assert.fail(Assert.java:87)
> Sep 03 23:17:11       at org.junit.Assert.assertTrue(Assert.java:42)
> Sep 03 23:17:11       at org.junit.Assert.assertFalse(Assert.java:65)
> Sep 03 23:17:11       at org.junit.Assert.assertFalse(Assert.java:75)
> Sep 03 23:17:11       at 
> org.apache.flink.runtime.operators.lifecycle.validation.DrainingValidator.validateOperatorLifecycle(DrainingValidator.java:56)
> Sep 03 23:17:11       at 
> org.apache.flink.runtime.operators.lifecycle.validation.TestOperatorLifecycleValidator.lambda$checkOperatorsLifecycle$1(TestOperatorLifecycleValidator.java:52)
> Sep 03 23:17:11       at java.util.HashMap.forEach(HashMap.java:1289)
> Sep 03 23:17:11       at 
> org.apache.flink.runtime.operators.lifecycle.validation.TestOperatorLifecycleValidator.checkOperatorsLifecycle(TestOperatorLifecycleValidator.java:47)
> Sep 03 23:17:11       at 
> org.apache.flink.runtime.operators.lifecycle.PartiallyFinishedSourcesITCase.test(PartiallyFinishedSourcesITCase.java:94)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
