[ 
https://issues.apache.org/jira/browse/FLINK-18983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YufeiLiu updated FLINK-18983:
-----------------------------
    Description: 
If an operator throws an exception, the task breaks out of its processing loop 
and disposes all operators. But if the task blocks in Function.close, its state 
never switches to FAILED, so the JobMaster never learns the final state and 
cannot trigger a restart.

Task has a {{TaskCancelerWatchDog}} that kills the process if cancellation times 
out, but it doesn't work for a FAILED task.

Can we just report the final state and let the JM trigger the cleanup action?

  was:
If an operator throws an exception, the task breaks out of its processing loop 
and disposes all operators. But if the task blocks in Function.close, its state 
never switches to FAILED, so the JobMaster never learns the final state and 
cannot trigger a restart.

Task has a {{TaskCancelerWatchDog}} that kills the process if cancellation times 
out, but it doesn't work for a FAILED task.


> Job doesn't change to FAILED if the close function blocks
> ---------------------------------------------------------
>
>                 Key: FLINK-18983
>                 URL: https://issues.apache.org/jira/browse/FLINK-18983
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Task
>    Affects Versions: 1.11.0
>            Reporter: YufeiLiu
>            Priority: Major
>
> If an operator throws an exception, the task breaks out of its processing 
> loop and disposes all operators. But if the task blocks in Function.close, 
> its state never switches to FAILED, so the JobMaster never learns the final 
> state and cannot trigger a restart.
> Task has a {{TaskCancelerWatchDog}} that kills the process if cancellation 
> times out, but it doesn't work for a FAILED task.
> Can we just report the final state and let the JM trigger the cleanup action?
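
To make the proposal concrete, here is a minimal sketch of the idea of bounding the time spent in a user close() so that the final state can always be reported. It is not Flink code and not the actual fix: the class {{BlockedCloseSketch}}, the {{State}} enum, the {{Result}} type and the method {{failAndClose}} are all hypothetical names, and the timeout value is an assumption chosen for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BlockedCloseSketch {

    /** Hypothetical task states; this is not Flink's ExecutionState enum. */
    public enum State { RUNNING, FAILED }

    /** Final state plus whether the user close() finished within the timeout. */
    public static final class Result {
        public final State state;
        public final boolean closeCompleted;

        Result(State state, boolean closeCompleted) {
            this.state = state;
            this.closeCompleted = closeCompleted;
        }
    }

    /**
     * Runs the user close() on a helper thread and waits at most timeoutMillis.
     * FAILED is returned in every case, so a blocked close() cannot prevent
     * the final state from being reported.
     */
    public static Result failAndClose(Runnable userClose, long timeoutMillis) {
        ExecutorService closer = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "close-helper");
            t.setDaemon(true); // a stuck close() must not keep the JVM alive
            return t;
        });
        boolean completed = false;
        Future<?> pending = closer.submit(userClose);
        try {
            pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
            completed = true;
        } catch (TimeoutException e) {
            pending.cancel(true); // interrupts close(); user code may ignore it
        } catch (ExecutionException e) {
            // close() itself threw; the task is failing anyway
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        } finally {
            closer.shutdownNow();
        }
        return new Result(State.FAILED, completed);
    }

    public static void main(String[] args) {
        // A close() that blocks until it is interrupted.
        Runnable blockingClose = () -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException ignored) {
                // give up once interrupted
            }
        };
        Result r = failAndClose(blockingClose, 200);
        System.out.println(r.state + ", closeCompleted=" + r.closeCompleted);
        // prints "FAILED, closeCompleted=false"
    }
}
```

In this sketch the caller gets FAILED back after roughly the timeout even though close() never returns on its own. Whether a stuck close() should then be interrupted, abandoned, or escalated to killing the process (as {{TaskCancelerWatchDog}} does for cancellation) is left open here.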



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
