[ 
https://issues.apache.org/jira/browse/FLINK-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019866#comment-17019866
 ] 

Guowei Ma commented on FLINK-2646:
----------------------------------

I want to understand what specific scenarios the jira wants to resolve. After 
reading some mail/doc[1][2] and some scenarios I encountered. I summary it as 
below(correct me if I miss something):

Currently, the scenarios that _close_ interface could not be satisfied
 # The user wants to accelerate the failover process. Currently, some users 
implement the _close_ interface to flush the memory data to the external system 
because the job would deal with the bounded stream sometimes. However, it slows 
down the failover process because when canceling the task the Flink would also 
call the close method which might do some heavy i/o processing.
 # The user wants the exactly once semantics for the bounded stream. If the 
user implements the _close_ interface which commits the results some results 
would be committed multi-times because when failover occurs some messages would 
be replayed. If the user implements the _close_ interface which does not commit 
the result some results would be lost.

Because many users implement the _close_ interface to release the resources so 
we could not break this semantics that whenever a task is terminated the 
_close_ method should always be called.

If Flink provides an interface such as `_closeAtEndofStream_` I think we could 
resolve the second problem in most situations. However I think this also needs 
some other efforts such as dedupe the commit at the _close_ or using the 
_finalizeOnMaster_ callback.

 

[1] 
https://lists.apache.org/thread.html/4cf28a9fa3732dfdd9e673da6233c5288ca80b20d58cee130bf1c141%40%3Cuser.flink.apache.org%3E

[2] 
https://docs.google.com/document/d/1SXfhmeiJfWqi2ITYgCgAoSDUv5PNq1T8Zu01nR5Ebog/edit#

> User functions should be able to differentiate between successful close and 
> erroneous close
> -------------------------------------------------------------------------------------------
>
>                 Key: FLINK-2646
>                 URL: https://issues.apache.org/jira/browse/FLINK-2646
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / DataStream
>    Affects Versions: 0.10.0
>            Reporter: Stephan Ewen
>            Assignee: Kostas Kloudas
>            Priority: Major
>              Labels: usability
>
> Right now, the {{close()}} method of rich functions is invoked in case of 
> proper completion, and in case of canceling in case of error (to allow for 
> cleanup).
> In certain cases, the user function needs to know why it is closed, whether 
> the task completed in a regular fashion, or was canceled/failed.
> I suggest to add a method {{closeAfterFailure()}} to the {{RichFunction}}. By 
> default, this method calls {{close()}}. The runtime is the changed to call 
> {{close()}} as part of the regular execution and {{closeAfterFailure()}} in 
> case of an irregular exit.
> Because by default all cases call {{close()}} the change would not be API 
> breaking.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to