[ https://issues.apache.org/jira/browse/FLINK-10298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16607460#comment-16607460 ]
陈梓立 edited comment on FLINK-10298 at 9/7/18 6:13 PM:
-----------------------------------------------------

Hi [~isunjin],

Thanks for raising this JIRA! After reading the design document, it seems that the main issue this design addresses is how a downstream task recovers when upstream data is missing or a data consumption exception occurs. Have you read [FLINK-6227], which introduces the DataConsumptionException for downstream task failures? I wonder whether it would resolve this case.

was (Author: tison):
Hi [~isunjin],

Thanks for raising this JIRA! After reading the design document, it seems that the main issue this design addresses is how a downstream task recovers from an upstream DataConsumptionException. Have you read [FLINK-6227], which introduces the DataConsumptionException for downstream task failures? I wonder whether it would resolve this case.

> Batch Job Failover Strategy
> ---------------------------
>
>                 Key: FLINK-10298
>                 URL: https://issues.apache.org/jira/browse/FLINK-10298
>             Project: Flink
>          Issue Type: Sub-task
>          Components: JobManager
>            Reporter: JIN SUN
>            Assignee: JIN SUN
>            Priority: Major
>
> The new failover strategy needs to handle failures according to their
> failure type. It orchestrates all the logic described in this
> [document|https://docs.google.com/document/d/1FdZdcA63tPUEewcCimTFy9Iz2jlVlMRANZkO4RngIuk/edit].
> We can put the logic in the onTaskFailure method of the FailoverStrategy
> interface, with the logic inline:
> {code:java}
> public void onTaskFailure(Execution taskExecution, Throwable cause) {
>     //1. Get the throwable type
>     //2. If the type is NonrecoverableType, fail the job
>     //3. If the type is PatritionDataMissingError, do revocation
>     //4. If the type is EnvironmentError, check the blacklist
>     //5. Other failure types are recoverable, but we need to remember the
>     //   count of the failures;
>     //6. if it exceeds the threshold, fail the job
> }{code}
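
For discussion, the six steps in the description could be sketched roughly as below. This is only an illustration and not Flink code: the class BatchFailoverSketch, the three marker exception classes, the Action enum and the maxRecoverableFailures parameter are all hypothetical names made up for this sketch, and it assumes a single job-wide failure counter, which the description does not specify. Instead of acting on an Execution it just returns the action the strategy would take.

```java
/** Hypothetical sketch of the failure-type dispatch; none of these names are Flink API. */
class BatchFailoverSketch {

    // Hypothetical marker exceptions standing in for the failure types in the issue.
    static class NonRecoverableError extends RuntimeException {}
    static class PartitionDataMissingError extends RuntimeException {}
    static class EnvironmentError extends RuntimeException {}

    /** Actions the strategy could take (assumed names). */
    enum Action { FAIL_JOB, REVOKE_UPSTREAM, CHECK_BLACKLIST, RESTART_TASK }

    private final int maxRecoverableFailures; // assumed job-wide threshold
    private int recoverableFailures = 0;      // step 5: remembered failure count

    BatchFailoverSketch(int maxRecoverableFailures) {
        this.maxRecoverableFailures = maxRecoverableFailures;
    }

    Action onTaskFailure(Throwable cause) {
        // Steps 1-4: inspect the throwable type and dispatch on it.
        if (cause instanceof NonRecoverableError) {
            return Action.FAIL_JOB;
        }
        if (cause instanceof PartitionDataMissingError) {
            return Action.REVOKE_UPSTREAM;
        }
        if (cause instanceof EnvironmentError) {
            return Action.CHECK_BLACKLIST;
        }
        // Steps 5-6: any other failure is recoverable until the count
        // exceeds the threshold, at which point the job fails.
        recoverableFailures++;
        return recoverableFailures > maxRecoverableFailures
                ? Action.FAIL_JOB
                : Action.RESTART_TASK;
    }
}
```

With a threshold of 2, for example, the first two generic failures would map to RESTART_TASK and the third to FAIL_JOB. Whether the counter should be kept per task, per region or per job is an open question that this sketch does not try to answer.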