[ https://issues.apache.org/jira/browse/FLINK-10298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
JIN SUN updated FLINK-10298:
----------------------------
    Fix Version/s: 1.7.0

> Batch Job Failover Strategy
> ---------------------------
>
>                 Key: FLINK-10298
>                 URL: https://issues.apache.org/jira/browse/FLINK-10298
>             Project: Flink
>          Issue Type: Sub-task
>          Components: JobManager
>            Reporter: JIN SUN
>            Assignee: JIN SUN
>            Priority: Major
>             Fix For: 1.7.0
>
>
> The new failover strategy needs to handle failures according to their
> failure type. It orchestrates all of the logic described in this
> [document|https://docs.google.com/document/d/1FdZdcA63tPUEewcCimTFy9Iz2jlVlMRANZkO4RngIuk/edit].
> We can put the logic in the onTaskFailure method of the FailoverStrategy
> interface, along these lines:
> {code:java}
> public void onTaskFailure(Execution taskExecution, Throwable cause) {
>     // 1. Get the throwable type.
>     // 2. If the type is NonrecoverableType, fail the job.
>     // 3. If the type is PartitionDataMissingError, do revocation.
>     // 4. If the type is EnvironmentError, check the blacklist.
>     // 5. Other failure types are recoverable, but we need to remember
>     //    the count of such failures;
>     // 6. if the count exceeds the threshold, fail the job.
> }{code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
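
For illustration only, the six steps listed in the issue could be sketched roughly as below. This is not Flink code: the class `BatchFailoverSketch`, the `Action` enum, and the three exception classes are hypothetical stand-ins for the failure types named in the issue, and the `Execution` parameter is dropped so that the example is self-contained. The method returns the decision instead of acting on it, and what "revocation" and "check blacklist" actually do is left to the linked design document.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of the dispatch described in the issue; not a Flink class. */
public class BatchFailoverSketch {

    /** Decision taken for a failure (names are assumptions, not Flink API). */
    public enum Action { FAIL_JOB, REVOKE, CHECK_BLACKLIST, RESTART_TASK }

    // Hypothetical marker exceptions standing in for the failure types in the issue.
    public static class NonRecoverableError extends RuntimeException {
        public NonRecoverableError(String message) { super(message); }
    }

    public static class PartitionDataMissingError extends RuntimeException {
        public PartitionDataMissingError(String message) { super(message); }
    }

    public static class EnvironmentError extends RuntimeException {
        public EnvironmentError(String message) { super(message); }
    }

    private final int maxRecoverableFailures;
    private final AtomicInteger recoverableFailures = new AtomicInteger();

    public BatchFailoverSketch(int maxRecoverableFailures) {
        this.maxRecoverableFailures = maxRecoverableFailures;
    }

    /** Steps 1-6 from the issue, returning the decision rather than executing it. */
    public Action onTaskFailure(Throwable cause) {
        // Steps 1-2: a non-recoverable failure fails the job immediately.
        if (cause instanceof NonRecoverableError) {
            return Action.FAIL_JOB;
        }
        // Step 3: missing partition data triggers revocation.
        if (cause instanceof PartitionDataMissingError) {
            return Action.REVOKE;
        }
        // Step 4: environment problems lead to a blacklist check.
        if (cause instanceof EnvironmentError) {
            return Action.CHECK_BLACKLIST;
        }
        // Steps 5-6: everything else is recoverable until the count exceeds the threshold.
        int count = recoverableFailures.incrementAndGet();
        return count > maxRecoverableFailures ? Action.FAIL_JOB : Action.RESTART_TASK;
    }
}
```

With a threshold of 2, for example, the first two generic failures would yield `RESTART_TASK` and the third `FAIL_JOB`. Only the generic failures are counted in this sketch; whether the other failure types should also count toward the threshold is not stated in the issue.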