zhuzhurk commented on a change in pull request #9113: [FLINK-13222] [runtime] Add documentation for failover strategy option
URL: https://github.com/apache/flink/pull/9113#discussion_r305760957

##########
File path: docs/dev/task_failure_recovery.md
##########
@@ -264,4 +267,54 @@ The cluster defined restart strategy is used.
 This is helpful for streaming programs which enable checkpointing.
 By default, a fixed delay restart strategy is chosen if there is no other restart strategy defined.
+## Failover Strategies
+
+Flink supports different failover strategies which can be configured via the configuration parameter
+*jobmanager.execution.failover-strategy* in Flink's configuration file `flink-conf.yaml`.
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+      <th class="text-left" style="width: 50%">Failover Strategy</th>
+      <th class="text-left">Value for jobmanager.execution.failover-strategy</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+        <td>Restart all</td>
+        <td>full</td>
+    </tr>
+    <tr>
+        <td>Restart pipelined region</td>
+        <td>region</td>
+    </tr>
+  </tbody>
+</table>
+
+### Restart All Failover Strategy
+
+This strategy restarts all tasks in the job to recover from a task failure.
+
+### Restart Pipelined Region Failover Strategy
+
+This strategy groups tasks into disjoint regions. When a task failure is detected,
+this strategy computes the smallest set of regions that must be restarted to recover from the failure.

Review comment:
   I think here we mean `one set` of regions to restart for a task failure.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services