+1 in general

What is the default in batch, though? No restarts? I always found that
somewhat uncommon.
Should we also change that part, if we are changing the default anyway?


On Fri, Aug 30, 2019 at 2:35 PM Till Rohrmann <trohrm...@apache.org> wrote:

> Hi everyone,
>
> I wanted to discuss how to simplify Flink's cluster-level RestartStrategy
> configuration [1]. Currently, Flink's behaviour with respect to configuring
> the `RestartStrategies` is quite complicated and convoluted. The reason
> for this is that we evolved the way it has been configured and wanted to
> keep it backwards compatible. Due to this, we currently have the following
> behaviour:
>
> * If the config option `restart-strategy` is configured, then Flink uses
> this `RestartStrategy` (so far so simple)
> * If the config option `restart-strategy` is not configured, then
> ** If `restart-strategy.fixed-delay.attempts` or
> `restart-strategy.fixed-delay.delay` are defined, then instantiate
> `FixedDelayRestartStrategy(restart-strategy.fixed-delay.attempts,
> restart-strategy.fixed-delay.delay)`
> ** If `restart-strategy.fixed-delay.attempts` and
> `restart-strategy.fixed-delay.delay` are not defined, then
> *** If checkpointing is disabled, then choose `NoRestartStrategy`
> *** If checkpointing is enabled, then choose
> `FixedDelayRestartStrategy(Integer.MAX_VALUE, "0 s")`
>
> I would like to simplify the configuration by removing the "If
> `restart-strategy.fixed-delay.attempts` or
> `restart-strategy.fixed-delay.delay` are defined, then" condition. That way,
> the logic would be the following:
>
> * If the config option `restart-strategy` is configured, then Flink uses
> this `RestartStrategy`
> * If the config option `restart-strategy` is not configured, then
> ** If checkpointing is disabled, then choose `NoRestartStrategy`
> ** If checkpointing is enabled, then choose
> `FixedDelayRestartStrategy(Integer.MAX_VALUE, "0 s")`
>
> That way we retain the user friendliness that jobs restart if the user
> enabled checkpointing and we make it clear that any
> `restart-strategy.fixed-delay.xyz` setting will only be respected if
> `restart-strategy` has been set to `fixed-delay`.
>
> This simplification would, however, change Flink's behaviour and might
> break existing setups. Since we introduced `RestartStrategies` with Flink
> 1.0.0 and deprecated the prior configuration mechanism which enables
> restarting if either the `attempts` or the `delay` has been set, I think
> that the number of broken jobs should be minimal if not non-existent.
>
> I'm sure that one can simplify the way RestartStrategies are
> programmatically configured as well but for the sake of simplicity/scoping
> I'd like to not touch it right away.
>
> What do you think about this behaviour change?
>
> [1] https://issues.apache.org/jira/browse/FLINK-13921
>
> Cheers,
> Till
>
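
For reference, the two resolution orders described in the quoted message can be sketched side by side. This is only an illustration of the decision logic as stated in the thread, not Flink's actual implementation: the class `RestartStrategyResolution`, its methods, and the use of plain strings in place of real strategy objects are all hypothetical. The thread also does not say which value is used when only one of `attempts` or `delay` is set, so the sketch prints a `<default>` placeholder there.

```java
import java.util.Map;

// Hypothetical sketch of the resolution logic described in the thread.
// Strategies are represented as plain strings; this is not Flink code.
final class RestartStrategyResolution {

    // Config keys as quoted in the thread.
    static final String STRATEGY = "restart-strategy";
    static final String ATTEMPTS = "restart-strategy.fixed-delay.attempts";
    static final String DELAY = "restart-strategy.fixed-delay.delay";

    static final String NO_RESTART = "NoRestartStrategy";
    static final String RESTART_FOREVER =
            "FixedDelayRestartStrategy(Integer.MAX_VALUE, \"0 s\")";

    /** Current behaviour: the fixed-delay keys alone can select a strategy. */
    static String resolveCurrent(Map<String, String> config, boolean checkpointing) {
        if (config.containsKey(STRATEGY)) {
            return config.get(STRATEGY);
        }
        if (config.containsKey(ATTEMPTS) || config.containsKey(DELAY)) {
            // Legacy path that the proposal removes. "<default>" is a
            // placeholder; the thread does not state the fallback values.
            return "FixedDelayRestartStrategy("
                    + config.getOrDefault(ATTEMPTS, "<default>") + ", "
                    + config.getOrDefault(DELAY, "<default>") + ")";
        }
        return fallback(checkpointing);
    }

    /** Proposed behaviour: only `restart-strategy` selects a strategy. */
    static String resolveProposed(Map<String, String> config, boolean checkpointing) {
        if (config.containsKey(STRATEGY)) {
            return config.get(STRATEGY);
        }
        return fallback(checkpointing);
    }

    // Shared fallback: restart indefinitely only if checkpointing is enabled.
    private static String fallback(boolean checkpointing) {
        return checkpointing ? RESTART_FOREVER : NO_RESTART;
    }
}
```

The only case where the two methods disagree is a config that sets one of the `fixed-delay` keys without setting `restart-strategy`, which is exactly the class of setups the proposal could break.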
