Hi Robin,

This is a good question, but Flink can't do rolling upgrades.
I'll try to explain the cost of supporting rolling upgrades in Flink.

1. Tasks within a region are connected by shuffle connections, and to keep
the data processed by upstream and downstream tasks in the same region
consistent, these tasks need to fail over/restart at the same time. So for
jobs with keyBy/hash shuffles, the cost of a rolling upgrade would not be
much different from the cost of stopping and restarting the whole job.

2. To keep a rolling upgrade correct, many details would need to be
handled, such as the checkpoint coordinator, failover handling, and the
operator coordinator.

But, IMO, rolling upgrades of jobs are a worthwhile thing to do.

Best,
Weihua

On Wed, Jun 29, 2022 at 3:52 PM Robin Cassan <robin.cas...@contentsquare.com> wrote:

> Hi all! We are running a Flink cluster on Kubernetes and deploying a
> single job on it through "flink run <jar>". Whenever we want to modify the
> jar, we cancel the job and run the "flink run" command again with the new
> jar and the retained checkpoint URL from the first run.
> This works well, but it adds some unavoidable downtime for each update:
> - Downtime between the "cancel" and the "run"
> - Downtime during state restoration
>
> We aim to reduce this downtime to a minimum and are wondering: is there
> a way to submit a new version of a job without completely stopping the
> old one, having each TM pick up the new jar one at a time and waiting for
> its state to be restored before moving on to the next TM? This would limit
> the downtime to a single TM at a time.
>
> Otherwise, we could try to submit a second job and let it restore before
> canceling the old one, but this raises complex synchronisation issues,
> especially since the second job would be restoring from the first job's
> retained (incremental) checkpoint, which seems like a bad idea...
>
> What would be the best way to reduce downtime in this scenario, in your
> opinion?
>
> Thanks!
>
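
If it helps, here is a rough sketch of the stop/start cycle described
above, using standard Flink CLI commands; the savepoint directory, jar
name, and job id are placeholders rather than values from this thread:

    # Take a savepoint and then stop the running job
    flink stop --savepointPath s3://my-bucket/savepoints <job-id>

    # Resubmit the new jar (detached), restoring from that savepoint,
    # or from a retained checkpoint path as in the workflow above
    flink run -d -s s3://my-bucket/savepoints/savepoint-xxxx new-job.jar

Either way, the gap between stop and run plus the state-restore time
remains, which is exactly the downtime being discussed here.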