Hi Matthias,

What you describe is a little tricky. A new cluster gets its own volume (PVC), so moving the savepoint/checkpoint data from the old cluster's PVC to the new cluster is a manual task. I am also not sure whether the savepoint/checkpoint data has to be copied to the new Flink cluster before Flink starts. This approach is more like a blue/green upgrade strategy.
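
To make the blue/green sequence concrete, here is a minimal sketch of the two Flink CLI calls involved. It only builds the command lines and does not run them. The job ID, savepoint locations and jar name are hypothetical placeholders, and it assumes the savepoint ends up somewhere the new cluster can read (copied over, or on shared storage):

```python
from typing import List


def upgrade_commands(job_id: str, savepoint_dir: str,
                     restored_savepoint: str, job_jar: str) -> List[List[str]]:
    """Build (not run) the Flink CLI calls for a savepoint-based upgrade.

    All arguments are placeholders supplied by the caller; nothing here
    is specific to a particular cluster or Helm chart.
    """
    return [
        # Step 1, against the OLD cluster: stop the job and take a savepoint.
        ["flink", "stop", "--savepointPath", savepoint_dir, job_id],
        # Step 2 (not shown): deploy the new cluster with the new Flink version
        # and make the savepoint readable from it.
        # Step 3, against the NEW cluster: resume the job from that savepoint.
        ["flink", "run", "--fromSavepoint", restored_savepoint, job_jar],
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for cmd in upgrade_commands("<job-id>", "/savepoints",
                                "/savepoints/<savepoint-dir>", "my-job.jar"):
        print(" ".join(cmd))
```

The path passed to `--fromSavepoint` has to be readable from the new cluster, which is why the separate-PVC setup forces either a manual copy or a shared location.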

I wanted to understand whether Flink supports a RollingUpdate strategy, where we update the TaskManager and JobManager pods one by one, and what the impact is when JobManager and TaskManager pods run different versions during the upgrade. I would also like to understand the impact of the Recreate strategy in the same context.

Regards,
Amit

On Fri, Aug 27, 2021 at 5:32 PM Matthias Pohl <matth...@ververica.com> wrote:

> The upgrade approach mentioned in my previous answer should also work in
> the context of k8s and pods: Creating a Flink cluster having the newer
> version should be done before migrating the job using a savepoint. But
> maybe, I misunderstand your question. Do you have something in mind where
> you upgrade each pod individually, i.e. operating TaskManagers and
> JobManagers with different Flink versions in the same Flink cluster?
>
> Best,
> Matthias
>
> On Fri, Aug 27, 2021 at 11:05 AM Amit Bhatia <bhatia.amit1...@gmail.com>
> wrote:
>
>> Hi Matthias,
>>
>> Thanks for the information but this upgrade is looking like on native
>> (physical/virtual) deployment.
>> I want to understand the upgrade strategies on kubernetes deployments
>> where Flink is running in pods. If you could help in that area it would be
>> great.
>>
>> Regards,
>> Amit Bhatia
>>
>> On Thu, Aug 26, 2021 at 5:25 PM Matthias Pohl <matth...@ververica.com>
>> wrote:
>>
>>> Hi Amit,
>>> upgrading Flink versions means that you should stop your jobs with a
>>> savepoint first. A new cluster with the new Flink version can be deployed
>>> next. Then, this cluster can be used to start the jobs from the previously
>>> created savepoints. Each job should pick up the work from where it stopped.
>>> See [1] for further details on how to upgrade Flink.
>>> I'm not sure about any Helm-specifics here. But I'm gonna pull Austin
>>> into the thread. He might have more insights to share.
>>>
>>> Best,
>>> Matthias
>>>
>>> [1]
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/upgrading/#upgrading-the-flink-framework-version
>>>
>>> On Thu, Aug 26, 2021 at 9:10 AM Amit Bhatia <bhatia.amit1...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> We are using Flink 1.13.2 with Kubernetes HA solution provided by
>>>> flink. We have created a deployment for JobManager and TaskManager with
>>>> option to deploy multiple replicas and the same is bundled in a single helm
>>>> chart.
>>>> So we have below queries regarding Flink upgrade strategies, kindly
>>>> help us to answer below queries:
>>>>
>>>> 1) What upgrade strategies are supported by Flink
>>>> (RollingUpdate/Recreate) and which one is recommended for production use?
>>>>
>>>> 2) During Flink upgrade from version A to version B, if we are using
>>>> rollingUpdate then at some point of time multiple versions of Flink JMs &
>>>> TMs might be running so does that can cause any corruption/failure for
>>>> running Jobs ?
>>>>
>>>> 3) During Flink upgrade from version A to version B, If we use recreate
>>>> then at some point of time if all JMs gets updated to a new version and TMs
>>>> are still updating which means TMs are running with different versions then
>>>> will this cause any corruption/failure for running Jobs?
>>>>
>>>> Regards,
>>>> Amit Bhatia