Thanks for clarifying that, Amit. Rolling updates in which JobManagers and TaskManagers of different Flink versions run in the same Flink cluster are not supported.
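
Since a mixed-version cluster is not an option, the savepoint-based sequence from earlier in this thread is the route to follow. Below is a rough sketch of it, not a definitive procedure. It only builds and prints the Flink CLI invocations rather than executing anything. The job ID, savepoint locations, and jar name are placeholders, and it assumes the savepoint directory sits on storage that both the old and the new cluster can reach:

```python
# Sketch only: builds the Flink CLI commands for a savepoint-based upgrade.
# All concrete values used below (job ID, paths, jar name) are hypothetical.


def stop_with_savepoint_cmd(job_id, savepoint_dir, flink_bin="./bin/flink"):
    """Run against the OLD cluster: stop the job and write a savepoint."""
    return [flink_bin, "stop", "--savepointPath", savepoint_dir, job_id]


def run_from_savepoint_cmd(savepoint_path, job_jar, flink_bin="./bin/flink"):
    """Run against the NEW cluster: start the job from that savepoint."""
    return [flink_bin, "run", "--fromSavepoint", savepoint_path, job_jar]


def upgrade_steps(job_id, savepoint_dir, savepoint_path, job_jar):
    """Ordered steps; step 2 stays a comment because it depends on your chart."""
    return [
        " ".join(stop_with_savepoint_cmd(job_id, savepoint_dir)),
        "# deploy a separate cluster running the new Flink version",
        " ".join(run_from_savepoint_cmd(savepoint_path, job_jar)),
    ]


if __name__ == "__main__":
    # Placeholder values; the real savepoint path is reported by `flink stop`.
    for step in upgrade_steps(
        job_id="<job-id>",
        savepoint_dir="<shared-savepoint-dir>",
        savepoint_path="<savepoint-path-from-step-1>",
        job_jar="<job.jar>",
    ):
        print(step)
```

How the CLI reaches the JobManager inside Kubernetes, and how the new cluster gets deployed, depend on your setup and are left out here.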
@Yang Wang <danrtsey...@gmail.com> Do you have any recommendations you could share in this regard?

Best,
Matthias

On Fri, Aug 27, 2021 at 2:44 PM Amit Bhatia <bhatia.amit1...@gmail.com> wrote:

> Hi Matthias,
>
> What you mention is a little tricky. When we create a new cluster it will
> have its own volume (PVC), so moving savepoint/checkpoint data from the
> volume (PVC) of the older cluster to the newer cluster is a manual task.
> I am also not sure whether the savepoint/checkpoint data needs to be copied
> to the newer Flink cluster before Flink starts. This approach is more like
> a blue/green upgrade strategy.
>
> I wanted to understand whether Flink supports rollingUpdate, where we
> update TaskManager and JobManager pods one by one, and what the impact is
> when JobManager and TaskManager pods are on different versions during the
> upgrade. Also, what is the impact of the recreate strategy in the same
> context?
>
> Regards,
> Amit
>
> On Fri, Aug 27, 2021 at 5:32 PM Matthias Pohl <matth...@ververica.com>
> wrote:
>
>> The upgrade approach mentioned in my previous answer should also work in
>> the context of k8s and pods: a Flink cluster with the newer version should
>> be created before migrating the job using a savepoint. But maybe I
>> misunderstand your question. Do you have something in mind where you
>> upgrade each pod individually, i.e. operating TaskManagers and
>> JobManagers with different Flink versions in the same Flink cluster?
>>
>> Best,
>> Matthias
>>
>> On Fri, Aug 27, 2021 at 11:05 AM Amit Bhatia <bhatia.amit1...@gmail.com>
>> wrote:
>>
>>> Hi Matthias,
>>>
>>> Thanks for the information, but this upgrade approach looks like it
>>> targets a native (physical/virtual) deployment.
>>> I want to understand the upgrade strategies for Kubernetes deployments
>>> where Flink is running in pods. If you could help in that area it would
>>> be great.
>>>
>>> Regards,
>>> Amit Bhatia
>>>
>>> On Thu, Aug 26, 2021 at 5:25 PM Matthias Pohl <matth...@ververica.com>
>>> wrote:
>>>
>>>> Hi Amit,
>>>> upgrading Flink versions means that you should stop your jobs with a
>>>> savepoint first. A new cluster with the new Flink version can be
>>>> deployed next. Then, this cluster can be used to start the jobs from the
>>>> previously created savepoints. Each job should pick up the work from
>>>> where it stopped. See [1] for further details on how to upgrade Flink.
>>>> I'm not sure about any Helm-specifics here. But I'm gonna pull Austin
>>>> into the thread. He might have more insights to share.
>>>>
>>>> Best,
>>>> Matthias
>>>>
>>>> [1]
>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/upgrading/#upgrading-the-flink-framework-version
>>>>
>>>> On Thu, Aug 26, 2021 at 9:10 AM Amit Bhatia <bhatia.amit1...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> We are using Flink 1.13.2 with the Kubernetes HA solution provided by
>>>>> Flink. We have created a deployment for JobManager and TaskManager with
>>>>> the option to deploy multiple replicas, and both are bundled in a
>>>>> single Helm chart.
>>>>> We have the following questions regarding Flink upgrade strategies;
>>>>> kindly help us answer them:
>>>>>
>>>>> 1) What upgrade strategies are supported by Flink
>>>>> (RollingUpdate/Recreate) and which one is recommended for production
>>>>> use?
>>>>>
>>>>> 2) During a Flink upgrade from version A to version B, if we are using
>>>>> rollingUpdate, then at some point multiple versions of Flink JMs & TMs
>>>>> might be running. Can that cause any corruption/failure for running
>>>>> jobs?
>>>>>
>>>>> 3) During a Flink upgrade from version A to version B, if we use
>>>>> recreate, then at some point all JMs may already be updated to the new
>>>>> version while TMs are still updating, which means TMs are running with
>>>>> different versions. Will this cause any corruption/failure for running
>>>>> jobs?
>>>>>
>>>>> Regards,
>>>>> Amit Bhatia