Hi, we are attempting to migrate our Flink cluster to Kubernetes and are looking into options for automating job upgrades. Has anyone here done this with an init container? Or is there a simpler way?
Here is the flow we have in mind:

0: Assume we have a job manager with a few task managers running in a StatefulSet, managed with Helm.
1: A new Helm chart is published, and Helm attempts the upgrade. Since it's a StatefulSet, the new versions of the job manager and task managers are started while the old ones are still running.
2: The job manager pod has an init container whose purpose is to find the currently running job manager with the previous version of the job (either via ZooKeeper or via a Kubernetes service that points to the currently running job manager). Once it finds it, it runs cancel with savepoint using the Flink CLI and passes the savepoint URL to the main container via a volume.
3: The job manager container starts, finds the savepoint, and restores the new version of the job with the state from the savepoint.
4: The new pods pass their health checks, so the old pods are destroyed by Kubernetes.

What happens if there is no previous job manager running? The init container sees that and simply exits without doing any other work.

Caveat: most of the solutions I have seen use operators, which feel quite a bit more complex. Yet since I haven't found any solution using an init container, I'm guessing I'm missing something; I just can't figure out what.
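
To make step 2 more concrete, here is a rough sketch of what the init container could run. It is only an illustration under several assumptions: it talks to Flink's REST API (`GET /jobs/overview`, `POST /jobs/:jobid/savepoints` with `cancel-job`, then polling the trigger id) rather than the Flink CLI mentioned above, it assumes a single running job, and it treats any connection error as "no previous job manager". The function names `rest_call`, `cancel_with_savepoint`, and `hand_off`, and the idea of a plain text handoff file on the shared volume, are made up for this example.

```python
import json
import time
import urllib.request
from pathlib import Path


def rest_call(base_url, path, payload=None):
    """GET (no payload) or POST (JSON payload) against the Flink REST API."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(
        base_url + path, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def cancel_with_savepoint(base_url, savepoint_dir, call=rest_call, poll_seconds=2.0):
    """Cancel the running job with a savepoint; return its location or None."""
    try:
        jobs = call(base_url, "/jobs/overview")["jobs"]
    except OSError:
        # Assumption: an unreachable job manager means there is no previous
        # deployment. A real setup may want to tell "absent" from "broken".
        return None

    running = [job for job in jobs if job["state"] == "RUNNING"]
    if not running:
        return None
    job_id = running[0]["jid"]  # assumption: only one job per cluster

    # Trigger the savepoint and ask Flink to cancel the job once it is taken.
    trigger = call(
        base_url,
        f"/jobs/{job_id}/savepoints",
        {"target-directory": savepoint_dir, "cancel-job": True},
    )
    status_path = f"/jobs/{job_id}/savepoints/{trigger['request-id']}"

    # The operation is asynchronous, so poll until it reports COMPLETED.
    while True:
        result = call(base_url, status_path)
        if result["status"]["id"] == "COMPLETED":
            break
        time.sleep(poll_seconds)

    operation = result["operation"]
    if "failure-cause" in operation:
        raise RuntimeError(f"savepoint failed: {operation['failure-cause']}")
    return operation["location"]


def hand_off(base_url, savepoint_dir, handoff_file, call=rest_call, poll_seconds=2.0):
    """Write the savepoint location to the shared volume for the main container."""
    location = cancel_with_savepoint(base_url, savepoint_dir, call, poll_seconds)
    if location is not None:
        Path(handoff_file).write_text(location)
    return location
```

The init container would call something like `hand_off("http://flink-jobmanager:8081", "s3://my-bucket/savepoints", "/handoff/savepoint-path")`, where the service name, bucket, and mount path are placeholders. The main container could then check whether the handoff file exists and, if so, pass its contents to `flink run -s <savepointPath>`; if the file is missing, it starts the job without state.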