One thing you could consider is a mutator that detects when a failover is
happening and then updates the CR to point to the right snapshot to restore
from:
https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.12/docs/operations/plugins/#custom-flink-resource-mutators
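Roughly, such a mutator could look like the sketch below. This is only an
illustration, not the exact operator API: the mutator package and method set
vary by operator version, and the "failover/restore-latest" label and the
resolveLatestSnapshotPath() helper are made up here.

```java
import java.util.Optional;

import org.apache.flink.kubernetes.operator.api.FlinkDeployment;
import org.apache.flink.kubernetes.operator.api.FlinkSessionJob;
import org.apache.flink.kubernetes.operator.mutator.FlinkResourceMutator;

/**
 * Sketch: when a deployment is marked as a failover target (via a made-up
 * label), rewrite job.initialSavepointPath before submission so the job
 * restores from the latest snapshot.
 */
public class FailoverSavepointMutator implements FlinkResourceMutator {

    @Override
    public FlinkDeployment mutateDeployment(FlinkDeployment deployment) {
        var labels = deployment.getMetadata().getLabels();
        if (labels != null && "true".equals(labels.get("failover/restore-latest"))) {
            deployment
                .getSpec()
                .getJob()
                .setInitialSavepointPath(resolveLatestSnapshotPath(deployment));
        }
        return deployment;
    }

    @Override
    public FlinkSessionJob mutateSessionJob(
            FlinkSessionJob sessionJob, Optional<FlinkDeployment> session) {
        return sessionJob; // leave session jobs untouched in this sketch
    }

    /** Hypothetical helper: list your S3 savepoint prefix and pick the newest. */
    private String resolveLatestSnapshotPath(FlinkDeployment deployment) {
        throw new UnsupportedOperationException("snapshot lookup not shown here");
    }
}
```

If I read the linked page correctly, the mutator is packaged as an operator
plugin and registered through a META-INF/services entry, so the job specs
themselves would not need to change.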
Hello,
That would indeed work, but it requires knowing in advance the last jobID for
that particular job and changing the spec submitted to the destination cluster.
We aim for zero-touch job failover from one k8s cluster to another.
Our clusters are multi-node and multi-AZ, but they run critical business workloads.
Got it.
I'd say what you want to achieve is something like a "latest-savepoint"
symlink that always points to the latest written savepoint, so jobs always
start from that.
Achieving this would require some manual work, though. IIRC, you cannot set a
jobID via the operator.
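Since S3 has no real symlinks, the "latest-savepoint" pointer usually becomes
a small lookup step in your deployment tooling. A sketch using the AWS SDK for
Java v2 (bucket and prefix names are invented), relying on the fact that every
completed savepoint directory contains a _metadata file:

```java
import java.util.Comparator;
import java.util.Optional;

import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.ListObjectsV2Request;
import software.amazon.awssdk.services.s3.model.S3Object;

public class LatestSavepoint {

    /** Returns the s3:// path of the most recently written savepoint, if any. */
    static Optional<String> find(S3Client s3, String bucket, String prefix) {
        ListObjectsV2Request req = ListObjectsV2Request.builder()
                .bucket(bucket)
                .prefix(prefix) // e.g. "savepoints/my-job/"
                .build();
        // The newest _metadata object marks the savepoint to restore from;
        // strip its suffix to get the savepoint directory.
        return s3.listObjectsV2Paginator(req).contents().stream()
                .filter(o -> o.key().endsWith("/_metadata"))
                .max(Comparator.comparing(S3Object::lastModified))
                .map(o -> "s3://" + bucket + "/"
                        + o.key().substring(0, o.key().length() - "/_metadata".length()));
    }

    public static void main(String[] args) {
        try (S3Client s3 = S3Client.create()) {
            find(s3, "my-flink-bucket", "savepoints/my-job/")
                    .ifPresent(System.out::println);
        }
    }
}
```

The result would then be fed into job.initialSavepointPath (or into a mutator
like the one suggested earlier in the thread) before submitting to the
destination cluster.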
Att,
Pedro Mázala
Using the Flink k8s operator, you may use the YAML property
job.initialSavepointPath to set the path you want to start your pipeline
from. This would be the full path, including the jobid; the restarted job
then gets a newly generated ID.
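As a sketch, in the FlinkDeployment spec that would look something like the
following (bucket, names, and the savepoint directory are invented):

```yaml
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: my-job
spec:
  # ...image, flinkVersion, taskManager, etc...
  job:
    jarURI: local:///opt/flink/usrlib/my-job.jar
    upgradeMode: savepoint
    # Full savepoint path, including the directory carrying the old job's
    # ID; the restored job still gets a fresh jobID.
    initialSavepointPath: s3://my-flink-bucket/savepoints/savepoint-093404-1bafb6bcca73
```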
To avoid maintenance issues like this one, a multi-node cluster
Hello
I run Flink (v1.20) on k8s using the native integration and the k8s operator
(v1.30); we keep savepoints and checkpoints in S3.
We'd like to be able to continue running the same jobs (with the same config,
same image, using the same sinks and sources, connecting to Kafka using the same