Thank you, Michal and Mate.

On the same topic, I find the new FlinkStateSnapshot CR quite handy for
managing snapshots in a Kubernetes workflow, and I have been doing some
experiments with it.

In version 1.9 of the operator, we could see the history of snapshots in the
CR status, for example the savepoint history in
savepointInfo.savepointHistory[]. According to the 1.10 documentation, the
savepointInfo fields in the Flink resource CR are deprecated, since
kubernetes.operator.snapshot.resource.enabled is enabled by default. How do
we get the savepointHistory information in that case?

Regards
Lajith

On Wed, Dec 18, 2024 at 11:52 PM Mate Czagany <czmat...@gmail.com> wrote:

> Hi Michal,
>
> I am happy that you found this new feature interesting, and I hope you will
> find it useful if you plan to use it.
>
> 1. I am not sure when or how the deprecated fields will be removed; it
> should happen once the community is satisfied with the new
> FlinkStateSnapshot CRD. For checkpoints, the path is not part of the
> response when we retrieve the checkpoint status [1], so the path is
> retrieved with a different request [2] after the checkpoint is marked as
> completed. However, the checkpoint history is limited
> (web.checkpoints.history), so it is possible that by the time the operator
> tries to fetch the path of the checkpoint, the cache has already dropped
> that checkpoint ID. In these cases, the operator will still mark the
> checkpoint as COMPLETED, but the path will be left blank. [3]
>
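The race described in point 1 can be illustrated with a small Python sketch.
This is only a toy model under stated assumptions, not operator or Flink code:
the CheckpointHistory class, its methods, and the example paths are all
hypothetical, and history_size merely plays the role of
web.checkpoints.history.

```python
from collections import OrderedDict
from typing import Optional


class CheckpointHistory:
    """Toy stand-in for a bounded checkpoint history (hypothetical, not Flink code)."""

    def __init__(self, history_size: int) -> None:
        # history_size plays the role of web.checkpoints.history in this model.
        self.history_size = history_size
        self._entries: "OrderedDict[int, str]" = OrderedDict()

    def complete(self, checkpoint_id: int, path: str) -> None:
        # Record a completed checkpoint, then evict the oldest entries
        # until the history fits within the configured size again.
        self._entries[checkpoint_id] = path
        while len(self._entries) > self.history_size:
            self._entries.popitem(last=False)

    def lookup_path(self, checkpoint_id: int) -> Optional[str]:
        # Returns None when the checkpoint has already been evicted.
        return self._entries.get(checkpoint_id)


history = CheckpointHistory(history_size=2)

# Checkpoint 1 completes and its path is looked up right away.
history.complete(1, "/checkpoints/chk-1")
path_found_in_time = history.lookup_path(1)

# Two more checkpoints complete before a second lookup of checkpoint 1,
# so checkpoint 1 has been evicted from the bounded history by then.
history.complete(2, "/checkpoints/chk-2")
history.complete(3, "/checkpoints/chk-3")
path_found_too_late = history.lookup_path(1)
```

In this model the late lookup finds nothing, which corresponds to the case
where the snapshot is still marked COMPLETED but its path stays blank.
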
> 2. One of the main benefits is that this approach allows users to create
> and manage their snapshots in the Kubernetes workflow that they are already
> familiar with. While having these in the FlinkDeployment/FlinkSessionJob CR
> is handy for taking a quick snapshot, others would like to have a more
> sophisticated way to manage them. This new approach makes it easier to
> manage and list snapshots of any or all of their deployments.
>
> 3. If you have periodic checkpointing enabled, snapshot.resource.enabled is
> set to true, and the FlinkStateSnapshot CRD is installed on your Kubernetes
> cluster, the FlinkDeployment/FlinkSessionJob reconciler should create new
> FlinkStateSnapshot resources based on that interval with empty status.
> Then StateSnapshotReconciler will be notified of this new resource, and
> will trigger the checkpoint and update the status fields when it's
> finished.
>
> If you have any other questions, please let me know. I hope that you will
> be able to find this feature useful.
>
> [1] https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/ops/rest_api/#jobs-jobid-checkpoints-triggerid
> [2] https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/ops/rest_api/#jobs-jobid-checkpoints-details-checkpointid
> [3] https://github.com/apache/flink-kubernetes-operator/blob/091e803a6ae713ebe839742694ab6ca53249c4dd/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/observer/snapshot/StateSnapshotObserver.java#L151
>
> Michas Szacillo (BLOOMBERG/ 919 3RD A) <mszaci...@bloomberg.net> wrote on
> Wed, Dec 18, 2024 at 0:11:
>
> > Hi all! I recently came across the FlinkStateSnapshot feature which I
> > found quite interesting, but I had a couple questions on its use.
> >
> > 1. In favor of FlinkStateSnapshots, I see that both the checkpointInfo and
> > savepointInfo have been deprecated as part of the JobStatus of
> > FlinkDeployment. Does this mean these fields will eventually be completely
> > removed? Additionally, would the community consider adding a path field to
> > the existing checkpointInfo? Currently it only shows the triggerId, which
> > isn't as helpful when trying to find the actual checkpoint path.
> >
> > 2. For FlinkStateSnapshots, what is the major benefit of separating the
> > checkpoint and savepoint info out of the FlinkDeployment status? I can see
> > the benefit of having some separation, but I feel like I may be missing
> > additional context.
> >
> > 3. Do FlinkStateSnapshots trigger checkpoints by default if periodic
> > checkpointing is enabled? I was using Flink v1_17 and recently tried
> > setting kubernetes.operator.periodic.checkpoint.interval. Although the
> > Flink Operator does trigger periodic checkpoints, I did not see the
> > FlinkStateSnapshot status reconcile. I had assumed this status would be
> > populated automatically if the snapshot was configured as in the
> > documentation:
> > https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/snapshots/#checkpoint
> >
> >
> > Appreciate the help!
> >
> > - Michal
>
