Hi Michal,

I am happy that you found this new feature interesting, and I hope it will be useful if you plan to use it.
1. I am not sure when or how the deprecated fields will be removed. I expect it to happen once the community is satisfied with the new FlinkStateSnapshot CRD. As for checkpoints, the path is not part of the response when we retrieve the checkpoint status [1], so the operator retrieves it with a separate request [2] after the checkpoint is marked as completed. The checkpoint history is limited (web.checkpoints.history), so by the time the operator tries to fetch the path, the history may already have dropped that checkpoint ID. In that case the operator still marks the checkpoint as COMPLETED, but the path is left blank. [3]

2. One of the main benefits is that this approach lets users create and manage their snapshots in the Kubernetes workflow they are already familiar with. Having this information in the FlinkDeployment/FlinkSessionJob CR is handy for taking a quick snapshot, but other users want a more sophisticated way to manage snapshots. The new approach makes it easier to manage and list the snapshots of any or all of their deployments.

3. If you have periodic checkpointing enabled, snapshot.resource.enabled is set to true, and the FlinkStateSnapshot CRD is installed on your Kubernetes cluster, the FlinkDeployment/FlinkSessionJob reconciler should create new FlinkStateSnapshot resources with an empty status at that interval. The StateSnapshotReconciler is then notified of each new resource, triggers the checkpoint, and updates the status fields when the checkpoint has finished.

If you have any other questions, please let me know. I hope that you will be able to find this feature useful.
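To illustrate the two-step lookup from point 1, here is a minimal Python sketch. It is not the operator's actual code (that is the Java class linked in [3]); it only mirrors the flow described above. The function names `resolve_checkpoint` and `http_get_json` are hypothetical, and the JSON field names (`status.id`, `operation.checkpointId`, `external_path`) are my reading of the REST API docs in [1] and [2], so please treat them as assumptions.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Optional, Tuple

# Hypothetical fetcher type: takes a URL and returns the decoded JSON body,
# or None when the resource is not found.
FetchJson = Callable[[str], Optional[dict]]


def http_get_json(url: str) -> Optional[dict]:
    """Default fetcher based on urllib; returns None on HTTP 404."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def resolve_checkpoint(
    base_url: str,
    job_id: str,
    trigger_id: str,
    fetch: FetchJson = http_get_json,
) -> Tuple[str, Optional[str]]:
    """Return (state, path) for a previously triggered checkpoint.

    Step 1 asks for the trigger status, which carries no path.
    Step 2 asks for the checkpoint details, which may already be gone
    from the bounded checkpoint history (web.checkpoints.history).
    """
    # Step 1: trigger status (assumed shape: {"status": {"id": ...}, "operation": {...}})
    status = fetch(f"{base_url}/jobs/{job_id}/checkpoints/{trigger_id}")
    if status is None or status.get("status", {}).get("id") != "COMPLETED":
        return ("IN_PROGRESS", None)

    checkpoint_id = (status.get("operation") or {}).get("checkpointId")
    if checkpoint_id is None:
        # Simplification for this sketch: no checkpoint ID is treated as a failure.
        return ("FAILED", None)

    # Step 2: checkpoint details (assumed to expose "external_path")
    details = fetch(f"{base_url}/jobs/{job_id}/checkpoints/details/{checkpoint_id}")
    path = details.get("external_path") if details else None

    # The checkpoint is still reported as COMPLETED even when the path is missing.
    return ("COMPLETED", path)
```

Passing the fetcher as a parameter is only there to make the sketch easy to try without a running cluster, for example by passing the `get` method of a dict that maps URLs to canned responses.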
[1] https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/ops/rest_api/#jobs-jobid-checkpoints-triggerid
[2] https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/ops/rest_api/#jobs-jobid-checkpoints-details-checkpointid
[3] https://github.com/apache/flink-kubernetes-operator/blob/091e803a6ae713ebe839742694ab6ca53249c4dd/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/observer/snapshot/StateSnapshotObserver.java#L151

Michas Szacillo (BLOOMBERG/ 919 3RD A) <mszaci...@bloomberg.net> wrote (on Wed, Dec 18, 2024, 0:11):

> Hi all! I recently came across the FlinkStateSnapshot feature which I
> found quite interesting, but I had a couple questions on its use.
>
> 1. In favor of FlinkStateSnapshots, I see that both the checkpointInfo and
> savepointInfo have been deprecated as part of the JobStatus of
> FlinkDeployment. Does this mean these fields will eventually be completely
> removed? Additionally, would the community consider adding a path field to
> the existing checkpointInfo? Currently it only shows the triggerId, which
> isn't as helpful when trying to find the actual checkpoint path.
>
> 2. For FlinkStateSnapshots, what is the major benefit of separating out the
> checkpoint and savepoint info outside of the FlinkDeployment status? I can
> see the benefit of having some separation, but I feel like I may be missing
> additional context.
>
> 3. Do FlinkStateSnapshots trigger checkpoints by default if periodic
> checkpointing is enabled? I was using Flink v1_17 and recently tried
> setting the kubernetes.operator.periodic.checkpoint.interval and although
> Flink Operator does trigger periodic checkpoints, I did not see the
> FlinkStateSnapshot status reconcile. I had assumed this status would be
> populated automatically if the snapshot was configured like in the
> documentation:
> https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/snapshots/#checkpoint.
>
> Appreciate the help!
>
> - Michal