Hi experts, I am running Flink 1.10.2 on Kubernetes as a per-job cluster. Checkpointing is enabled with an interval of 3s, a minimum pause of 1s, and a timeout of 10s. I'm using FsStateBackend, and snapshots are persisted to Azure Blob Storage (Microsoft's cloud storage service).
The only checkpointed state is the source Kafka topic offsets; the Flink job itself is stateless, as it only does filtering and JSON transformation.

I am trying to stop the Flink job via the monitoring REST API described in the docs <https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jobs-jobid-1>, e.g.

    curl -X PATCH \
      'http://localhost:8081/jobs/3c00535c182a3a00258e2f57bc11fb1a?mode=cancel' \
      -H 'Content-Type: application/json' \
      -d '{}'

This call returned successfully with status code 202, after which I stopped the task manager pods and the job manager pod.

According to the docs, the checkpoints should be cleaned up after the job is stopped/cancelled. What I observe is that the checkpoint directory is not cleaned up. Can you please shed some light on what I did wrong?

The screenshot below shows the checkpoint directory for a cancelled Flink job.

[image: image.png]

Thanks!
Eleanore
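
For reference, here is a minimal sketch of the stop sequence described above as a script. A 202 response only means the cancel request was accepted, so the sketch also polls the job's state through GET /jobs/:jobid until it reports CANCELED before any pods are stopped. The variable FLINK_REST and the function names are hypothetical, the state parsing assumes the compact "state":"VALUE" form in the response body, and whether waiting changes the cleanup behaviour is not something this post confirms.

```shell
#!/bin/sh
# Sketch only: FLINK_REST, cancel_url, cancel_job, parse_state, job_state and
# wait_for_cancel are hypothetical names introduced for illustration.
FLINK_REST="${FLINK_REST:-http://localhost:8081}"

# Build the URL for PATCH /jobs/:jobid?mode=cancel.
cancel_url() {
  printf '%s/jobs/%s?mode=cancel' "$FLINK_REST" "$1"
}

# Send the cancel request and print the HTTP status code (202 means accepted).
cancel_job() {
  curl -s -o /dev/null -w '%{http_code}' -X PATCH "$(cancel_url "$1")" \
    -H 'Content-Type: application/json' -d '{}'
}

# Extract the first "state" field from a job-details JSON document on stdin.
# Assumes the compact form "state":"VALUE" (an assumption about the body layout).
parse_state() {
  grep -o '"state":"[A-Z_]*"' | head -n 1 | cut -d'"' -f4
}

# Print the job state reported by GET /jobs/:jobid.
job_state() {
  curl -s "$FLINK_REST/jobs/$1" | parse_state
}

# Poll once per second until the job reports CANCELED.
# Gives up after $2 attempts (default 30) and returns non-zero.
wait_for_cancel() {
  attempts="${2:-30}"
  while [ "$attempts" -gt 0 ]; do
    [ "$(job_state "$1")" = "CANCELED" ] && return 0
    attempts=$((attempts - 1))
    sleep 1
  done
  return 1
}
```

With these functions sourced, running `cancel_job 3c00535c182a3a00258e2f57bc11fb1a` followed by `wait_for_cancel 3c00535c182a3a00258e2f57bc11fb1a` would be one way to check that the job has reached its terminal state before the job manager and task manager pods are stopped.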