Hi Dongwon,

Happy new year! A _metadata file is written to HDFS for every checkpoint even if externalized checkpoints are not enabled. If externalized checkpoints are not enabled, Flink deletes all the checkpoints when the job exits; if they are enabled, the checkpoints are kept when the job is cancelled or fails, according to the retention setting. So for your second question, I think the answer is yes.
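For reference, here is a minimal sketch of the flink-conf.yaml entry that should keep checkpoints around after a cancellation. It assumes Flink 1.12, where this option accepts RETAIN_ON_CANCELLATION or DELETE_ON_CANCELLATION; the checkpoint directory is copied from your configuration, and the choice of value is only an illustration.

```yaml
# Sketch only: assumes Flink 1.12 option names and values.
state.checkpoints.dir: hdfs:///flink-jobs/ckpts

# RETAIN_ON_CANCELLATION: keep checkpoints when the job is cancelled or fails.
# DELETE_ON_CANCELLATION: delete them on cancellation, keep them only on failure.
execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION
```

With the option left unset, as in your current setup, the _metadata files are still written while the job runs, but the checkpoints are cleaned up when the job exits.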
Best,
Yun

------------------ Original Mail ------------------
Sender: Dongwon Kim <eastcirc...@gmail.com>
Send Date: Mon Jan 4 19:16:39 2021
Recipients: user <user@flink.apache.org>
Subject: Is chk-$id/_metadata created regardless of enabling externalized checkpoints?

Hi,

First of all, happy new year!

This may be a very basic question, but I have something to clarify in my head.

My flink-conf.yaml is as follows (note that I didn't specify the value of "execution-checkpointing-externalized-checkpoint-retention" [1]):

    #...
    execution.checkpointing.interval: 20min
    execution.checkpointing.min-pause: 1min
    state.backend: rocksdb
    state.backend.incremental: true
    state.checkpoints.dir: hdfs:///flink-jobs/ckpts
    state.checkpoints.num-retained: 10
    state.savepoints.dir: hdfs:///flink-jobs/svpts
    #...

In the Web UI, the checkpoint configuration shows "Persist Checkpoints Externally" as "Disabled" in the final row.

According to [2], externalized checkpoints:

    You can configure periodic checkpoints to be persisted externally. Externalized checkpoints write their meta data out to persistent storage and are not automatically cleaned up when the job fails. This way, you will have a checkpoint around to resume from if your job fails. There are more details in the deployment notes on externalized checkpoints.

So I thought the metadata of a checkpoint lives only in the JobManager's memory and is not stored on HDFS unless "execution-checkpointing-externalized-checkpoint-retention" is set.

However, even without setting the value, every checkpoint already contains its own metadata:

    [user@devflink conf]$ hdfs dfs -ls /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/*
    Found 1 items
    -rw-r--r-- 3 user hdfs 163281 2021-01-04 14:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-945/_metadata
    Found 1 items
    -rw-r--r-- 3 user hdfs 163281 2021-01-04 14:45 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-946/_metadata
    Found 1 items
    -rw-r--r-- 3 user hdfs 163157 2021-01-04 15:05 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-947/_metadata
    Found 1 items
    -rw-r--r-- 3 user hdfs 156684 2021-01-04 15:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-948/_metadata
    Found 1 items
    -rw-r--r-- 3 user hdfs 147280 2021-01-04 15:45 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-949/_metadata
    Found 1 items
    -rw-r--r-- 3 user hdfs 147280 2021-01-04 16:05 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-950/_metadata
    Found 1 items
    -rw-r--r-- 3 user hdfs 162937 2021-01-04 16:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-951/_metadata
    Found 1 items
    -rw-r--r-- 3 user hdfs 175089 2021-01-04 16:45 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-952/_metadata
    Found 1 items
    -rw-r--r-- 3 user hdfs 173289 2021-01-04 17:05 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-953/_metadata
    Found 1 items
    -rw-r--r-- 3 user hdfs 153951 2021-01-04 17:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-954/_metadata
    Found 21 items
    -rw-r--r-- 3 user hdfs 78748 2021-01-04 14:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/05d76f4e-3d9c-420c-8b87-077fc9880d9a
    -rw-r--r-- 3 user hdfs 23905 2021-01-04 15:05 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/0b9d9323-9f10-4fc2-8fcc-a9326448b07c
    -rw-r--r-- 3 user hdfs 81082 2021-01-04 16:05 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/0f6779d0-3a2e-4a94-be9b-d9d6710a7ea0
    -rw-r--r-- 3 user hdfs 23905 2021-01-04 16:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/107b3b74-634a-462c-bf40-1d4886117aa9
    -rw-r--r-- 3 user hdfs 78748 2021-01-04 14:45 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/18a538c6-d40e-48c0-a965-d65be407a124
    -rw-r--r-- 3 user hdfs 83550 2021-01-04 16:45 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/24ed9c4a-0b8e-45d4-95b8-64547cb9c541
    -rw-r--r-- 3 user hdfs 23905 2021-01-04 17:05 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/35ee9665-7c1f-4407-beb5-fbb312d84907
    -rw-r--r-- 3 user hdfs 47997 2021-01-04 11:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/36363172-c401-4d60-a970-cfb2b3cbf058
    -rw-r--r-- 3 user hdfs 81082 2021-01-04 15:45 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/43aecc8c-145f-43ba-81a8-b0ce2c3498f4
    -rw-r--r-- 3 user hdfs 79898 2021-01-04 15:05 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/5743f278-fc50-4c4a-b14e-89bfdb2139fa
    -rw-r--r-- 3 user hdfs 23905 2021-01-04 16:45 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/67e16688-c48c-409b-acac-e7091a84d548
    -rw-r--r-- 3 user hdfs 23905 2021-01-04 16:05 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/773ef43d-936a-4f33-9b0a-d3ff090637c7
    -rw-r--r-- 3 user hdfs 82046 2021-01-04 16:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/81ac58ef-8810-4fa6-ad8f-a5ec0c0cc885
    -rw-r--r-- 3 user hdfs 86089 2021-01-04 17:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/8e202c6a-f702-487b-bd00-43739a8c79a2
    -rw-r--r-- 3 user hdfs 84875 2021-01-04 17:05 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/a6d4db40-2efe-495c-8e94-a9c31876e4d3
    -rw-r--r-- 3 user hdfs 23905 2021-01-04 17:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/b54c5d30-b152-4fba-b0ac-dba598c93646
    -rw-r--r-- 3 user hdfs 23905 2021-01-04 15:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/c36433cf-9e79-46ee-a93f-fe042e3c583f
    -rw-r--r-- 3 user hdfs 23905 2021-01-04 14:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/e8a27366-4764-4ef0-ae6b-85ed936f6935
    -rw-r--r-- 3 user hdfs 80747 2021-01-04 15:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/eb6476de-1e35-4d0c-bc6b-2f3214abfffd
    -rw-r--r-- 3 user hdfs 23905 2021-01-04 15:45 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/efd13c04-cbac-4c68-a132-1f9dc9afc7b4
    -rw-r--r-- 3 user hdfs 23905 2021-01-04 14:45 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/f63ba16a-6664-49b6-878f-efba342270be

Resuming from a checkpoint directory (e.g. /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-954) also works as expected.

So I'm wondering:
- Is every checkpoint already meant to have its metadata on HDFS even without setting the value of "execution-checkpointing-externalized-checkpoint-retention"?
- Is setting "execution-checkpointing-externalized-checkpoint-retention" only needed when I want to retain checkpoints in case a job fails or is intentionally cancelled?

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/config.html#execution-checkpointing-externalized-checkpoint-retention
[2] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/checkpointing.html

Best,
Dongwon