Hi Dongwon,

   Happy new year! A _metadata file is written to HDFS for every checkpoint 
even if externalized checkpoints are not enabled. If externalized checkpoints 
are not enabled, Flink deletes all the checkpoints when the job exits; if they 
are enabled, the checkpoints are kept when the job is cancelled or fails, 
according to the retention setting. So for your second question, I think the 
answer is yes.
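
As a rough sketch of what that setting could look like in flink-conf.yaml 
(the value below is only an illustration I picked, not something taken from 
your setup; as far as I know the documented values in 1.12 are 
RETAIN_ON_CANCELLATION and DELETE_ON_CANCELLATION):

```yaml
# Illustrative only: keep externalized checkpoints when the job is cancelled.
# With DELETE_ON_CANCELLATION they would be removed on cancel but still kept
# if the job fails.
execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION
```

Note that retained checkpoints are not cleaned up automatically, so you would 
have to remove old ones from state.checkpoints.dir yourself.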

Best,
 Yun


 ------------------Original Mail ------------------
Sender:Dongwon Kim <eastcirc...@gmail.com>
Send Date:Mon Jan 4 19:16:39 2021
Recipients:user <user@flink.apache.org>
Subject:Is chk-$id/_metadata created regardless of enabling externalized 
checkpoints?

Hi,

First of all, happy new year!
This may be a very basic question, but there is something I'd like to clarify.

My flink-conf.yaml is as follows (note that I didn't specify a value for 
"execution.checkpointing.externalized-checkpoint-retention" [1]):
#...
execution.checkpointing.interval: 20min
execution.checkpointing.min-pause: 1min

state.backend: rocksdb
state.backend.incremental: true

state.checkpoints.dir: hdfs:///flink-jobs/ckpts
state.checkpoints.num-retained: 10

state.savepoints.dir: hdfs:///flink-jobs/svpts
#...

The Web UI shows the checkpoint configuration as follows (note that 
"Persist Checkpoints Externally" is "Disabled" in the final row):


According to [2]:

"Externalized checkpoints: You can configure periodic checkpoints to be 
persisted externally. Externalized checkpoints write their meta data out to 
persistent storage and are not automatically cleaned up when the job fails. 
This way, you will have a checkpoint around to resume from if your job fails. 
There are more details in the deployment notes on externalized checkpoints."

So I thought the metadata of a checkpoint was kept only in the JobManager's 
memory and not stored on HDFS unless 
"execution.checkpointing.externalized-checkpoint-retention" is set.

However, even without setting the value, every checkpoint already contains its 
own metadata:
[user@devflink conf]$ hdfs dfs -ls /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/*
Found 1 items
-rw-r--r--  3 user hdfs  163281 2021-01-04 14:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-945/_metadata
Found 1 items
-rw-r--r--  3 user hdfs  163281 2021-01-04 14:45 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-946/_metadata
Found 1 items
-rw-r--r--  3 user hdfs  163157 2021-01-04 15:05 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-947/_metadata
Found 1 items
-rw-r--r--  3 user hdfs  156684 2021-01-04 15:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-948/_metadata
Found 1 items
-rw-r--r--  3 user hdfs  147280 2021-01-04 15:45 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-949/_metadata
Found 1 items
-rw-r--r--  3 user hdfs  147280 2021-01-04 16:05 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-950/_metadata
Found 1 items
-rw-r--r--  3 user hdfs  162937 2021-01-04 16:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-951/_metadata
Found 1 items
-rw-r--r--  3 user hdfs  175089 2021-01-04 16:45 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-952/_metadata
Found 1 items
-rw-r--r--  3 user hdfs  173289 2021-01-04 17:05 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-953/_metadata
Found 1 items
-rw-r--r--  3 user hdfs  153951 2021-01-04 17:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-954/_metadata
Found 21 items
-rw-r--r--  3 user hdfs 78748 2021-01-04 14:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/05d76f4e-3d9c-420c-8b87-077fc9880d9a
-rw-r--r--  3 user hdfs 23905 2021-01-04 15:05 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/0b9d9323-9f10-4fc2-8fcc-a9326448b07c
-rw-r--r--  3 user hdfs 81082 2021-01-04 16:05 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/0f6779d0-3a2e-4a94-be9b-d9d6710a7ea0
-rw-r--r--  3 user hdfs 23905 2021-01-04 16:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/107b3b74-634a-462c-bf40-1d4886117aa9
-rw-r--r--  3 user hdfs 78748 2021-01-04 14:45 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/18a538c6-d40e-48c0-a965-d65be407a124
-rw-r--r--  3 user hdfs 83550 2021-01-04 16:45 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/24ed9c4a-0b8e-45d4-95b8-64547cb9c541
-rw-r--r--  3 user hdfs 23905 2021-01-04 17:05 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/35ee9665-7c1f-4407-beb5-fbb312d84907
-rw-r--r--  3 user hdfs 47997 2021-01-04 11:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/36363172-c401-4d60-a970-cfb2b3cbf058
-rw-r--r--  3 user hdfs 81082 2021-01-04 15:45 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/43aecc8c-145f-43ba-81a8-b0ce2c3498f4
-rw-r--r--  3 user hdfs 79898 2021-01-04 15:05 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/5743f278-fc50-4c4a-b14e-89bfdb2139fa
-rw-r--r--  3 user hdfs 23905 2021-01-04 16:45 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/67e16688-c48c-409b-acac-e7091a84d548
-rw-r--r--  3 user hdfs 23905 2021-01-04 16:05 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/773ef43d-936a-4f33-9b0a-d3ff090637c7
-rw-r--r--  3 user hdfs 82046 2021-01-04 16:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/81ac58ef-8810-4fa6-ad8f-a5ec0c0cc885
-rw-r--r--  3 user hdfs 86089 2021-01-04 17:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/8e202c6a-f702-487b-bd00-43739a8c79a2
-rw-r--r--  3 user hdfs 84875 2021-01-04 17:05 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/a6d4db40-2efe-495c-8e94-a9c31876e4d3
-rw-r--r--  3 user hdfs 23905 2021-01-04 17:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/b54c5d30-b152-4fba-b0ac-dba598c93646
-rw-r--r--  3 user hdfs 23905 2021-01-04 15:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/c36433cf-9e79-46ee-a93f-fe042e3c583f
-rw-r--r--  3 user hdfs 23905 2021-01-04 14:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/e8a27366-4764-4ef0-ae6b-85ed936f6935
-rw-r--r--  3 user hdfs 80747 2021-01-04 15:25 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/eb6476de-1e35-4d0c-bc6b-2f3214abfffd
-rw-r--r--  3 user hdfs 23905 2021-01-04 15:45 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/efd13c04-cbac-4c68-a132-1f9dc9afc7b4
-rw-r--r--  3 user hdfs 23905 2021-01-04 14:45 /flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/shared/f63ba16a-6664-49b6-878f-efba342270be

Resuming from a checkpoint directory (e.g. 
/flink-jobs/ckpts/76fc265c44ef44ae343ab15868155de6/chk-954) also works 
as expected.

So I'm wondering:
- Is every checkpoint meant to have its metadata on HDFS even without 
setting a value for 
"execution.checkpointing.externalized-checkpoint-retention"?
- Is setting "execution.checkpointing.externalized-checkpoint-retention" only 
needed when I want to retain checkpoints in case a job fails or is 
intentionally cancelled?

[1] 
https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/config.html#execution-checkpointing-externalized-checkpoint-retention
[2] 
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/checkpointing.html

Best,

Dongwon
