Hello.

We are running Flink 1.20.1 on Kubernetes (AKS). We consistently observe the 
following error: both checkpoints and savepoints write only the “_metadata” 
file and nothing else. Sometimes this is fine, because all the data is in that 
one file. But sometimes “_metadata” holds references to other files, which are 
not present.

I understand that if the state is smaller than a configured threshold, it is 
stored only in that one file, and if it is larger, it spills over to 
additional files. Our state is generally minuscule, so it should always fit 
into _metadata, but sometimes when I inspect the _metadata file I see 
references to those additional files. Restoring from such a savepoint or 
checkpoint always fails.
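
For anyone who wants to reproduce the inspection described above, here is a 
rough sketch of one way to do it. It assumes that referenced state files appear 
inside the binary _metadata file as plain-text URIs. That is a heuristic, not a 
parser for Flink's metadata format, and the function name and the list of URI 
schemes are made up for illustration.

```python
import re

# Assumption: file references are embedded in the binary _metadata file as
# plain-text URIs. This only scans raw bytes for URI-looking strings; it does
# not understand Flink's checkpoint metadata serialization.
URI_PATTERN = re.compile(rb"(?:wasbs?|abfss?|hdfs|s3[an]?)://[\x21-\x7e]+")


def find_referenced_paths(metadata_bytes: bytes) -> list[str]:
    """Return the distinct URI-like strings found in the bytes, in order."""
    seen = []
    for match in URI_PATTERN.finditer(metadata_bytes):
        path = match.group(0).decode("ascii")
        if path not in seen:
            seen.append(path)
    return seen


# Hypothetical usage, after downloading a _metadata file locally:
#   with open("_metadata", "rb") as fh:
#       for path in find_referenced_paths(fh.read()):
#           print(path)
```

An empty result would suggest that all state is inlined in _metadata. Any 
printed path can then be compared against what actually exists in the storage 
container. Since this is a byte scan, a match may occasionally pick up a few 
trailing bytes that happen to be printable.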

Does anyone know of a reason for this behavior?

This is our configuration (relevant parts only; I have replaced our storage 
account name with a variable):

high-availability.type: kubernetes
high-availability.cluster-id: flink-cluster-session-cluster
high-availability.storageDir: wasbs://flink-storage@${account}.blob.core.windows.net/data
high-availability.jobmanager.port: 6123

state.backend.type: rocksdb
execution.checkpointing.num-retained: 3
execution.checkpointing.savepoint-dir: wasbs://flink-storage@${account}.blob.core.windows.net/flink-savepoints
execution.checkpointing.mode: EXACTLY_ONCE
execution.checkpointing.incremental: true
execution.checkpointing.interval: 60000
execution.checkpointing.timeout: 300000
$internal.flink.version: v1_20
execution.checkpointing.storage: filesystem
execution.checkpointing.dir: wasbs://flink-storage@${account}.blob.core.windows.net/flink-checkpoints
execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION
execution.checkpointing.min-pause: 5000
execution.target: kubernetes-session

fs.azure.account.keyprovider.${account}.blob.core.windows.net: org.apache.flink.fs.azurefs.EnvironmentVariableKeyProvider

env.java.opts.all: --add-exports=java.base/sun.net.util=ALL-UNNAMED  
--add-exports=java.rmi/sun.rmi.registry=ALL-UNNAMED  
--add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED  
--add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED  
--add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED  
--add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED  
--add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED  
--add-exports=java.security.jgss/sun.security.krb5=ALL-UNNAMED  
--add-opens=java.base/java.lang=ALL-UNNAMED  
--add-opens=java.base/java.net=ALL-UNNAMED  
--add-opens=java.base/java.io=ALL-UNNAMED  
--add-opens=java.base/java.nio=ALL-UNNAMED  
--add-opens=java.base/sun.nio.ch=ALL-UNNAMED  
--add-opens=java.base/java.lang.reflect=ALL-UNNAMED  
--add-opens=java.base/java.text=ALL-UNNAMED  
--add-opens=java.base/java.time=ALL-UNNAMED  
--add-opens=java.base/java.util=ALL-UNNAMED  
--add-opens=java.base/java.util.concurrent=ALL-UNNAMED  
--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED  
--add-opens=java.base/java.util.concurrent.locks=ALL-UNNAMED

Nikola.
