Hello. We are running Flink 1.20.1 on Kubernetes (AKS). We have observed a consistent error: both checkpoints and savepoints write only the “_metadata” file and nothing else. Sometimes this is fine, because all the data is in that one file. But sometimes “_metadata” holds references to other files, and those files are not present.
I understand that if the state is smaller than a configured threshold, it is stored only in that one file, and if it is larger, it spills over into additional files. Our state is generally minuscule, so it should always fit into “_metadata”, but sometimes when I inspect the “_metadata” file I see references to those additional files. Restoring from such a savepoint or checkpoint always fails. Does anyone know of a reason for this behavior?

This is our configuration (relevant parts; I have substituted our account with a variable):

high-availability.type: kubernetes
high-availability.cluster-id: flink-cluster-session-cluster
high-availability.storageDir: wasbs://flink-storage@${account}.blob.core.windows.net/data
high-availability.jobmanager.port: 6123
state.backend.type: rocksdb
execution.checkpointing.num-retained: 3
execution.checkpointing.savepoint-dir: wasbs://flink-storage@${account}.blob.core.windows.net/flink-savepoints
execution.checkpointing.mode: EXACTLY_ONCE
execution.checkpointing.incremental: true
execution.checkpointing.interval: 60000
execution.checkpointing.timeout: 300000
$internal.flink.version: v1_20
execution.checkpointing.storage: filesystem
execution.checkpointing.dir: wasbs://flink-storage@${account}.blob.core.windows.net/flink-checkpoints
execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION
execution.checkpointing.min-pause: 5000
execution.target: kubernetes-session
fs.azure.account.keyprovider.${account}.blob.core.windows.net: org.apache.flink.fs.azurefs.EnvironmentVariableKeyProvider
env.java.opts.all: --add-exports=java.base/sun.net.util=ALL-UNNAMED --add-exports=java.rmi/sun.rmi.registry=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED --add-exports=java.security.jgss/sun.security.krb5=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.text=ALL-UNNAMED --add-opens=java.base/java.time=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.locks=ALL-UNNAMED

Nikola.