If I understand well RocksDB is using two disk, the Task Manager local disk for "local storage" of the state and the distributed disk for checkpointing.
Two questions: - if I have 3 TaskManager I should expect more or less (depending on how the tasks are balanced) to find a third of my overall state stored on disk on each of this TaskManager node? - if the local node/disk fails I will get the state back from the distributed disk and things will start again and all is fine. However what happens if the distributed disk fails? Will Flink continue processing waiting for me to mount a new distributed disk? Or will it stop? May I lose data/reprocess things under that condition? -- Christophe Jolif