Hi all,

We're interested in doing some analysis on how the size of our savepoints
and state affects the time it takes to restore from a savepoint. We're
running Flink 1.12 and using RocksDB as a state backend, on Kubernetes.

What is the best way to measure the size of a Flink Application's state? Is
state.backend.rocksdb.metrics.total-sst-files-size
<https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/config.html#state-backend-rocksdb-metrics-total-sst-files-size>
the right thing to look at?

We looked at state.backend.rocksdb.metrics.total-sst-files-size for all our
operators after restoring from a savepoint, and noticed that the sum of the
SST file sizes is far smaller than the total size of our savepoint (7GB vs
10TB). Where does that discrepancy come from?

Do you have any general advice on correlating savepoint size with restore
times?

Thanks in advance!
