Hi Kien,

From my point of view, RocksDB native metrics can be classified into the 5 groups below, and you can enable whichever ones you are interested in. Enabling those metrics could cause about a 10% performance regression, though the effect on overall job performance depends on the job, as not all jobs are bottlenecked on state access.

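As a minimal sketch of enabling only a selection, the fragment below turns on the write-stall and memory-usage metrics named in this message. It assumes the options are set in flink-conf.yaml and that each one is a boolean flag that is off unless set; which subset is worth enabling is only an illustrative choice, not a recommendation from the thread.

```yaml
# flink-conf.yaml (assumed location of the cluster configuration)
# Illustrative subset only; pick the groups relevant to your job.

# Performance related: is RocksDB throttling or stopping writes?
state.backend.rocksdb.metrics.actual-delayed-write-rate: true
state.backend.rocksdb.metrics.is-write-stopped: true

# Memory usage status
state.backend.rocksdb.metrics.block-cache-usage: true
state.backend.rocksdb.metrics.cur-size-all-mem-tables: true

# Not a RocksDB metric: exposes the column family as a metric variable
state.backend.rocksdb.metrics.column-family-as-variable: true
```

Any option left out of the file stays disabled, so the overhead is limited to the metrics listed.
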
Performance related:
  state.backend.rocksdb.metrics.actual-delayed-write-rate
  state.backend.rocksdb.metrics.is-write-stopped

Compaction & flush related, which will impact memory usage and write stalls:
  state.backend.rocksdb.metrics.mem-table-flush-pending
  state.backend.rocksdb.metrics.num-running-flushes
  state.backend.rocksdb.metrics.compaction-pending
  state.backend.rocksdb.metrics.num-running-compactions

Memory usage status:
  state.backend.rocksdb.metrics.block-cache-usage (if Flink's managed memory over RocksDB is enabled, this value would be the same for all column families in the same slot)
  state.backend.rocksdb.metrics.cur-size-all-mem-tables

DB static properties:
  state.backend.rocksdb.metrics.block-cache-capacity

DB number of keys and data usage:
  state.backend.rocksdb.metrics.estimate-live-data-size
  state.backend.rocksdb.metrics.total-sst-files-size

BTW, state.backend.rocksdb.metrics.column-family-as-variable is not a RocksDB internal metric; it exposes the column family as a variable so that we can distinguish the status of different states.

Best
Yun Tang

________________________________
From: Steven Wu <stevenz...@gmail.com>
Sent: Wednesday, December 9, 2020 12:11
To: Khachatryan Roman <khachatryan.ro...@gmail.com>
Cc: Truong Duc Kien <duckientru...@gmail.com>; Yun Tang <myas...@live.com>; user <user@flink.apache.org>
Subject: Re: Recommendation about RocksDB Metrics ?

Just a data point. We actually enabled all RocksDB metrics by default (including very large jobs in terms of parallelism and state size). We didn't see any significant performance impact. There is probably a small impact. At least, it didn't jump out for our workload.

On Tue, Dec 8, 2020 at 9:00 AM Khachatryan Roman <khachatryan.ro...@gmail.com> wrote:

Hi Kien,

I am pulling in Yun who might know better.

Regards,
Roman

On Sun, Dec 6, 2020 at 3:52 AM Truong Duc Kien <duckientru...@gmail.com> wrote:

Hi all,

We are thinking about enabling RocksDB metrics to better monitor our pipeline. However, since they will have a performance impact, we will have to be selective about which metrics we use.

Does anyone have experience with which metrics are more important than the others? And which metrics have the largest performance impact?

Thanks,
Kien