[ https://issues.apache.org/jira/browse/FLINK-20496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
YufeiLiu updated FLINK-20496:
-----------------------------
    Comment: was deleted

(was: What do you think about this, and I'd like to take this ticket if it's necessary. [~liyu])

> RocksDB partitioned index filter option
> ---------------------------------------
>
>                 Key: FLINK-20496
>                 URL: https://issues.apache.org/jira/browse/FLINK-20496
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / State Backends
>            Reporter: YufeiLiu
>            Priority: Major
>
> When using RocksDBStateBackend with
> {{state.backend.rocksdb.memory.managed}} and
> {{state.backend.rocksdb.memory.fixed-per-slot}} enabled, Flink strictly limits
> RocksDB memory usage, which covers the "write buffer" and the "block cache".
> With these options RocksDB stores index and filter blocks in the block cache,
> because with the default options the index and filters can grow without bound.
> But this leads to another issue: if the high-priority cache (configured by
> {{state.backend.rocksdb.memory.high-prio-pool-ratio}}) can't fit all
> index/filter blocks, RocksDB loads the metadata from disk on every cache miss,
> and the program becomes extremely slow. According to [Partitioned Index
> Filters|https://github.com/facebook/rocksdb/wiki/Partitioned-Index-Filters][1],
> we can enable a two-level index to keep performance acceptable when
> index/filter blocks miss the cache.
> Enabling these options made my case over 10x faster [2]. I think we can
> add an option {{state.backend.rocksdb.partitioned-index-filters}} with a
> default value of false, so that this feature is easy to use.
>
> [1] Partitioned Index Filters:
> https://github.com/facebook/rocksdb/wiki/Partitioned-Index-Filters
> [2] Deduplication scenario, state.backend.rocksdb.memory.fixed-per-slot=256M,
> SSD, elapsed time 4.91ms -> 0.33ms.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
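
For illustration, the setup described in the ticket could be written as a flink-conf.yaml fragment roughly like the one below. This is only a sketch under assumptions: the 256mb value comes from footnote [2], the high-priority pool ratio is an illustrative value rather than a recommendation, and the last key is the option name proposed in this ticket, which may not exist under that name in any released Flink version.

```yaml
# Bound RocksDB memory (write buffer + block cache), as described in the ticket.
state.backend.rocksdb.memory.managed: true
# Value taken from the reporter's deduplication test in footnote [2].
state.backend.rocksdb.memory.fixed-per-slot: 256mb
# Share of the block cache reserved for index/filter blocks.
# Illustrative value only; tune it for the actual workload.
state.backend.rocksdb.memory.high-prio-pool-ratio: 0.1
# PROPOSED in FLINK-20496, not confirmed as a released option name:
# enables the two-level (partitioned) index and filters. Proposed default: false.
state.backend.rocksdb.partitioned-index-filters: true
```

With the proposed option left at false, behavior would stay as it is today, so only jobs whose index/filter blocks overflow the high-priority pool would need to opt in.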