Hi Jun,

Some predefined options also activate bloom filters, e.g. PredefinedOptions#SPINNING_DISK_OPTIMIZED_HIGH_MEM, but I think offering a configurable option is a good idea. +1 for this.

Regarding the bloom filter default value, I slightly prefer the full filter format [1] over the old block-based format. This is related to FLINK-20496 [2], which tries to add an option to enable partitioned index & filter.

[1] https://github.com/facebook/rocksdb/wiki/RocksDB-Bloom-Filter#full-filters-new-format
[2] https://issues.apache.org/jira/browse/FLINK-20496

Best
Yun Tang

________________________________
From: Till Rohrmann <trohrm...@apache.org>
Sent: Monday, February 8, 2021 17:06
To: dev <dev@flink.apache.org>
Subject: Re: Activate bloom filter in RocksDB State Backend via Flink configuration

Hi Jun,

Making things easier to use and configure is a good idea. Hence, +1 for this proposal. Maybe create a JIRA ticket for it.

For the concrete default values it would be nice to hear the opinion of a RocksDB expert.

Cheers,
Till

On Sun, Feb 7, 2021 at 7:23 PM Jun Qin <qinjunje...@gmail.com> wrote:

> Hi,
>
> Activating bloom filter in the RocksDB state backend improves read
> performance. Currently activating bloom filter can only be done by
> implementing a custom ConfigurableRocksDBOptionsFactory. I think we should
> provide an option to activate bloom filter via Flink configuration. What
> do you think? If so, what about the following configuration?
>
> state.backend.rocksdb.bloom-filter.enabled: false (default)
> state.backend.rocksdb.bloom-filter.bits-per-key: 10 (default)
> state.backend.rocksdb.bloom-filter.block-based: true (default)
>
>
> Thanks
> Jun
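
As background for the proposed bits-per-key default of 10, here is a rough sketch of what that budget buys under the textbook bloom filter model. The class and method names (BloomFilterEstimate, optimalProbes, falsePositiveRate) are made up for illustration and are not part of Flink or RocksDB. The probe count is the idealized round(b * ln 2), which is an assumption; RocksDB's actual filter implementations pick their probe counts differently, so treat the numbers as estimates only.

```java
// Hypothetical helper, not part of Flink or RocksDB: estimates the
// false-positive rate of an idealized bloom filter from its bits-per-key budget.
final class BloomFilterEstimate {

    private BloomFilterEstimate() {}

    /** Textbook probe count for a bits-per-key budget b: round(b * ln 2), at least 1. */
    static int optimalProbes(double bitsPerKey) {
        return (int) Math.max(1, Math.round(bitsPerKey * Math.log(2)));
    }

    /** Approximate false-positive rate (1 - e^(-k/b))^k with k probes and b bits per key. */
    static double falsePositiveRate(double bitsPerKey) {
        int k = optimalProbes(bitsPerKey);
        return Math.pow(1.0 - Math.exp(-k / bitsPerKey), k);
    }

    public static void main(String[] args) {
        // Compare a few candidate values for the bits-per-key setting.
        for (int bits : new int[] {6, 10, 14}) {
            System.out.printf(
                    "bits-per-key=%d -> estimated false-positive rate %.4f%n",
                    bits, falsePositiveRate(bits));
        }
    }
}
```

Under this model, 10 bits per key gives a false-positive rate a little under 1%, and each additional bit per key lowers it further at the cost of a larger filter.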