Re: Bloom Filter for Rocksdb

2023-10-29 Thread xiangyu feng
Hi Kean, I would like to share our analysis of the pros and cons of enabling the Bloom filter in production. Pros: With the Bloom filter enabled, RocksDB.get() can skip data files that definitely do not contain the key, which reduces random disk reads. This performance improvement is d
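
To illustrate the point about skipping files, here is a minimal toy sketch of a Bloom filter in Python. It is not RocksDB's implementation: the class, the SHA-256-based hashing, and the helper `files_to_read` are all hypothetical names chosen for illustration. What it shows is the property the message relies on: a "no" answer is definite, while a "yes" answer may be a false positive.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: bytes):
        # Derive num_hashes bit positions by salting one hash with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        # False means the key was definitely never added.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def files_to_read(key: bytes, files):
    """Return names of (hypothetical) data files whose filter cannot rule out the key."""
    return [name for name, bloom in files if bloom.might_contain(key)]
```

A lookup then only needs to touch the files returned by `files_to_read`; every other file is skipped without a disk read.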

Re: Bloom Filter for Rocksdb

2023-10-29 Thread David Anderson
I believe bloom filters are off by default because they add overhead and aren't always helpful. I.e., in workloads that are write-heavy and have few reads, bloom filters aren't worth the overhead. David On Fri, Oct 20, 2023 at 11:31 AM Mate Czagany wrote: > Hi, > > There have been no reports ab

Re: Bloom Filter for Rocksdb

2023-10-20 Thread Mate Czagany
Hi, There have been no reports about setting this configuration causing any issues. I would guess it's off by default because it can increase the memory usage by an unpredictable amount. I would say feel free to enable it; from what you've said, I also think that this would improve the performance
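
As a rough back-of-the-envelope sketch of the memory concern: filter size grows with the number of keys, which is why it is hard to predict up front. The function name, the figure of 10 bits per key, and the key count below are assumptions for illustration, and the estimate ignores any per-block or per-file overhead.

```python
def bloom_filter_bytes(num_keys: int, bits_per_key: float = 10.0) -> float:
    """Approximate filter size in bytes, assuming a fixed number of bits per key."""
    return num_keys * bits_per_key / 8


# Hypothetical state size of 100 million keys at an assumed 10 bits per key.
estimate = bloom_filter_bytes(100_000_000)
print(f"{estimate / 1_000_000:.0f} MB")  # prints "125 MB"
```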

Re: Bloom Filter for Rocksdb

2023-10-20 Thread Kartoglu, Emre
I don’t know much about the performance improvements that may come from using bloom filters, but I believe you can also improve RocksDB performance by increasing managed memory https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#taskmanager-memory-managed-fraction which