Hi Kean,
I would like to share our analysis of the pros and cons of
enabling the bloom filter in production.
Pros:
With the bloom filter enabled, RocksDB.get() can skip data files that
definitely do not contain the key, which reduces random disk reads. This
performance improvement is d
I believe bloom filters are off by default because they add overhead and
aren't always helpful. For example, in write-heavy workloads with few
reads, bloom filters aren't worth the overhead.
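
As a rough illustration of this trade-off, here is a minimal Java sketch of a toy bloom filter. It is not RocksDB's implementation: the class name ToyBloomFilter, the FNV-style hash, and the sizing rule are all assumptions made for the example. It shows the two sides discussed in this thread: a lookup can be skipped when the filter says the key is definitely absent, and the filter costs extra memory that grows with the number of keys (roughly keys times bits per key).

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Toy bloom filter for illustration only; not how RocksDB implements it. */
class ToyBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    ToyBloomFilter(int expectedKeys, int bitsPerKey) {
        // Memory cost: about expectedKeys * bitsPerKey bits.
        this.numBits = Math.max(64, expectedKeys * bitsPerKey);
        this.bits = new BitSet(numBits);
        // Near-optimal probe count for a given bits-per-key is ln(2) * bitsPerKey.
        this.numHashes = Math.max(1, (int) Math.round(bitsPerKey * 0.69));
    }

    /** Writes pay this cost on every inserted key. */
    void add(String key) {
        int h1 = hash(key, 0x9747b28c);
        int h2 = hash(key, 0x85ebca6b) | 1; // odd step for double hashing
        for (int i = 0; i < numHashes; i++) {
            bits.set(Math.floorMod(h1 + i * h2, numBits));
        }
    }

    /** false means "definitely absent"; true means "possibly present". */
    boolean mightContain(String key) {
        int h1 = hash(key, 0x9747b28c);
        int h2 = hash(key, 0x85ebca6b) | 1;
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(Math.floorMod(h1 + i * h2, numBits))) {
                return false; // a reader could skip this file without touching disk
            }
        }
        return true;
    }

    /** Approximate size of the bit array in bytes. */
    long memoryBytes() {
        return (numBits + 7) / 8;
    }

    // Simple seeded FNV-1a style hash; chosen for brevity, not quality.
    private static int hash(String key, int seed) {
        int h = seed;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            h = (h ^ (b & 0xff)) * 16777619;
        }
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        ToyBloomFilter filter = new ToyBloomFilter(1000, 10);
        filter.add("user-42");
        System.out.println("user-42 possibly present: " + filter.mightContain("user-42"));
        System.out.println("filter size in bytes: " + filter.memoryBytes());
    }
}
```

A key that was added is never reported as absent, so skipping on a negative answer is safe. In a write-heavy job, add() runs often while mightContain() rarely pays off, which is the overhead mentioned above.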
David
On Fri, Oct 20, 2023 at 11:31 AM Mate Czagany wrote:
> Hi,
>
> There have been no reports ab
Hi,
There have been no reports about setting this configuration causing any
issues. I would guess it's off by default because it can increase the
memory usage by an unpredictable amount.
I would say feel free to enable it. From what you've said, I also think that
this would improve the performance
I don’t know much about the performance improvements that may come from using
bloom filters, but I believe you can also improve RocksDB performance by
increasing managed memory
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#taskmanager-memory-managed-fraction
which