I don’t know much about the performance improvements that may come from using 
bloom filters, but I believe you can also improve RocksDB performance by 
increasing the managed memory that RocksDB uses: 
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#taskmanager-memory-managed-fraction
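
For illustration, both settings could be sketched in flink-conf.yaml roughly as 
follows. The 0.6 fraction is an assumed example value, not a recommendation 
(the documented default is 0.4), and the bloom filter line simply flips the 
option asked about below. Either change is worth measuring on the test server 
before rolling it out.

```yaml
# Sketch only: values are illustrative assumptions, not tuned recommendations.

# Fraction of total Flink memory reserved as managed memory, which the
# RocksDB state backend uses for its block cache and write buffers
# (default: 0.4).
taskmanager.memory.managed.fraction: 0.6

# Add a bloom filter to every newly created SST file (default: false).
state.backend.rocksdb.use-bloom-filter: true
```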



From: Kenan Kılıçtepe <kkilict...@gmail.com>
Date: Friday, 20 October 2023 at 14:51
To: user <user@flink.apache.org>
Subject: [EXTERNAL] Bloom Filter for Rocksdb




Can someone tell me the exact performance effect of enabling the bloom filter?
Could enabling it cause unpredictable performance problems?

I read up on what it is and how it works, and it makes sense, but I also wonder 
why the default value of state.backend.rocksdb.use-bloom-filter is false.

We have a 5-server Flink cluster processing real-time IoT data coming from 5 
million devices, and for a lot of jobs we keep separate state for each device.

Sometimes we have performance issues, and when I check the flame graph on the 
test server I always see that rocksdb.get() is the bottleneck. I just want to 
improve RocksDB performance.

Thanks
