Hi Banu,

I'll try to answer your questions briefly:

1. Yes, when the memtable reaches the size you configured, a flush is
triggered. And no, SST files use a different format from memtables, so the
resulting file is smaller than 64 MB, IIUC.

2. Typically you don't need to change this value. If it is set to 2, then
while one write buffer is being flushed to storage, new writes can continue
into the other write buffer. Increase it only when flushing is too slow.

3. IIUC, a bloom filter helps with point queries, and window processing
requires point queries, so enabling it should help.

4. I'd suggest not setting this to 0. It only controls whether checkpoint
data below the threshold is stored inline in the metadata file. The
checkpoint size may differ slightly, but it has nothing to do with
throughput.
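
To put rough numbers on points 1 and 2, here is a small back-of-the-envelope
sketch in Python. It is only an estimate under assumptions that are not
confirmed for your job: that each registered state gets its own column
family with its own write buffers, that each slot runs one RocksDB instance,
and that managed memory is not capping the total. The function name and
num_states are hypothetical.

```python
# Rough upper bound on memtable memory per TaskManager.
# Assumptions (illustrative only): one column family per state, one RocksDB
# instance per slot, and no managed-memory cap on the write buffers.

MB = 1024 * 1024


def memtable_upper_bound(write_buffer_size, write_buffer_count,
                         num_states, slots_per_tm):
    """Return the worst-case bytes held in memtables on one TaskManager."""
    per_state = write_buffer_size * write_buffer_count  # active + flushing
    per_slot = per_state * num_states                   # one CF per state
    return per_slot * slots_per_tm


# Values from this thread: 64 MB buffers, count = 2, 5 slots per TaskManager.
# num_states=1 is a hypothetical value for a single window state.
bound = memtable_upper_bound(64 * MB, 2, num_states=1, slots_per_tm=5)
print(f"{bound // MB} MB per TaskManager")
```

Raising the write buffer count scales this bound linearly, which is one
reason to increase it only when flushes are actually falling behind.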


Best,
Zakelly

On Thu, Jul 25, 2024 at 3:25 PM banu priya <banuke...@gmail.com> wrote:

> Hi All,
>
> I have a Flink job with an RMQ source, filters, a tumbling window (uses
> processing time, fires every 2s), an aggregator, and an RMQ sink. I enabled
> incremental RocksDB checkpoints every 10s with a minimum pause between
> checkpoints of 5s. My checkpoint size keeps increasing, so I am planning to
> tune some RocksDB configuration.
>
>
>
> These are my questions. Can someone help me choose the correct values?
>
>
>
> 1.state.backend.rocksdb.writebuffer.size = 64 mb:
>
> Does it mean that once the write buffer (memtable) reaches 64 MB it will be
> flushed to disk as an .sst file? Will the .sst file also be 64 MB in size?
>
>
>
> 2.state.backend.rocksdb.writebuffer.count = 2.
>
> My job runs with a parallelism of 15 on 3 TaskManagers (so 5 slots per
> TaskManager). For a single RocksDB folder, how can I choose the correct
> buffer count?
>
> 3. Do I need to enable the bloom filter?
>
> 4. state.storage.fs.memory-threshold is 0 in my job. Does it have any
> effect on TaskManager throughput or checkpoint size?
>
> Thanks
>
> Banu
>
