Hey Yaroslav,

Unfortunately, I don't have enough knowledge to give you an educated
reply. The first part certainly makes sense to me, but I am not sure
how to mitigate the issue. I am CCing Yun Tang, who has worked more on
the RocksDB state backend (it might take him a while to answer, though,
as he is on vacation right now).

Best,

Dawid

On 14/02/2021 06:57, Yaroslav Tkachenko wrote:
> Hello,
>
> I observe throughput degradation once my pipeline's block cache usage
> reaches the allocated maximum.
>
> The pipeline is consuming from a few Kafka topics at a high rate
> (100k+ rec/s). Almost every processed message results in a (keyed)
> state read with an optional write. I've enabled the native RocksDB
> metrics and noticed that everything stays stable until the block
> cache usage reaches its maximum. If I understand correctly, this
> makes sense: the block cache is used for all reads, and a cache miss
> means reading data from disk, which is much slower (I haven't
> switched to SSDs yet). Does that sound right?
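>
> For reference, here is roughly how I enabled them in flink-conf.yaml
> (a minimal sketch, showing just the block-cache-related options from
> Flink's native RocksDB metrics settings):
>
>     # expose the native RocksDB block cache metrics
>     state.backend.rocksdb.metrics.block-cache-capacity: true
>     state.backend.rocksdb.metrics.block-cache-usage: true
>     state.backend.rocksdb.metrics.block-cache-pinned-usage: true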
>
> One thing I know about the messages I consume: I expect very few keys
> to be active simultaneously; most of them can be treated as cold. So
> I'd love the RocksDB block cache to have a TTL option (say, 30
> minutes), which, I imagine, could solve this issue by guaranteeing
> that only active keys are kept in memory. I don't feel like LRU is
> doing a very good job here... I couldn't find any option like that,
> but I'm wondering if someone could recommend something similar.
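>
> The closest thing I could find is Flink's state-level TTL
> (StateTtlConfig), but that expires the state entries themselves
> rather than just evicting cold ones from the cache, so I'm not sure
> it's the right tool. A minimal sketch of what I mean (the descriptor
> name and type are just placeholders):
>
>     // imports: org.apache.flink.api.common.state.StateTtlConfig,
>     //          org.apache.flink.api.common.state.ValueStateDescriptor,
>     //          org.apache.flink.api.common.time.Time
>
>     StateTtlConfig ttlConfig = StateTtlConfig
>         .newBuilder(Time.minutes(30))
>         .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
>         .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
>         // drop expired entries during RocksDB compaction
>         .cleanupInRocksdbCompactFilter(1000)
>         .build();
>
>     ValueStateDescriptor<String> descriptor =
>         new ValueStateDescriptor<>("my-state", String.class);
>     descriptor.enableTimeToLive(ttlConfig);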
>
> Thank you!
>
> -- 
> Yaroslav Tkachenko
> https://sap1ens.com
