Hey Yaroslav,

Unfortunately, I don't have enough knowledge to give you an educated reply. The first part certainly makes sense to me, but I am not sure how to mitigate the issue. I am cc'ing Yun Tang, who has worked more on the RocksDB state backend (it might take him a while to answer, though, as he is on vacation right now).
Best,
Dawid

On 14/02/2021 06:57, Yaroslav Tkachenko wrote:
> Hello,
>
> I observe throughput degradation when my pipeline reaches the maximum
> of the allocated block cache.
>
> The pipeline is consuming from a few Kafka topics at a high rate
> (100k+ rec/s). Almost every processed message results in a (keyed)
> state read with an optional write. I've enabled native RocksDB metrics
> and noticed that everything stays stable until the block cache usage
> reaches its maximum. If I understand correctly, this makes sense: the
> cache is used for all reads, and cache misses can mean reading data
> from disk, which is much slower (I haven't switched to SSDs yet). Does
> that make sense?
>
> One thing I know about the messages I consume: I expect very few keys
> to be active simultaneously; most of them can be treated as cold. So
> I'd love the RocksDB block cache to have a TTL option (say, 30
> minutes), which, I imagine, could solve this issue by guaranteeing
> that only active keys are kept in memory. I don't feel like LRU is
> doing a very good job here... I couldn't find any option like that,
> but I'm wondering if someone could recommend something similar.
>
> Thank you!
>
> --
> Yaroslav Tkachenko
> sap1ens.com <https://sap1ens.com>
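One avenue worth exploring: the RocksDB block cache itself has no TTL option, but Flink's state TTL can approximate the goal by expiring cold state entries outright, which shrinks the total state size and therefore the cache pressure. A minimal sketch, assuming a keyed ValueState and the 30-minute threshold mentioned above (the state name and the cleanup threshold are illustrative, not recommendations):

import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.time.Time;

// Expire entries 30 minutes after the last read or write, so cold
// keys are eventually removed from RocksDB altogether.
StateTtlConfig ttlConfig = StateTtlConfig
    .newBuilder(Time.minutes(30))
    .setUpdateType(StateTtlConfig.UpdateType.OnReadAndWrite)
    .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
    // Drop expired entries during RocksDB compaction; 1000 is how many
    // entries the compaction filter processes before re-querying the
    // current timestamp.
    .cleanupInRocksdbCompactFilter(1000)
    .build();

ValueStateDescriptor<Long> descriptor =
    new ValueStateDescriptor<>("my-state", Long.class);
descriptor.enableTimeToLive(ttlConfig);

Note this is not the same as a cache TTL: expired state is gone for good, so it only fits if cold keys can genuinely be dropped and rebuilt later.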
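Separately, the cache-related native metrics (e.g. state.backend.rocksdb.metrics.block-cache-usage) can help confirm the cache is the bottleneck, and the cache itself can be tuned through a custom RocksDBOptionsFactory. A sketch, assuming Flink 1.10+; the 256 MB size is a placeholder to be sized against the actual memory budget:

import java.util.Collection;
import org.apache.flink.contrib.streaming.state.RocksDBOptionsFactory;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.DBOptions;

public class BlockCacheTuningFactory implements RocksDBOptionsFactory {

    @Override
    public DBOptions createDBOptions(
            DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
        return currentOptions; // no DB-level changes needed here
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(
            ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
        BlockBasedTableConfig tableConfig = new BlockBasedTableConfig()
                // Placeholder size; align with the managed memory budget.
                .setBlockCacheSize(256 * 1024 * 1024L)
                // Account for index/filter blocks inside the cache so
                // their memory use is bounded by it.
                .setCacheIndexAndFilterBlocks(true)
                .setPinL0FilterAndIndexBlocksInCache(true);
        return currentOptions.setTableFormatConfig(tableConfig);
    }
}

The factory can be registered with RocksDBStateBackend#setRocksDBOptions or via state.backend.rocksdb.options-factory in flink-conf.yaml.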