Hi Adrian,
Thank you for the additional information!
One reason to have a single folder is that Streams also stores metadata
that refers to all state stores in the state directory. That could be
changed if we have a good reason.
If you have a good idea to solve this issue, please feel free to open a
KIP. Would be glad to discuss such a KIP.
Best,
Bruno
On 19.05.22 15:40, Adrian Tubio wrote:
Hi Bruno,
Thanks a lot for your answer.
I have tried to tune store by store to the best of my ability, and indeed I
have managed to improve considerably. We even changed the disk to a much
faster one. But it's still not enough.
Yes, we can try dividing the application into sub-applications to make
use of different disks, but it feels like an artificial solution.
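
For reference, a minimal sketch of what that split could look like at the
configuration level. It uses only the standard config keys
"application.id", "bootstrap.servers" and "state.dir"; the application
ids, broker address and mount points are hypothetical placeholders, and
each Properties object is assumed to be handed to its own KafkaStreams
instance.

```java
import java.util.Properties;

public class SplitStateDirs {
    // Builds the configuration for one sub-application.
    // The ids and paths passed in below are hypothetical examples.
    static Properties subAppConfig(String applicationId, String stateDir) {
        Properties props = new Properties();
        props.put("application.id", applicationId);
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("state.dir", stateDir); // one state directory per application
        return props;
    }

    public static void main(String[] args) {
        // The busy store lives in its own application on its own volume.
        Properties busy = subAppConfig("busy-store-app", "/mnt/disk1/kafka-streams");
        // All remaining stores stay together on a second volume.
        Properties rest = subAppConfig("other-stores-app", "/mnt/disk2/kafka-streams");
        System.out.println(busy.getProperty("state.dir"));
        System.out.println(rest.getProperty("state.dir"));
    }
}
```

The two applications would then most likely have to exchange data through
a topic, which is part of why the split feels artificial.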
There might be reasons I don't know of to have a single folder for all
stores, but it feels limiting, especially if you consider that you can
plug in other types of stores instead of RocksDB, which don't even use
local disk.
If my CPU is OK, my memory is OK, and the only limiting factor is disk, why
not allow the use of multiple disks instead?
Especially in cloud deployments in which you can arbitrarily attach
multiple volumes, sometimes it is cheaper to use several cheaper volumes in
parallel than a single very expensive one.
I personally believe that this should be considered for a KIP.
Best regards,
Adrian Tubio
On Thu, May 19, 2022 at 1:49 PM Bruno Cadonna <cado...@apache.org> wrote:
Hi Adrian,
I am afraid that you cannot set the state directory for a single state
store to a different directory than all other stores.
Maybe the following blog post can help you debug and solve your issue:
https://www.confluent.io/blog/how-to-tune-rocksdb-kafka-streams-state-stores-performance
Specifically look at the section "High disk I/O and write stalls":
https://www.confluent.io/blog/how-to-tune-rocksdb-kafka-streams-state-stores-performance/#write-stalls
Best,
Bruno
On 19.05.22 10:56, Adrian Tubio wrote:
Hi there,
My Kafka Streams topology has one store that is particularly busy.
Together with the other stores in the same topology, it is exhausting disk
I/O, which leads to write stalls and increased latency.
This store does about 3 to 4 times as much compaction as the others, so we
were wondering whether, since we have more disks/volumes available, it
would be possible to set a different path for this store so that it lands
on a different disk.
I can't find any way to do it. Ideally it would be done via
RocksDBConfigSetter, but that doesn't seem to offer the possibility, as the
state store location seems to come from StateStoreContext, which is
initialized from the global STATE_DIR_CONFIG setting.
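
To illustrate the limitation, here is a rough sketch of how the on-disk
location of a store is derived. The layout
<state.dir>/<application.id>/<task id>/rocksdb/<store name> is my
understanding of the default for RocksDB stores and should be treated as an
assumption; the helper and all names in it are hypothetical, not Kafka
Streams API.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class StateDirLayout {
    // Hypothetical helper mirroring the assumed default layout of a
    // RocksDB-backed store under the single, application-wide state.dir.
    static Path storePath(String stateDir, String applicationId,
                          String taskId, String storeName) {
        return Paths.get(stateDir, applicationId, taskId, "rocksdb", storeName);
    }

    public static void main(String[] args) {
        String stateDir = "/var/lib/kafka-streams"; // value of state.dir (example)
        Path busy = storePath(stateDir, "my-app", "0_0", "busy-store");
        Path quiet = storePath(stateDir, "my-app", "0_0", "quiet-store");
        // Both stores resolve under the same root, hence the same disk.
        System.out.println(busy);
        System.out.println(quiet);
    }
}
```

Since every store path hangs off the same root, there is no per-store knob
to point just one of them at another volume.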
Has anyone done something similar?
Best regards,
Adrian Tubio