Hello Avi,

One way to roughly estimate your state store usage is to take the total key space of your reduceByKey() operator and multiply it by the average key-value pair size. On top of that you need to account for RocksDB's write / space amplification factor as well.
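As a concrete back-of-envelope calculation (all numbers below are made-up assumptions for illustration, not measurements from your app), something like:

public class StateStoreEstimate {
    public static void main(String[] args) {
        long numKeys = 10_000_000L;      // distinct keys seen by reduceByKey()
        long avgKeyBytes = 36L;          // e.g. a UUID rendered as a string
        long avgValueBytes = 200L;       // serialized size of the aggregated value
        double spaceAmplification = 1.5; // assumed RocksDB space-amplification factor

        double rawBytes = (double) numKeys * (avgKeyBytes + avgValueBytes);
        double estimatedBytes = rawBytes * spaceAmplification;
        // 10M keys * 236 bytes * 1.5 ~= 3.3 GB on disk for this hypothetical workload
        System.out.printf("estimated store size: ~%.2f GB%n",
                estimatedBytes / (1024 * 1024 * 1024));
    }
}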
Currently Kafka Streams hard-codes some RocksDB config values, such as the block size, to achieve good write performance at the cost of write amplification, but we are now working on exposing those configs so that users can override them themselves:

https://issues.apache.org/jira/browse/KAFKA-3740

(A rough sketch of what such an override hook might look like is included after the quoted message below.)

Guozhang

On Wed, Jun 29, 2016 at 11:59 AM, Avi Flax <avi.f...@parkassist.com> wrote:

> On Jun 29, 2016, at 14:15, Matthias J. Sax <matth...@confluent.io> wrote:
> >
> > If you use window operations, windows are kept until their retention
> > time expires. Thus, reducing the retention time should decrease the
> > memory RocksDB needs to preserve windows.
>
> Thanks Matthias, that makes sense and I appreciate all the helpful
> pointers! This is really good to know. However, the app that’s generating
> the large RocksDB log files is not using windowing, just basic aggregation
> with reduceByKey.
>
> Thanks!
> Avi

--
-- Guozhang
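Following up on the KAFKA-3740 pointer above, here is a minimal sketch of what a user-supplied RocksDB override hook could look like, assuming a RocksDBConfigSetter-style callback; the class name, config key, and tuning values below track the proposal and are illustrative, so they may differ from whatever is eventually released:

import java.util.Map;

import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.Options;

// Example tuning hook: override the block size and memtable size that Kafka
// Streams would otherwise hard-code. The values are placeholders, not
// recommendations.
public class CustomRocksDBConfig implements RocksDBConfigSetter {
    @Override
    public void setConfig(final String storeName, final Options options,
                          final Map<String, Object> configs) {
        final BlockBasedTableConfig tableConfig = new BlockBasedTableConfig();
        tableConfig.setBlockSize(16 * 1024L);           // 16 KB data blocks
        options.setTableFormatConfig(tableConfig);
        options.setWriteBufferSize(32 * 1024 * 1024L);  // 32 MB memtables
    }
}

The idea is that you would then point a streams config such as "rocksdb.config.setter" at this class so it gets applied to every state store; the exact wiring is still being worked out in the JIRA.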