Which version of Kafka are you using? It might be that RocksDB doesn't get enough resources to compact the data fast enough. If that's the case, you can try increasing the number of background compaction threads for RocksDB through the RocksDBConfigSetter class (see http://docs.confluent.io/current/streams/developer-guide.html#streams-developer-guide-rocksdb-config) by calling options.setIncreaseParallelism(/* number of threads for compaction, e.g., 5 */).
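For reference, a minimal config setter could look like the sketch below (untested, and the class name is just illustrative; I'm assuming the current RocksDBConfigSetter interface, where setConfig() receives the store name, the Options, and the configs map):

    import java.util.Map;
    import org.rocksdb.Options;
    import org.apache.kafka.streams.state.RocksDBConfigSetter;

    // Illustrative sketch: raise the size of the background thread pool
    // that RocksDB uses for flushes and compactions.
    public class CompactionParallelismConfigSetter implements RocksDBConfigSetter {
        @Override
        public void setConfig(final String storeName,
                              final Options options,
                              final Map<String, Object> configs) {
            // Number of threads for compaction, e.g., 5 -- tune to your hardware.
            options.setIncreaseParallelism(5);
        }
    }

You then point Streams at it via the "rocksdb.config.setter" property:

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    Properties props = new Properties();
    props.put(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG,
              CompactionParallelismConfigSetter.class);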
Eno

> On 16 May 2017, at 14:58, Vincent Bernardi <vinc...@kameleoon.com> wrote:
>
> Hi,
> I'm running an experimental Kafka Streams processor which accumulates
> lots of data in a StateStoreSupplier during transform() and forwards
> lots of data during punctuate() (deleting it from the StateStoreSupplier
> as it goes). I'm currently using a persistent StateStore, meaning that
> Kafka Streams provides me with a RocksDB instance which writes
> everything to disk. The average amount of data I keep in my StateStore
> at any time is at most 1 GB.
>
> My problem is that this data never seems to be actually deleted, as if
> no compaction ever happened: the directory size for my RocksDB instance
> keeps growing and eventually uses up all disk space, at which point my
> application crashes (I've seen it reach 60 GB before I stopped it).
>
> Does anyone know whether this can be normal behaviour for RocksDB? Is
> there any way I can log or manually trigger RocksDB compactions to see
> whether that is my problem?
>
> Thanks in advance for any pointers,
> V.