Thanks. Which RocksDB files are growing indefinitely, the LOG files or the SST ones? Also, any chance you could use the latest streams library, 0.10.2.1, to check whether the problem still exists?
Eno

> On 16 May 2017, at 16:43, Vincent Bernardi <vinc...@kameleoon.com> wrote:
>
> Just tried setting compaction threads to 5, but I have the exact same
> problem: the RocksDB files get bigger and bigger, while my application
> never stores more than 200k K/V pairs.
>
> V.
>
> On Tue, May 16, 2017 at 5:22 PM, Vincent Bernardi <vinc...@kameleoon.com> wrote:
>
>> Hi Eno,
>> Thanks for your answer. I tried sending a follow-up email when I realised I
>> forgot to tell you the version number, but it must have fallen through.
>> I'm using 0.10.1.1 both for Kafka and for the streams library.
>> Currently my application works on 4 partitions and only uses about 100% of
>> one core, so I don't see how it could be CPU-starved. Still, I will of
>> course try your suggestion.
>>
>> Thanks again,
>> V.
>>
>> On Tue, May 16, 2017 at 5:15 PM, Eno Thereska <eno.there...@gmail.com> wrote:
>>
>>> Which version of Kafka are you using? It might be that RocksDB doesn't
>>> get enough resources to compact the data fast enough. If that's the case,
>>> you can try increasing the number of background compaction threads for
>>> RocksDB through the RocksDBConfigSetter class (see
>>> http://docs.confluent.io/current/streams/developer-guide.html#streams-developer-guide-rocksdb-config)
>>> by calling
>>> "options.setIncreaseParallelism(/* number of threads for compaction, e.g., 5 */)".
>>>
>>> Eno
>>>
>>>> On 16 May 2017, at 14:58, Vincent Bernardi <vinc...@kameleoon.com> wrote:
>>>>
>>>> Hi,
>>>> I'm running an experimental Kafka Streams Processor which accumulates lots
>>>> of data in a StateStoreSupplier during transform() and forwards lots of
>>>> data during punctuate() (and deletes it from the StateStoreSupplier). I'm
>>>> currently using a persistent StateStore, meaning that Kafka Streams
>>>> provides me with a RocksDB instance which writes everything to disk.
>>>> The average amount of data that I keep in my StateStore at any time is at
>>>> most 1GB.
>>>>
>>>> My problem is that it seems this data is never really deleted, as if
>>>> no compaction ever happened: the directory size for my RocksDB instance
>>>> goes ever up and eventually uses up all disk space, at which point my
>>>> application crashes (I've seen it go up to 60GB before I stopped it).
>>>>
>>>> Does anyone know whether this can be normal behaviour for RocksDB? Is there
>>>> any way I can manually log or trigger RocksDB compactions to see if
>>>> that is my problem?
>>>>
>>>> Thanks in advance for any pointers,
>>>> V.
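For readers landing on this thread: the transform()/punctuate() pattern Vincent describes could be sketched roughly as below, against the 0.10.x Transformer API. This is a hypothetical reconstruction, not his actual code; the store name "buffer-store" and the String types are made up for illustration.

```java
// Hypothetical sketch of the pattern in the original question (Kafka Streams
// 0.10.x API): accumulate entries in a persistent store during transform(),
// then forward and delete them during punctuate().
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueIterator;
import org.apache.kafka.streams.state.KeyValueStore;

public class BufferingTransformer
        implements Transformer<String, String, KeyValue<String, String>> {

    private ProcessorContext context;
    private KeyValueStore<String, String> store;

    @Override
    @SuppressWarnings("unchecked")
    public void init(final ProcessorContext context) {
        this.context = context;
        this.store = (KeyValueStore<String, String>) context.getStateStore("buffer-store");
        context.schedule(60_000L); // request punctuate() roughly every minute
    }

    @Override
    public KeyValue<String, String> transform(final String key, final String value) {
        store.put(key, value); // accumulate; forward nothing yet
        return null;
    }

    @Override
    public KeyValue<String, String> punctuate(final long timestamp) {
        // Forward everything accumulated so far, then delete it from the store.
        try (final KeyValueIterator<String, String> iter = store.all()) {
            while (iter.hasNext()) {
                final KeyValue<String, String> entry = iter.next();
                context.forward(entry.key, entry.value);
                store.delete(entry.key);
            }
        }
        return null;
    }

    @Override
    public void close() { }
}
```

One point worth noting about the disk-growth symptom: a RocksDB delete() only writes a tombstone marker; the space occupied by the old value is reclaimed only when compaction later rewrites the affected SST files, which is why slow or starved compaction can make the store directory grow far beyond the live data size.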
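Eno's suggestion in the thread, putting the pieces together, would look roughly like the following: a RocksDBConfigSetter implementation that calls setIncreaseParallelism on the RocksDB Options. This is a minimal sketch assuming the 0.10.x API; the class name is made up.

```java
// Minimal sketch of the RocksDBConfigSetter suggested in the thread:
// raise the number of background threads RocksDB may use for flushes
// and compactions, so tombstoned data gets compacted away faster.
import java.util.Map;

import org.rocksdb.Options;
import org.apache.kafka.streams.state.RocksDBConfigSetter;

public class CompactionParallelismConfigSetter implements RocksDBConfigSetter {

    @Override
    public void setConfig(final String storeName,
                          final Options options,
                          final Map<String, Object> configs) {
        // e.g., allow up to 5 background threads for compaction
        options.setIncreaseParallelism(5);
    }
}
```

The class would then be registered via the `rocksdb.config.setter` Streams config (`StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG`) so that it is applied to every RocksDB store the application creates.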