I just upgraded Kafka Streams to 0.10.2.1 and have the exact same symptom: new SST files keep getting created and old ones are never deleted. Note that when I cleanly exit my streams application, all disk space is almost instantly reclaimed and the total size of the database drops to roughly the amount of data I actually store in it (~1GB).
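For reference, this is roughly how I plugged in the extra compaction threads that Eno suggested earlier in the thread (a minimal sketch: the class name is mine and 5 is just the value I tried):

import java.util.Map;

import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.Options;

// Sketch of the config setter I used; it only widens RocksDB's background
// thread pool so flushes/compactions are less likely to fall behind writes.
public class CompactionConfigSetter implements RocksDBConfigSetter {
    @Override
    public void setConfig(final String storeName, final Options options,
                          final Map<String, Object> configs) {
        options.setIncreaseParallelism(5); // value I tried, per the thread below
    }
}

It is registered in the streams configuration via
props.put(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG, CompactionConfigSetter.class);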
V.

On Tue, May 16, 2017 at 10:13 PM, Eno Thereska <eno.there...@gmail.com> wrote:

> 0.10.2.1 is compatible with Kafka 0.10.1.
>
> Eno
>
> On 16 May 2017, at 20:45, Vincent Bernardi <vinc...@kameleoon.com> wrote:
>
> > The LOG files stay small. The SST files are growing, not in size but in
> > number. Old .sst files seem never to be written to anymore, yet they are
> > not deleted and new ones appear regularly.
> > I can certainly try streams 0.10.2.1 if it's compatible with Kafka 0.10.1;
> > I have not checked the compatibility matrix yet.
> >
> > Thanks for the help,
> > V.
> >
> > On Tue, 16 May 2017 at 17:57, Eno Thereska <eno.there...@gmail.com> wrote:
> >
> >> Thanks. Which RocksDB files are growing indefinitely, the LOG or SST ones?
> >> Also, any chance you could use the latest streams library 0.10.2.1 to
> >> check whether the problem still exists?
> >>
> >> Eno
> >>
> >> On 16 May 2017, at 16:43, Vincent Bernardi <vinc...@kameleoon.com> wrote:
> >>
> >>> Just tried setting compaction threads to 5, but I have the exact same
> >>> problem: the RocksDB files get bigger and bigger, while my application
> >>> never stores more than 200k K/V pairs.
> >>>
> >>> V.
> >>>
> >>> On Tue, May 16, 2017 at 5:22 PM, Vincent Bernardi <vinc...@kameleoon.com> wrote:
> >>>
> >>>> Hi Eno,
> >>>> Thanks for your answer. I tried sending a follow-up email when I
> >>>> realised I forgot to tell you the version number, but it must have
> >>>> fallen through. I'm using 0.10.1.1 both for Kafka and for the streams
> >>>> library. Currently my application works on 4 partitions and only uses
> >>>> about 100% of one core, so I don't see how it could be CPU-starved.
> >>>> Still, I will of course try your suggestion.
> >>>>
> >>>> Thanks again,
> >>>> V.
> >>>>
> >>>> On Tue, May 16, 2017 at 5:15 PM, Eno Thereska <eno.there...@gmail.com> wrote:
> >>>>
> >>>>> Which version of Kafka are you using? It might be that RocksDB doesn't
> >>>>> get enough resources to compact the data fast enough. If that's the
> >>>>> case, you can try increasing the number of background compaction
> >>>>> threads for RocksDB through the RocksDBConfigSetter class (see
> >>>>> http://docs.confluent.io/current/streams/developer-guide.html#streams-developer-guide-rocksdb-config)
> >>>>> by calling "options.setIncreaseParallelism(/* number of threads for
> >>>>> compaction, e.g., 5 */)".
> >>>>>
> >>>>> Eno
> >>>>>
> >>>>> On 16 May 2017, at 14:58, Vincent Bernardi <vinc...@kameleoon.com> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>> I'm running an experimental Kafka Streams processor which accumulates
> >>>>>> lots of data in a StateStoreSupplier during transform() and forwards
> >>>>>> lots of data during punctuate() (and deletes it from the
> >>>>>> StateStoreSupplier). I'm currently using a persistent StateStore,
> >>>>>> meaning that Kafka Streams provides me with a RocksDB instance which
> >>>>>> writes everything to disk. The average amount of data that I keep in
> >>>>>> my StateStore at any time is at most 1GB.
> >>>>>>
> >>>>>> My problem is that this data never seems to be really deleted, as if
> >>>>>> no compaction ever happened: the directory size for my RocksDB
> >>>>>> instance grows continuously and eventually uses up all disk space, at
> >>>>>> which point my application crashes (I've seen it reach 60GB before I
> >>>>>> stopped it).
> >>>>>>
> >>>>>> Does anyone know whether this can be normal behaviour for RocksDB? Is
> >>>>>> there any way that I can manually log or trigger RocksDB compactions
> >>>>>> to see if that is my problem?
> >>>>>>
> >>>>>> Thanks in advance for any pointers,
> >>>>>> V.
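P.S. On my earlier question about making compactions visible: one way to see whether compactions run at all should be to have RocksDB dump its own statistics into each store's LOG file, again through the config-setter hook. A rough sketch (class name is mine, the 60-second interval is arbitrary):

import java.util.Map;

import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.InfoLogLevel;
import org.rocksdb.Options;

// Rough sketch: raise RocksDB's own log verbosity and dump internal stats
// periodically, so compaction activity shows up in the per-store LOG file.
public class VerboseRocksDBConfigSetter implements RocksDBConfigSetter {
    @Override
    public void setConfig(final String storeName, final Options options,
                          final Map<String, Object> configs) {
        options.setInfoLogLevel(InfoLogLevel.INFO_LEVEL);
        options.setStatsDumpPeriodSec(60); // arbitrary interval
    }
}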