Hi Vincent,

Could you share your code, specifically the part where you write to the state store and then delete? I'm wondering if you have iterators in your code that need to be closed (via close()).
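For illustration, here is a rough sketch of the accumulate-in-transform() / forward-and-delete-in-punctuate() pattern with the iterator closed via try-with-resources. The store name, key/value types and class name are only placeholders, not taken from your code:

import java.util.ArrayList;
import java.util.List;

import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueIterator;
import org.apache.kafka.streams.state.KeyValueStore;

public class ForwardAndDeleteTransformer
        implements Transformer<String, String, KeyValue<String, String>> {

    private ProcessorContext context;
    private KeyValueStore<String, String> store;

    @Override
    @SuppressWarnings("unchecked")
    public void init(final ProcessorContext context) {
        this.context = context;
        // "my-store" is a placeholder for whatever name the StateStoreSupplier was registered under
        this.store = (KeyValueStore<String, String>) context.getStateStore("my-store");
    }

    @Override
    public KeyValue<String, String> transform(final String key, final String value) {
        store.put(key, value);   // accumulate until the next punctuate()
        return null;
    }

    @Override
    public KeyValue<String, String> punctuate(final long timestamp) {
        final List<String> toDelete = new ArrayList<>();
        // try-with-resources guarantees the RocksDB-backed iterator is closed,
        // even if forward() throws; unclosed iterators pin RocksDB resources
        try (KeyValueIterator<String, String> iter = store.all()) {
            while (iter.hasNext()) {
                final KeyValue<String, String> entry = iter.next();
                context.forward(entry.key, entry.value);
                toDelete.add(entry.key);
            }
        }
        // delete after the iterator is closed
        for (final String key : toDelete) {
            store.delete(key);
        }
        return null;
    }

    @Override
    public void close() { }
}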
Eno

> On 16 May 2017, at 16:22, Vincent Bernardi <vinc...@kameleoon.com> wrote:
>
> Hi Eno,
> Thanks for your answer. I tried sending a follow-up email when I realised I
> forgot to tell you the version number, but it must have fallen through.
> I'm using 0.10.1.1 both for Kafka and for the streams library.
> Currently my application works on 4 partitions and only uses about 100% of
> one core, so I don't see how it could be CPU-starved. Still, I will of
> course try your suggestion.
>
> Thanks again,
> V.
>
>
> On Tue, May 16, 2017 at 5:15 PM, Eno Thereska <eno.there...@gmail.com> wrote:
>
>> Which version of Kafka are you using? It might be that RocksDB doesn't get
>> enough resources to compact the data fast enough. If that's the case, you
>> can try increasing the number of background compaction threads for RocksDB
>> through the RocksDbConfigSetter class (see
>> http://docs.confluent.io/current/streams/developer-guide.html#streams-developer-guide-rocksdb-config)
>> by calling "options.setIncreaseParallelism(/* number of threads for
>> compaction, e.g., 5 */)".
>>
>> Eno
>>
>>> On 16 May 2017, at 14:58, Vincent Bernardi <vinc...@kameleoon.com> wrote:
>>>
>>> Hi,
>>> I'm running an experimental Kafka Streams processor which accumulates lots
>>> of data in a StateStoreSupplier during transform() and forwards lots of
>>> data during punctuate() (and deletes it from the StateStoreSupplier). I'm
>>> currently using a persistent StateStore, meaning that Kafka Streams
>>> provides me with a RocksDB instance which writes everything to disk. The
>>> average amount of data that I keep in my StateStore at any time is at most
>>> 1GB.
>>>
>>> My problem is that it seems this data is never really deleted, as if
>>> no compaction ever happened: the directory size for my RocksDB instance
>>> keeps growing and eventually uses up all disk space, at which point my
>>> application crashes (I've seen it reach 60GB before I stopped it).
>>>
>>> Does anyone know whether this can be normal behaviour for RocksDB? Is there
>>> any way I can manually log or trigger RocksDB compactions to see if
>>> that is my problem?
>>>
>>> Thanks in advance for any pointers,
>>> V.
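For completeness, a minimal sketch of the RocksDbConfigSetter suggestion from the earlier reply; the class name and the thread count of 5 are illustrative, not a recommendation:

import java.util.Map;

import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.Options;

public class CompactionParallelismConfigSetter implements RocksDBConfigSetter {

    @Override
    public void setConfig(final String storeName,
                          final Options options,
                          final Map<String, Object> configs) {
        // give RocksDB more background threads so compaction can keep up with deletes
        options.setIncreaseParallelism(5);
    }
}

It would then be wired in through the streams configuration, e.g.:

props.put(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG, CompactionParallelismConfigSetter.class);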