Tianji,

Could you provide a third data point, running with RocksDB but without caching, i.e.:
> StateStoreSupplier stateStoreSupplier = Stores.create(storeName)
>     .withKeys(stringSerde)
>     .withValues(avroSerde)
>     .persistent()
>     .disableLogging()
>     .build();

Thanks
Eno

> On 15 Mar 2017, at 13:02, Tianji Li <skyah...@gmail.com> wrote:
>
> Hi there,
>
> It seems that the RocksDB state store is quite slow in my case and I wonder
> if I did anything wrong.
>
> I have a topic that I groupBy() and then aggregate() 50 times. That is, I
> will create 50 result topics and a lot more changelog and repartition
> topics.
>
> There are a few things that are weird and here I report one, which is the
> state store speed.
>
> If I use:
>
> StateStoreSupplier stateStoreSupplier = Stores.create(storeName)
>     .withKeys(stringSerde)
>     .withValues(avroSerde)
>     .inMemory()
>     .build();
>
> Then processing 1 million records takes around 5 minutes on my coding
> computer.
>
> If I use:
>
> StateStoreSupplier stateStoreSupplier = Stores.create(storeName)
>     .withKeys(stringSerde)
>     .withValues(avroSerde)
>     .persistent()
>     .disableLogging()
>     .enableCaching()
>     .build();
>
> Processing the same 1 million records takes around 10 minutes.
>
> I believe in the first case, changelog is backed up to Kafka and in the
> second case, only RocksDB is used.
>
> But why is RocksDB so slow?
>
> Eventually, I am hoping to do windowed aggregation and it seems I have to
> use RocksDB, but given the performance, I am hesitating.
>
> Thanks
> Tianji