Re: Streams RocksDB State Store Disk Usage

2016-07-07 Thread Guozhang Wang
I find the RocksDB tuning guide quite useful regarding write / space amplification: https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide Guozhang On Thu, Jun 30, 2016 at 8:36 AM, Avi Flax wrote: > On Jun 29, 2016, at 22:44, Guozhang Wang wrote: > > > > One way to mentally

Re: Streams RocksDB State Store Disk Usage

2016-06-30 Thread Avi Flax
On Jun 29, 2016, at 22:44, Guozhang Wang wrote: > > One way to mentally quantify your state store usage is to consider the > total key space in your reduceByKey() operator, and multiply by the average > key-value pair size. Then you need to consider the RocksDB write / space > amplification facto

Re: Streams RocksDB State Store Disk Usage

2016-06-29 Thread Guozhang Wang
Hello Avi, One way to mentally quantify your state store usage is to consider the total key space in your reduceByKey() operator, and multiply by the average key-value pair size. Then you need to consider the RocksDB write / space amplification factor as well. Currently Kafka Streams hard-write s
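
The back-of-the-envelope estimate described above (key space times average key-value pair size, scaled by an amplification factor) could be sketched roughly as follows. This is only an illustration, not anything Kafka Streams or RocksDB provides: the class name, method name, and the sample numbers (10 million keys, 200 bytes per pair, a factor of 1.5) are all hypothetical, and the real space amplification depends on how RocksDB is configured and on the workload.

```java
// Hypothetical helper for a rough state store size estimate.
// Nothing here is part of the Kafka Streams or RocksDB APIs.
final class StateStoreEstimate {
    private StateStoreEstimate() {}

    /**
     * Estimates on-disk bytes as distinctKeys * avgPairBytes * spaceAmplification.
     *
     * @param distinctKeys       assumed size of the key space of the aggregation
     * @param avgPairBytes       assumed average serialized key-value pair size
     * @param spaceAmplification assumed RocksDB space amplification factor
     */
    static long estimateBytes(long distinctKeys, long avgPairBytes, double spaceAmplification) {
        if (distinctKeys < 0 || avgPairBytes < 0 || spaceAmplification <= 0.0) {
            throw new IllegalArgumentException("inputs must be non-negative and the factor positive");
        }
        // Logical data size before any amplification; fails loudly on overflow.
        long logicalBytes = Math.multiplyExact(distinctKeys, avgPairBytes);
        // Scale by the assumed amplification factor and round up.
        return (long) Math.ceil(logicalBytes * spaceAmplification);
    }

    public static void main(String[] args) {
        // Sample inputs are made up for illustration only.
        long bytes = estimateBytes(10_000_000L, 200L, 1.5);
        System.out.println("Estimated state store size: " + bytes + " bytes");
    }
}
```

Comparing such an estimate against the actual size of the state directory on disk gives a quick sanity check on whether observed usage is in the expected range.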

Re: Streams RocksDB State Store Disk Usage

2016-06-29 Thread Avi Flax
On Jun 29, 2016, at 14:15, Matthias J. Sax wrote: > > If you use window-operations, windows are kept until their retention > time expires. Thus, reducing the retention time should decrease the > memory RocksDB needs to preserve windows. Thanks Matthias, that makes sense and I appreciate all the

Re: Streams RocksDB State Store Disk Usage

2016-06-29 Thread Avi Flax
On Jun 29, 2016, at 11:49, Eno Thereska wrote: > These are internal files to RocksDB. Yeah, that makes sense. However, since Streams is encapsulating/employing RocksDB, in my view it’s Streams’ responsibility to configure RocksDB well with good defaults and/or at least provide a way for me to

Re: Streams RocksDB State Store Disk Usage

2016-06-29 Thread Matthias J. Sax
One thing I want to add: If you use window-operations, windows are kept until their retention time expires. Thus, reducing the retention time should decrease the memory RocksDB needs to preserve windows. See http://docs.confluent.io/3.0.0/streams/developer-guide.html?highlight=retention#windowin

Re: Streams RocksDB State Store Disk Usage

2016-06-29 Thread Eno Thereska
Hi Avi, These are internal files to RocksDB. Depending on your load in the system I suppose they could contain quite a bit of data. How large was the load in the system these past two weeks so we can calibrate? Otherwise I'm not sure if 1-2GB is a lot or not (sounds like not that big to make the