Re: Threads waiting on LocalBufferPool

2016-04-22 Thread Maciek Próchniak
On 21/04/2016 16:46, Aljoscha Krettek wrote: Hi, I would be very happy about improvements to our RocksDB performance. What are the RocksDB Java benchmarks that you are running? In Flink, we also have to serialize/deserialize every time that we access RocksDB using our TypeSerializer. Maybe t

Re: Threads waiting on LocalBufferPool

2016-04-21 Thread Aljoscha Krettek
Hi, I would be very happy about improvements to our RocksDB performance. What are the RocksDB Java benchmarks that you are running? In Flink, we also have to serialize/deserialize every time that we access RocksDB using our TypeSerializer. Maybe this is causing the slowdown. By the way, what is t

Re: Threads waiting on LocalBufferPool

2016-04-21 Thread Maciek Próchniak
Well... I found some time to look at RocksDB performance. It takes around 0.4 ms to look up value state and 0.12 ms to update it. These are means; the 95th percentile was > 1 ms for get... When I set additional options: .setIncreaseParallelism(8) .setMaxOpenFiles(-1) .setCo
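
To put the mean latencies above in perspective, here is a rough back-of-the-envelope sketch of the throughput ceiling they imply. It assumes each record costs exactly one state get and one state update, executed sequentially per thread, and it ignores serialization and everything else in the pipeline. The function name and the parallelism of 24 (one thread per Kafka partition) are illustrative assumptions, not measurements from the thread.

```python
def max_records_per_second(get_ms: float, update_ms: float, parallelism: int = 1) -> float:
    """Upper bound on throughput if every record needs one get and one update.

    Assumption: state accesses run sequentially within a thread and
    nothing else contributes to per-record cost.
    """
    per_record_ms = get_ms + update_ms
    if per_record_ms <= 0:
        raise ValueError("latencies must add up to a positive value")
    return parallelism * 1000.0 / per_record_ms


# Mean latencies quoted in the message: 0.4 ms per get, 0.12 ms per update.
per_thread = max_records_per_second(0.4, 0.12)

# Hypothetical parallelism of 24, i.e. one thread per Kafka partition.
all_threads = max_records_per_second(0.4, 0.12, parallelism=24)

print(f"per thread:  {per_thread:.0f} records/s")
print(f"24 threads:  {all_threads:.0f} records/s")
```

If the source can deliver records faster than this bound, the upstream threads will eventually block waiting for free buffers, which would be consistent with the symptom in the subject line.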

Re: Threads waiting on LocalBufferPool

2016-04-20 Thread Maciek Próchniak
Hi Ufuk, thanks for the quick reply. Actually I had a little time to try both things. 1) helped only temporarily - it just took a bit longer to saturate the pool. After a few minutes, periodically all Kafka threads were waiting for bufferPool. 2) This seemed to help. I also reduced checkpoint interva

Re: Threads waiting on LocalBufferPool

2016-04-20 Thread Ufuk Celebi
Could be different things actually, including the parts of the network you mentioned. 1) Regarding the TM config: - It can help to increase the number of network buffers (you can go ahead and give it 4 GB, e.g. 131072 buffers of 32 KB each) - In general, you have way more memory available than you a
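
As a quick check of the buffer arithmetic: 4 GB divided into 32 KB buffers comes to 131072 buffers. The sketch below shows the calculation; the helper name is made up for illustration, and the 32 KB buffer size is taken from the message rather than read from any configuration.

```python
KIB = 1024
GIB = 1024 ** 3


def network_buffer_count(total_bytes: int, buffer_bytes: int = 32 * KIB) -> int:
    """Number of whole buffers of size buffer_bytes that fit into total_bytes."""
    if buffer_bytes <= 0:
        raise ValueError("buffer_bytes must be positive")
    return total_bytes // buffer_bytes


# 4 GB of network memory split into 32 KB buffers.
buffers_for_4_gib = network_buffer_count(4 * GIB)
print(buffers_for_4_gib)  # 131072
```

In Flink releases of that period the resulting count would typically go into taskmanager.network.numberOfBuffers in flink-conf.yaml; check the configuration reference for the version in use before relying on that key.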

Threads waiting on LocalBufferPool

2016-04-20 Thread Maciek Próchniak
Hi, I'm running my Flink job on one rather large machine (20 cores with hyperthreading, 120GB RAM). Task manager has 20GB heap allocated. It does more or less: read CSV from Kafka -> keyBy one of the fields -> some custom state processing. Kafka topic has 24 partitions, so my parallelism is al