Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-20 Thread Guozhang Wang
First about the metrics attributes, now I remembered there is indeed a change as in https://cwiki.apache.org/confluence/display/KAFKA/KIP-105%3A+Addition+of+Recording+Level+for+Sensors We have added a hierarchy to the sensors, and currently there are only two levels: INFO and DEBUG. Along with it

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-17 Thread Sachin Mittal
Hi, Regarding 0. "frequent rebalance and dying of threads": did you see any warn / error log entries or exceptions when threads die or rebalance is triggered? The exception we get is CommitFailedException and then on partition revoke also it throws CommitFailedException and stream thread is killed

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-17 Thread Guozhang Wang
Hi Sachin, 0. "frequent rebalance and dying of threads": did you see any warn / error log entries or exceptions when threads die or rebalance is triggered? My guess is that you are likely hitting this issue: https://issues.apache.org/jira/browse/KAFKA-3775 The scenario is that, when you are sta

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-17 Thread Sachin Mittal
Hi, I am now starting with rocksdb monitoring and comparing 2 identical streams with 2 identical input topics. Case1 Single partition topic with single thread streams apps running 0.10.1.1 This case is pretty stable with hardly any cpu wait time noticed. Case 2 12 partition topic with 3 instances

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-12 Thread Guozhang Wang
Hi Sachin, About the CommitFailedException error and the potential memory leakage, I also would like to know which one may be causing the other. So is it: 1) You saw frequent CommitFailedException, and you handled them in the exception handler by re-creating the instances, and then you observe me

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-10 Thread Mathieu Fenniak
Hi Sachin, Streams apps can be configured with a rocksdb.config.setter, which is a class name that needs to implement the org.apache.kafka.streams.state.RocksDBConfigSetter interface, which can be used to reduce the memory utilization of RocksDB. Here's an example class that trims it way down (not

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-09 Thread Sachin Mittal
Hi, We are running rocksdb with default configuration. I would try to monitor the rocks db, I do see the beans when I connect via jmx client. We use rocks db for aggregation. Our pipe line is: input .groupByKey() .aggregate(new Initializer>() { public SortedSet apply() { return new Tr

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-09 Thread Guozhang Wang
Sachin, Thanks for sharing your observations, they are very helpful. Regarding monitoring, there are indeed metrics inside the Streams library to meter state store operations; for rocksdb it records the average latency and call rate for put / get / delete / all / range and flush / restore. You ca

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-09 Thread Sachin Mittal
Hi, We recently upgraded to 0.10.2.0-rc0, the rocksdb issue seems to be resolved however we are just not able to get the streams going under our current scenario. The issue seems to be still related to rocksdb. Let me explain the situation: We have 40 partitions and 12 threads distributed across t

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-08 Thread Sachin Mittal
Hi, Please refer to thread Getting CommitFailedException in 0.10.2.0 due to member id is not valid or unknown I am adding more details to it. There are 2 questions 1. why we get CommitFailedException in case of member id unknown error returned by broker. 2. even if we get CommitFailedException wh

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-08 Thread Damian Guy
Hi Sachin, It might be helpful if you send the logs from the streams application and the broker. Thanks, Damian On Thu, 9 Feb 2017 at 02:43, Sachin Mittal wrote: > Hi > I have upgraded to 2.10.2.0 however the streams is still failing via commit > failed exception with cause unknown member id. ro

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-08 Thread Sachin Mittal
Hi I have upgraded to 2.10.2.0 however the streams is still failing via commit failed exception with cause unknown member id. rocksdb locks issue seems to be resolved. Also poll is called within the interval. I have posted more detail of the failure in another thread. Please let us know what coul

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-07 Thread Damian Guy
Hi Sachin, Sorry, I misunderstood what you had said. You are running 3 instances, one per machine? I thought you said you were running 3 instances on each machine. Regarding partitions: you are better off having more partitions as this affects the maximum degree of parallelism you can achieve in t

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-06 Thread Sachin Mittal
Hi, Everything is understood and I will try out 0.10.2.0-rc0 shortly. However one thing is not clear: Firstly i'd recommend you have different state directory configs for each application instance. Well I am running three separate instances of 4 threads each on three different machines. So each mac

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-06 Thread Damian Guy
Hi Sachin, Firstly, I'd recommend you have different state directory configs for each application instance. I suspect you are potentially hitting an issue where the partition assignment has changed, the state directory locks get released, and a directory gets removed just before the lock is taken o

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-06 Thread Sachin Mittal
Hi, Yes on first we have three machines with same data directory setting. So the state dir config is the same for each. If it helps this is the sequence of logs just before the thread shutting down stream-thread [StreamThread-3] Committing all tasks because the commit interval 3ms has el

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-06 Thread Damian Guy
Hi Sachin, The first exception - Is each instance of your streams app on a single machine running with the same state directory config? The second exception - I believe is a bug in 0.10.1 that has been fixed in 0.10.2. There have been a number of issues fixed in this area. Thanks, Damian On Mon,

Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-05 Thread Sachin Mittal
Hello All, We recently upgraded to kafka_2.10-0.10.1.1. We have a source topic with replication = 3 and partition = 40. We have a streams application running with NUM_STREAM_THREADS_CONFIG = 4 on three machines. So 12 threads in total. What we do is start the same streams application one by one