Re: Kafka protocol fetch request max wait.

2016-02-05 Thread Rajiv Kurian
Thanks Jason. On Fri, Feb 5, 2016 at 10:13 AM, Jason Gustafson wrote: > Hey Rajiv, > > Thanks for all the updates. I think I've been able to reproduce this. The > key seems to be waiting for the old log segment to be deleted. I'll > investigate a bit more and report what I find on the JIRA. > >

Re: Kafka protocol fetch request max wait.

2016-02-05 Thread Jason Gustafson
Hey Rajiv, Thanks for all the updates. I think I've been able to reproduce this. The key seems to be waiting for the old log segment to be deleted. I'll investigate a bit more and report what I find on the JIRA. -Jason On Fri, Feb 5, 2016 at 9:50 AM, Rajiv Kurian wrote: > I've updated Kafka-31

Re: Kafka protocol fetch request max wait.

2016-02-05 Thread Ismael Juma
Thanks for getting to the bottom of this, Rajiv. Ismael On Fri, Feb 5, 2016 at 5:50 PM, Rajiv Kurian wrote: > I've updated KAFKA-3159 with my findings. > > Thanks, > Rajiv > > On Thu, Feb 4, 2016 at 10:25 PM, Rajiv Kurian wrote: > > > I think I found out when the problem happens. When a broker

Re: Kafka protocol fetch request max wait.

2016-02-05 Thread Rajiv Kurian
I've updated KAFKA-3159 with my findings. Thanks, Rajiv On Thu, Feb 4, 2016 at 10:25 PM, Rajiv Kurian wrote: > I think I found out when the problem happens. When a broker that is sent a > fetch request has no messages for any of the partitions it is being asked > messages for, it returns immedi

Re: Kafka protocol fetch request max wait.

2016-02-04 Thread Rajiv Kurian
I think I found out when the problem happens. When a broker that receives a fetch request has no messages for any of the partitions named in the request, it returns immediately instead of waiting out the poll period. Both the Kafka 0.9 consumer and my own handwritten consumer suffer the s
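
The behavior described above (an empty fetch response arriving well before the requested max wait has elapsed) makes a client that immediately re-issues the fetch spin in a tight loop. A minimal sketch of a client-side guard follows. It is an illustration only, not the fix tracked in KAFKA-3159 and not part of any Kafka client; the function name `remaining_wait_s` and its parameters are hypothetical.

```python
import time


def remaining_wait_s(started_at, max_wait_ms, response_was_empty, now=None):
    """Return how many seconds a client could sleep before re-fetching.

    Hypothetical helper: if the broker answered with no messages before
    max_wait_ms had elapsed, return the unused part of the wait so the
    caller can sleep instead of busy-looping. Otherwise return 0.0.
    """
    if not response_was_empty:
        # Data arrived, so the caller should fetch again right away.
        return 0.0
    now = time.monotonic() if now is None else now
    elapsed_s = now - started_at
    return max(0.0, max_wait_ms / 1000.0 - elapsed_s)


# Example: a 500 ms max wait, and an empty response after 100 ms.
print(remaining_wait_s(started_at=0.0, max_wait_ms=500,
                       response_was_empty=True, now=0.1))
```

A caller would record `time.monotonic()` before sending the fetch and pass it as `started_at`; the `now` argument exists only to make the helper easy to test.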

Re: Kafka protocol fetch request max wait.

2016-02-04 Thread Rajiv Kurian
I actually restarted my application with the consumer config I mentioned at https://issues.apache.org/jira/browse/KAFKA-3159 and I can't get it to use high CPU any more :( Not quite sure about how to proceed. I'll try to shut down all producers and let the logs age out to see if the problem happens

Re: Kafka protocol fetch request max wait.

2016-02-04 Thread Rajiv Kurian
Hey Jason, Yes, I checked for error codes. There were none. The message was perfectly legal as parsed by my handwritten parser. I also verified the size of the response, which was exactly the size of a response with an empty message set per partition. The topic has 128 partitions and has a retenti
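
The size check mentioned above can be reproduced with simple arithmetic. The sketch below assumes the v0/v1 fetch response layout (a correlation id, a throttle time in v1 only, then per topic a name and per partition an id, error code, high watermark, and message set size) and a request that covers a single topic. The function name and the topic name "t" are hypothetical, and a real broker would normally answer only for the partitions it leads, not all 128.

```python
def empty_fetch_response_size(topic, num_partitions, version=1):
    """Bytes in a fetch response whose message sets are all empty.

    Assumes one topic and the v0/v1 layout; excludes the 4-byte
    length prefix that precedes every response on the wire.
    """
    size = 4                                # correlation_id (int32)
    if version >= 1:
        size += 4                           # throttle_time_ms (int32), v1 only
    size += 4                               # topic array count (int32)
    size += 2 + len(topic.encode("utf-8"))  # topic name (int16 length + bytes)
    size += 4                               # partition array count (int32)
    # Per partition: id (int32) + error_code (int16)
    # + high_watermark (int64) + message_set_size (int32, zero here).
    size += num_partitions * (4 + 2 + 8 + 4)
    return size


print(empty_fetch_response_size("t", 128))  # 2323 bytes for v1
```

Any response larger than this figure for the same topic and partition count must carry at least some message data.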

Re: Kafka protocol fetch request max wait.

2016-02-04 Thread Jason Gustafson
Hey Rajiv, Just to be clear, when you received the empty fetch response, did you check the error codes? It would help to also include some more information (such as broker and topic settings). If you can come up with a way to reproduce it, that will help immensely. Also, would you mind updating K

Re: Kafka protocol fetch request max wait.

2016-02-04 Thread Rajiv Kurian
Indeed, this seems to be the case. I am now running the client mentioned in https://issues.apache.org/jira/browse/KAFKA-3159 and it is no longer taking up high CPU. The large number of EOF exceptions is also gone. It is performing very well now. I can't tell whether the improvement is because of m

Re: Kafka protocol fetch request max wait.

2016-02-04 Thread Rajiv Kurian
And just like that it stopped happening, even though I didn't change any of my code. I had filed https://issues.apache.org/jira/browse/KAFKA-3159 where the stock Kafka 0.9 consumer was using very high CPU and seeing a lot of EOFExceptions on the same topic and partition. I wonder if it was hitting t