Re: Fixing two critical bugs in kafka streams

2017-03-07 Thread Matthias J. Sax
Thanks for explaining in more detail. Now I understood it. Was on a complete wrong track before that! Great find and thanks for fixing it :) With regard to two unrelated issues. We should have two JIRAs and two PRs. -Matthias On 3/6/17 8:48 PM, Sachin Mittal wrote: >> As for the second issue y

Re: Fixing two critical bugs in kafka streams

2017-03-06 Thread Sachin Mittal
> As for the second issue you brought up, I agree it is indeed a bug; but just to clarify it is the CREATION of the first task including restoring stores that can take longer than MAX_POLL_INTERVAL_MS_CONFIG, not processing it right Yes this is correct. I may have misused the terminology so lets n

Re: Fixing two critical bugs in kafka streams

2017-03-06 Thread Matthias J. Sax
Thanks for your input. I now understood the first issue (and the fix). Still not sure about the second issue. From my understanding, the deadlock is "caused" by your fix of problem one. If the thread would die, the lock would get release and no deadlock would occur. However, because the thread do

Re: Fixing two critical bugs in kafka streams

2017-03-06 Thread Guozhang Wang
Hello Sachin, Thanks for your finds!! Just to add what Damian said regarding 1), in KIP-129 where we are introducing exactly-once processing semantics to Streams we have also described different categories of error handling for exactly-once. Commit exceptions due to rebalance will be handled as "p

Re: Fixing two critical bugs in kafka streams

2017-03-06 Thread Sachin Mittal
Please find the JIRA https://issues.apache.org/jira/browse/KAFKA-4848 On Mon, Mar 6, 2017 at 5:20 PM, Damian Guy wrote: > Hi Sachin, > > If it is a bug then please file a JIRA for it, too. > > Thanks, > Damian > > On Mon, 6 Mar 2017 at 11:23 Sachin Mittal wrote: > > > Ok that's great. > > So y

Re: Fixing two critical bugs in kafka streams

2017-03-06 Thread Damian Guy
Hi Sachin, If it is a bug then please file a JIRA for it, too. Thanks, Damian On Mon, 6 Mar 2017 at 11:23 Sachin Mittal wrote: > Ok that's great. > So you have already fixed that issue. > > I have modified my PR to remove that change (which was done keeping > 0.10.2.0 in mind). > > However the

Re: Fixing two critical bugs in kafka streams

2017-03-06 Thread Sachin Mittal
Ok that's great. So you have already fixed that issue. I have modified my PR to remove that change (which was done keeping 0.10.2.0 in mind). However the other issue is still valid. Please review that change. https://github.com/apache/kafka/pull/2642 Thanks Sachin On Mon, Mar 6, 2017 at 3:56

Re: Fixing two critical bugs in kafka streams

2017-03-06 Thread Damian Guy
On trunk the CommitFailedException isn't thrown anymore. The commitOffsets method doesn't throw an exception. It returns one if it was thrown. We used to throw this exception during suspendTasksAndState, but we don't anymore. On Mon, 6 Mar 2017 at 05:04 Sachin Mittal wrote: > Hi > On CommitFaile

Re: Fixing two critical bugs in kafka streams

2017-03-05 Thread Sachin Mittal
Hi On CommitFailedException at onPartitionsRevoked if it is thrown it gets assigned to rebalanceException. This causes the stream thread to shutdown. I am not sure how we can resume the thread. Note thread is not in invalid state because because it already has been assigned new partitions and this

Re: Fixing two critical bugs in kafka streams

2017-03-05 Thread Matthias J. Sax
Sachin, thanks a lot for contributing! Right now, I am not sure if I understand the change. On CommitFailedException, why can we just resume the thread? To me, it seems that the thread will be in an invalid state and thus it's not save to just swallow the exception and keep going. Can you shed so

Re: Fixing two critical bugs in kafka streams

2017-03-05 Thread Sachin Mittal
Hi, Please find the new PR https://github.com/apache/kafka/pull/2642/ I see that in trunk there has been change which is different from in 10.2.0 10.2.0 if (firstException.get() == null) { firstException.set(commitOffsets()); } vs trunk if (firstException.get()

Re: Fixing two critical bugs in kafka streams

2017-03-05 Thread Eno Thereska
Thanks Sachin, one thing before the review, 0.10.2 is closed now, this needs to target trunk. Thanks Eno > On 5 Mar 2017, at 09:10, Sachin Mittal wrote: > > Please review the PR and let me know if this makes sense. > > https://github.com/apache/kafka/pull/2640 > > Thanks > Sachin > > > On S

Re: Fixing two critical bugs in kafka streams

2017-03-05 Thread Sachin Mittal
Please review the PR and let me know if this makes sense. https://github.com/apache/kafka/pull/2640 Thanks Sachin On Sun, Mar 5, 2017 at 1:49 PM, Eno Thereska wrote: > Thanks Sachin for your contribution. Could you create a pull request out > of the commit (so we can add comments, and also so

Re: Fixing two critical bugs in kafka streams

2017-03-05 Thread Eno Thereska
Thanks Sachin for your contribution. Could you create a pull request out of the commit (so we can add comments, and also so you are acknowledged properly for your contribution)? Thanks Eno > On 5 Mar 2017, at 07:34, Sachin Mittal wrote: > > Hi, > So far in our experiment we have encountered 2

Fixing two critical bugs in kafka streams

2017-03-04 Thread Sachin Mittal
Hi, So far in our experiment we have encountered 2 critical bugs. 1. If a thread takes more that MAX_POLL_INTERVAL_MS_CONFIG to compute a cycle it gets evicted from group and rebalance takes place and it gets new assignment. However when this thread tries to commit offsets for the revoked partition