Re: streaming job reading from kafka stuck while cancelling

2016-03-10 Thread Ufuk Celebi
Hey Maciek! I'm working on the other proposed fix by closing the buffer pool early. I expect the fix to make it into the next bugfix release 1.0.1 (or 1.0.2 if 1.0.1 comes very soon). – Ufuk

Re: streaming job reading from kafka stuck while cancelling

2016-03-09 Thread Stephan Ewen
The reason that the consumer thread is not interrupted (which is the reason why there is a separate consumer thread in the first place) is that Kafka has a bug (or design issue) where thread interrupting may lead to a deadlock in the thread. Interrupting the thread would need to make sure that int

Re: streaming job reading from kafka stuck while cancelling

2016-03-09 Thread Maciek Próchniak
Thanks, that makes sense... Guess I'll try some dirty workaround for now by interrupting the consumer thread if it doesn't finish after some time... maciek On 09/03/2016 14:42, Stephan Ewen wrote: Here is the Jira issue: https://issues.apache.org/jira/browse/FLINK-3595 On Wed, Mar 9, 2016 at
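
For readers following the thread, the "interrupt the consumer thread if it doesn't finish after some time" workaround could look roughly like the sketch below. This is not code from Flink or from the poster: the class name `InterruptAfterTimeout` and the method `joinOrInterrupt` are hypothetical, and the sleeping thread in `main` only stands in for a blocked consumer thread. Note Stephan's caveat in this thread that interrupting a thread inside Kafka code may lead to a deadlock, so this is a stopgap at best.

```java
public class InterruptAfterTimeout {

    /**
     * Waits up to timeoutMillis for the worker to finish on its own.
     * If it is still alive after that, interrupts it and waits once more.
     * Returns true if an interrupt had to be sent.
     * timeoutMillis must be > 0, because Thread.join(0) waits forever.
     */
    public static boolean joinOrInterrupt(Thread worker, long timeoutMillis) {
        boolean interruptSent = false;
        try {
            worker.join(timeoutMillis);          // grace period for a clean shutdown
            if (worker.isAlive()) {
                worker.interrupt();              // the "dirty" part of the workaround
                interruptSent = true;
                worker.join(timeoutMillis);      // give it time to react to the interrupt
            }
        } catch (InterruptedException e) {
            // The calling thread was interrupted while waiting; restore its flag.
            Thread.currentThread().interrupt();
        }
        return interruptSent;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a consumer thread stuck in a blocking call.
        Thread consumer = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // Exit when interrupted.
            }
        }, "hypothetical-consumer-thread");
        consumer.start();

        boolean interruptSent = joinOrInterrupt(consumer, 500);
        System.out.println("interruptSent=" + interruptSent + ", alive=" + consumer.isAlive());
    }
}
```

The grace period comes first so that a consumer thread that shuts down cleanly is never interrupted; the interrupt is only a fallback for the stuck case.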

Re: streaming job reading from kafka stuck while cancelling

2016-03-09 Thread Stephan Ewen
Here is the Jira issue: https://issues.apache.org/jira/browse/FLINK-3595 On Wed, Mar 9, 2016 at 2:06 PM, Stephan Ewen wrote: > Hi! > > Thanks for debugging this, I think there is in fact an issue in the > 0.9 consumer. > > I'll open a ticket for it, will try to fix that as soon as possible..

Re: streaming job reading from kafka stuck while cancelling

2016-03-09 Thread Stephan Ewen
Hi! Thanks for debugging this, I think there is in fact an issue in the 0.9 consumer. I'll open a ticket for it, will try to fix that as soon as possible... Stephan On Wed, Mar 9, 2016 at 1:59 PM, Maciek Próchniak wrote: > Hi, > > from time to time when we cancel streaming jobs (or they

streaming job reading from kafka stuck while cancelling

2016-03-09 Thread Maciek Próchniak
Hi, from time to time when we cancel streaming jobs (or they are failing for some reason) we encounter: 2016-03-09 10:25:29,799 [Canceler for Source: read objects from topic: (...) ' did not react to cancelling signal, but is stuck in method: java.lang.Object.wait(Native Method) java.lang.T