Re: Frequent exceptions killing streaming job

2016-02-26 Thread Nick Dimiduk
Sorry I wasn't clear. No, the lock contention is not in Flink.

Re: Frequent exceptions killing streaming job

2016-02-26 Thread Stephan Ewen
Was the contended lock part of Flink's runtime, or the application code? If it was part of the Flink Runtime, can you share what you found?

Re: Frequent exceptions killing streaming job

2016-02-25 Thread Nick Dimiduk
For what it's worth, I dug into the TM logs and found that this exception was not the root cause, merely a symptom of other backpressure building in the flow (actually, lock contention in another part of the stack). While Flink was helpful in finding and bubbling up this stack to the UI, it was ultimately …

Re: Frequent exceptions killing streaming job

2016-01-20 Thread Robert Metzger
Hey Nick, I had a discussion with Stephan Ewen on how we could resolve the issue. I filed a JIRA with our suggested approach: https://issues.apache.org/jira/browse/FLINK-3264
By handling this directly in the KafkaConsumer, we would avoid fetching data we cannot handle anyway (discarding in the …
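
To make that concrete: a minimal sketch of a job wired for fallback-to-latest with the 0.8 connector, assuming auto.offset.reset is honored for out-of-range fetches (roughly the behavior the JIRA proposes). Topic, group id, and addresses below are placeholders.

import java.util.Properties;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class OffsetResetSketch {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    Properties props = new Properties();
    props.setProperty("zookeeper.connect", "localhost:2181"); // required by the 0.8 consumer
    props.setProperty("bootstrap.servers", "localhost:9092");
    props.setProperty("group.id", "my-consumer-group");       // placeholder
    // 0.8-era values are "smallest"/"largest" (newer clients use
    // "earliest"/"latest"): when the requested offset has already been
    // deleted by retention, jump to the newest available record instead
    // of failing the fetch.
    props.setProperty("auto.offset.reset", "largest");

    DataStream<String> stream = env.addSource(
        new FlinkKafkaConsumer08<>("my-topic", new SimpleStringSchema(), props));
    stream.print();
    env.execute("offset-reset-sketch");
  }
}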

Re: Frequent exceptions killing streaming job

2016-01-17 Thread Nick Dimiduk
On Sunday, January 17, 2016, Stephan Ewen wrote:
> I agree, real time streams should never go down.
Glad to hear that :)
[snip]
> Both should be supported.
Agreed.
> Since we interpret streaming very broadly (also including analysis of historic streams or timely data), the "backpressure …

Re: Frequent exceptions killing streaming job

2016-01-17 Thread Stephan Ewen
Hi Nick! I agree, real time streams should never go down. Whether you want to allow the stream processor to temporarily fall behind (back pressure on an event spike) and catch up a bit later, or whether you want to be always at the edge of real time and drop messages, is use-case specific. Both should be supported. Since we interpret streaming very broadly (also including analysis of historic streams or timely data), the "backpressure …

Re: Frequent exceptions killing streaming job

2016-01-16 Thread Nick Dimiduk
This goes back to the idea that streaming applications should never go down. I'd much rather consume at max capacity and knowingly drop some portion of the incoming pipe than have the streaming job crash. Of course, once the job itself is robust, I still need the runtime to be robust -- YARN vs …
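
One crude way to express that drop-rather-than-crash policy inside the job itself is a rate-capped filter placed ahead of the expensive operators. A sketch only: the class name and per-second budget are assumptions, and the cap applies per parallel subtask.

import org.apache.flink.api.common.functions.FilterFunction;

// Deliberate load shedding: keep at most maxPerSecond records per second
// (per parallel subtask) and knowingly drop the rest.
public class LoadSheddingFilter implements FilterFunction<String> {
  private final long maxPerSecond;
  private long windowStart;
  private long count;

  public LoadSheddingFilter(long maxPerSecond) {
    this.maxPerSecond = maxPerSecond;
  }

  @Override
  public boolean filter(String record) {
    long now = System.currentTimeMillis();
    if (now - windowStart >= 1000L) { // start a new one-second budget window
      windowStart = now;
      count = 0L;
    }
    return ++count <= maxPerSecond;   // beyond the budget, drop the record
  }
}

// Usage: stream.filter(new LoadSheddingFilter(10_000))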

Re: Frequent exceptions killing streaming job

2016-01-16 Thread Stephan Ewen
@Robert: Is it possible to add a "fallback" strategy to the consumer? Something like "if offsets cannot be found, use latest"? I would make this an optional feature to activate. I would think it is quite surprising to users if records start being skipped in certain situations. But I can see that …
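
To illustrate the shape such an optional fallback could take, a hypothetical sketch; the enum, method, and class names are invented for this example and are not Flink API. The default fails loudly, so records are never skipped unless the policy is explicitly chosen.

// Hypothetical sketch; none of these names exist in Flink.
public class OffsetFallbackSketch {

  enum OffsetFallback { FAIL, USE_EARLIEST, USE_LATEST }

  static long resolveStartOffset(Long committed, long earliest, long latest,
                                 OffsetFallback fallback) {
    // A committed offset is usable only if the broker still retains it.
    if (committed != null && committed >= earliest && committed <= latest) {
      return committed;
    }
    switch (fallback) {
      case USE_LATEST:   return latest;   // stay near real time, skip the gap
      case USE_EARLIEST: return earliest; // replay whatever is still retained
      default:
        throw new IllegalStateException(
            "Committed offset " + committed + " outside retained range ["
                + earliest + ", " + latest + "]");
    }
  }

  public static void main(String[] args) {
    // Retention deleted everything below offset 5000; consumer committed 1200.
    System.out.println(
        resolveStartOffset(1200L, 5000L, 9000L, OffsetFallback.USE_LATEST)); // 9000
  }
}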

Re: Frequent exceptions killing streaming job

2016-01-16 Thread Robert Metzger
Hi Nick, I'm sorry you ran into the issue. Is it possible that Flink's Kafka consumer falls so far behind in the topic that the offsets it's requesting are invalid? For that, Kafka's retention time has to be pretty short. Skipping records under load is something currently not supported by Flink …
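
A back-of-the-envelope sketch of that scenario; every number below is an assumption for illustration.

public class RetentionMath {
  public static void main(String[] args) {
    long recordsPerSec = 50_000;       // assumed ingest rate of the topic
    long retentionSec = 2 * 60 * 60;   // assumed 2h log retention
    long consumerLagSec = 3 * 60 * 60; // consumer fell 3h behind under load

    // Records deleted by retention ahead of the consumer's position:
    long gap = recordsPerSec * (consumerLagSec - retentionSec);
    System.out.println("Next requested offset trails the earliest retained one by "
        + gap + " records -> OffsetOutOfRange");
  }
}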