Flink 1.1.3 | Shutting down YarnClusterClient from the client shutdown hook | happening frequently

2016-11-09 Thread Anchit Jatana
Hi All, I'm running my Flink application on YARN. It is frequently getting suspended, though gracefully. Below is a snippet of the error; I'm attaching the full jobmanager log to help debug. Please help me identify the cause and resolve the issue. Thank you. Regards, Anchit Error snippet: 2016-11-09 0

Re: Why did the Flink Cluster JM crash?

2016-11-09 Thread amir bahmanyari
Thanks Till. I have been trying out many configuration combinations to find the peak of what I can get as reasonable performance. And yes, when I drop the number of slots, I don't get OOM. However, I don't get the response I want either. The amount of data I send is kinda huge; about 105 G t

Re: Window PURGE Behaviour 1.0.2 vs 1.1.3

2016-11-09 Thread Konstantin Knauf
Hi Aljoscha, as it turns out, the "workaround" I was thinking of was functionally working, but had, so to speak, a memory leak. I was under the impression that evicted elements would be removed from the window state... Anyway, I think that this (triggers not being evaluated when the window state is null)

Processing streams of events with unpredictable delays

2016-11-09 Thread PedroMrChaves
Hello, I have a stream source of events. Each event is assigned a timestamp by the machine that generated the event and then those events are retrieved by other machines (collectors). Finally those collectors will send the events to Flink. In Flink, when I receive those events I extract their time

Re: An idea for a parallel AllWindowedStream

2016-11-09 Thread Aljoscha Krettek
Hi, yes, this works well in cases and I was also thinking about adding something like this to Flink. There can be problems if you use a trigger other than EventTimeTrigger that possibly fires multiple times or if you specify an allowed lateness. In those cases, you would overcount elements in the

[DISCUSS] Changing Window Cleanup Semantics

2016-11-09 Thread Aljoscha Krettek
Hi, I recently created https://issues.apache.org/jira/browse/FLINK-4994 to address what I think is a flaw in the window cleanup semantics. This has the possibility of affecting people so I'd like to get some opinions and also give people a heads-up. Before going into what I'm proposing in the issu

Re: Re: Window PURGE Behaviour 1.0.2 vs 1.1.3

2016-11-09 Thread Aljoscha Krettek
Could you go into some detail of why you need to keep the trigger state? Just the basics because you probably cannot (should not) talk about your internal stuff. On Wed, 9 Nov 2016 at 13:16 Konstantin Knauf wrote: > Sounds good Aljoscha. > > sent from my phone. Plz excuse brevity and tpyos. > -

Re: Last event of each window belongs to the next window - Wrong

2016-11-09 Thread Aljoscha Krettek
Hi Samir, can events with the same user ID originate from different sources? If yes, then doing things based on changes in the user ID is problematic because there are no ordering guarantees. Cheers, Aljoscha On Tue, 8 Nov 2016 at 19:59 Samir Abdou wrote: > Hi Aljoscha, > > Thanks for the qu

AW: Re: Window PURGE Behaviour 1.0.2 vs 1.1.3

2016-11-09 Thread Konstantin Knauf
Sounds good Aljoscha. sent from my phone. Plz excuse brevity and tpyos. --- Konstantin Knauf *konstantin.kn...@tngtech.com * +49-174-3413182 TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke Aljoscha Krettek s

Re: Window PURGE Behaviour 1.0.2 vs 1.1.3

2016-11-09 Thread Aljoscha Krettek
Hi, exactly for this case I want to make a change to when Trigger.clear() is called: https://issues.apache.org/jira/browse/FLINK-4994 Right now, clear is called when the window is being garbage collected because we passed the allowed lateness (after this, nothing will ever be added to a window aga

Re: Why did the Flink Cluster JM crash?

2016-11-09 Thread Till Rohrmann
Hi Amir, I fear that 900 slots per task manager is a bit too many unless your machine has 900 cores. As a rule of thumb you should allocate as many slots as your machines have cores. Maybe you could try to decrease the number of slots and see if you still observe an OOM error. Cheers, Till On We