Re: JobManager shows TaskManager was lost/killed while TaskManger Process is still running and the network is OK.

2017-11-09 Thread Rahul Raj
HI All, Even I am facing the same issue. My code fails after running for 15 hours throwing same "Task Manager lost/killed exception". Can we please know the possible solution in detail for this? Rahul Raj On 15 September 2017 at 23:06, AndreaKinn wrote: > Hi, sorry for re-vive this old convers

Testing / Configuring event windows with Table API and SQL

2017-11-09 Thread Colin Williams
Hello, I've been given some flink application code and asked to implement and ensure that our query is updated for late arriving entries. We're currently creating a table using a Tumbling SQL query similar to the first example in https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/ta

Re: Docker-Flink Project: TaskManagers can't talk to JobManager if they are on different nodes

2017-11-09 Thread Vergilio, Thalita
Hi Till, I have made some progress with the name resolution for machines that are not in the same subnet. The problem I am facing now is Flink-specific, so I wonder if you could help me. It is all running fine in a multi-cloud setup with the jobmanager in Azure and the taskmanager in the Goo

Re: Generate watermarks per key in a KeyedStream

2017-11-09 Thread Derek VerLee
We are contending with the same issue, as it happens.  We have dozens, and potentially down the line, may need to deal with thousands of different "time systems" as you put it, and may not be know at compile time or job start time.  In a practical sense, how could

Re: Do timestamps and watermarks exist after window evaluation?

2017-11-09 Thread Derek VerLee
This new documentation seems to answer my question directly.  It's good to know my intuitions where not wildly off.  Also thank you for continuing to improve the already good documentation. Funny enough, some of the other questions I have, where also asked by other

Re: Correlation between data streams/operators and threads

2017-11-09 Thread Shailesh Jain
On 1. - is it tied specifically to the number of source operators or to the number of Datastream objects created. I mean does the answer change if I read all the data from a single Kafka topic, get a Datastream of all events, and the apply N filters to create N individual streams? On 3. - the prob

Re: Unsubscribe

2017-11-09 Thread Gary Yao
Hi Paolo, If you haven't done so already, you need to write to user-unsubscr...@flink.apache.org to unsubscribe. Best, Gary On Thu, Nov 9, 2017 at 5:28 PM, Paolo Cristofanelli < cristofanelli.pa...@gmail.com> wrote: > Hi, > I would like to unsubscribe from the mailing list. > > Best, > P

Re: Flink memory leak

2017-11-09 Thread Piotr Nowojski
Hi, Could you attach full logs from those task managers? At first glance I don’t see a connection between those exceptions and any memory issue that you might had. It looks like a dependency issue in one (some? All?) of your jobs. Did you build your jars with -Pbuild-jar profile as described he

Streaming : a way to "key by partition id" without redispatching data

2017-11-09 Thread Gwenhael Pasquiers
Hello, (Flink 1.2.1) For performances reasons I'm trying to reduce the volume of data of my stream as soon as possible by windowing/folding it for 15 minutes before continuing to the rest of the chain that contains keyBys and windows that will transfer data everywhere. Because of the huge vol

Re: Flink memory leak

2017-11-09 Thread ÇETİNKAYA EBRU ÇETİNKAYA EBRU
On 2017-11-08 18:30, Piotr Nowojski wrote: Btw, Ebru: I don’t agree that the main suspect is NetworkBufferPool. On your screenshots it’s memory consumption was reasonable and stable: 596MB -> 602MB -> 597MB. PoolThreadCache memory usage ~120MB is also reasonable. Do you experience any problems

Unsubscribe

2017-11-09 Thread Paolo Cristofanelli
Hi, I would like to unsubscribe from the mailing list. Best, Paolo

Re: How to best create a bounded session window ?

2017-11-09 Thread Piotr Nowojski
Indeed you are unfortunately right. Triggers do not define/control lifecycle of the window, so it could happen that each new event is constantly pushing the leading boundary of the window, while your custom trigger is constantly triggering and purging this single EVENT (because exceeded max wind

Re: Correlation between data streams/operators and threads

2017-11-09 Thread Piotr Nowojski
Hi, 1. https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/parallel.html Number of threads executing would be roughly speaking equal to of the number of input data streams multiplied by the parallelism.

Re: Weird performance on custom Hashjoin w.r.t. parallelism

2017-11-09 Thread Piotr Nowojski
Hi, Yes as you correctly analysed parallelism 1 was causing problems, because it meant that all of the records must been gathered over the network from all of the task managers. Keep in mind that even if you increase parallelism to “p”, every change in parallelism can slow down your application

Re: Do timestamps and watermarks exist after window evaluation?

2017-11-09 Thread Aljoscha Krettek
Hi, This new section in the windowing documentation will help answer your question: https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/windows.html#working-with-window-results

Re: How to best create a bounded session window ?

2017-11-09 Thread Vishal Santoshi
Thanks you for the response. I would not mind the second scenario as in a second window, which your illustration suggests with a custom trigger approach, I am not certain though that triggers define the lifecycle of a window, as in a trigger firing does not necessarily imply a Garbage Co

Re: Serialization in Operator Chaining

2017-11-09 Thread Aljoscha Krettek
Hi, If you use the DataSet API, there will be no serialisation between operations in a chain. If you use the DataStream API, there will be serialisation by default but you can disable that using executionEnv.getConfig().enableObjectReuse(). Hope that helps, Aljoscha > On 9. Nov 2017, at 13:57

Re: When using Flink for CEP, can the data in Cassandra database be used for state

2017-11-09 Thread Kostas Kloudas
Hi Shyla, Happy to hear that you are experimenting with CEP! For enriching your input stream with data from Cassandra (or whichever external storage system) you could use: * either the AsyncIO functionality offered by Flink (https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream

Re: How to best create a bounded session window ?

2017-11-09 Thread Piotr Nowojski
It might be more complicated if you want to take into account events coming in out of order. For example you limit length of window to 5 and you get the following events: 1 2 3 4 6 7 8 5 Do you want to emit windows: [1 2 3 4 5] (length limit exceeded) + [6 7 8] ? Or are you fine with interlea

Serialization in Operator Chaining

2017-11-09 Thread Hicken, Jan
Hi folks, I have a question regarding the serialization in Flink's operator chaining: Consider these two map functions: Map1 and Map2 As I haven't disabled operator chaining in the environment, these two functions will be chained into one operator when executing my job. The thing is, that the s

Re: Generate watermarks per key in a KeyedStream

2017-11-09 Thread Shailesh Jain
Thanks for your reply, Xingcan. On Wed, Nov 8, 2017 at 10:42 PM, Xingcan Cui wrote: > Hi Shailesh, > > actually, the watermarks are generated per partition, but all of them will > be forcibly aligned to the minimum one during processing. That is decided > by the semantics of watermark and KeyedS

Re: Broadcast to all the other operators

2017-11-09 Thread Ladhari Sadok
Ok thanks Tony, your answer is very helpful. 2017-11-09 11:09 GMT+01:00 Tony Wei : > Hi Sadok, > > The sample code is just an example to show you how to broadcast the rules > to all subtasks, but the output from CoFlatMap is not necessary to be > Tuple2. It depends on what you actually need in yo

Re: Broadcast to all the other operators

2017-11-09 Thread Tony Wei
Hi Sadok, The sample code is just an example to show you how to broadcast the rules to all subtasks, but the output from CoFlatMap is not necessary to be Tuple2. It depends on what you actually need in your Rule Engine project. For example, if you can apply rule on each record directly, you can em

Re: Job Manager Configuration

2017-11-09 Thread Till Rohrmann
That is the question I hope to be able to answer with the logs. Let's see what they say. Cheers, Till On Wed, Nov 8, 2017 at 7:24 PM, Chan, Regina wrote: > Thanks for the responses! > > > > I’m currently using 1.2.0 – going to bump it up once I have things > stabilized. I haven’t defined any sl

Fwd: Broadcast to all the other operators

2017-11-09 Thread Ladhari Sadok
-- Forwarded message -- From: Ladhari Sadok Date: 2017-11-09 10:26 GMT+01:00 Subject: Re: Broadcast to all the other operators To: Tony Wei Thanks Tony for your very fast answer , Yes it resolves my problem that way, but with flatMap I will get Tuple2 always in the processing f

Correlation between data streams/operators and threads

2017-11-09 Thread Shailesh Jain
Hi, I'm trying to understand the runtime aspect of Flink when dealing with multiple data streams and multiple operators per data stream. Use case: N data streams in a single flink job (each data stream representing 1 device - with different time latencies), and each of these data streams gets spl

Re: Broadcast to all the other operators

2017-11-09 Thread Tony Wei
Hi Sadok, What I mean is to keep the rules in the operator state. The event in Rule Stream is just the change log about rules. For more specific, you can fetch the rules from Redis in the open step of CoFlatMap and keep them in the operator state, then use Rule Stream to notify the CoFlatMap to 1.