Re: Question about Scheduling of Batch Jobs

2017-01-04 Thread Konstantin Knauf
Hi Fabian, I see, thanks for the quick explanation. Cheers, Konstantin On 04.01.2017 14:15, Fabian Hueske wrote: > Hi Konstantin, > > the DataSet API tries to execute all operators as soon as possible. > > I assume that in your case, Flink does not do this because it tries to > avoid a dead

Re: Question about Scheduling of Batch Jobs

2017-01-04 Thread Fabian Hueske
Hi Konstantin, the DataSet API tries to execute all operators as soon as possible. I assume that in your case, Flink does not do this because it tries to avoid a deadlock. A dataflow which replicates data from the same source and joins it again might get deadlocked because all pipelines need to m

Question about Scheduling of Batch Jobs

2017-01-04 Thread Konstantin Knauf
Hi everyone, I have a basic question regarding scheduling of batch programs. Let's take the following graph:

         -> Group Combine -> ...
        /
Source --> Group Combine -> ...
        \
         -> Map -> ...

So, a source followed by three operators with ship strategy "Forward" a