> (c) You have two operators with the same name that become tasks with the same name.
Actually it was a variation on that issue. The problem was that I was reading a dataset X which was part of both the dynamic and the static path of a Flink iteration. I guess the duplicates duplicates these paths, so I had two DataSource tasks with the same location. Problem solved. Regards, A. 2016-05-31 13:54 GMT+02:00 Stephan Ewen <se...@apache.org>: > It could be that > > (a) The task failed and was restarted. > > (b) The program has multiple steps (collect() print()), so that parts of > the graph get re-executed. > > (c) You have two operators with the same name that become tasks with the > same name. > > Do any of those explanations make sense in your setting? > > Stephan > > > On Tue, May 31, 2016 at 12:48 PM, Alexander Alexandrov < > alexander.s.alexand...@gmail.com> wrote: > > > Sure, you can find them attached here (both jobmanager and taskmanager, > > the problem was observed in the jobmanager logs). > > > > If needed I can also share the binary to reproduce the issue. > > > > I think the problem is related to the fact that the input splits are > > lazily assigned to the task slots, and it seems that in case of 8 splits > > for 4 slots, we get each (x/y) combination twice. > > > > Moreover, I am currently analyzing the structure of the log files, and it > > seems that the task ID is not reported consistently across the different > > messages [1,2,3]. This makes the implementation of an ETL job that > extracts > > the statistics from the log and feed them into a database quite hard. > > > > Would it be possible to push a fix which adds the task ID consistently > > across all messages in the 1.0.x line? If yes, I will open a JIRA and > work > > on that this week. > > I would like to get feedback from other people who are parsing jobmanager > > / taskamanager logs on that in order to avoid possible backwards > > compatibility with job analysis tools on the release line. > > > > [1] > > > https://github.com/apache/flink/blob/da23ee38e5b36ddf26a6a5a807efbbbcbfe1d517/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java#L370-L371 > > [2] > > > https://github.com/apache/flink/blob/da23ee38e5b36ddf26a6a5a807efbbbcbfe1d517/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java#L991-L992 > > > > Regards, > > A. > > > > > > 2016-05-31 12:01 GMT+02:00 Ufuk Celebi <u...@apache.org>: > > > >> On Tue, May 31, 2016 at 11:53 AM, Alexander Alexandrov > >> <alexander.s.alexand...@gmail.com> wrote: > >> > Can somebody shed a light on the execution semantics of the scheduler > >> which > >> > will explain this behavior? > >> > >> The execution IDs are unique per execution attempt. Having two tasks > >> with the same subtask index running at the same time is unexpected. > >> > >> Can you share the complete logs, please? > >> > >> – Ufuk > >> > > > > >