Re: Collision of task number values for the same task

Alexander Alexandrov Tue, 31 May 2016 06:26:49 -0700

> (c) You have two operators with the same name that become tasks with the
> same name.


Actually it was a variation on that issue.
The problem was that I was reading a dataset X which was part of both the
dynamic and the static path of a Flink iteration. I guess these paths get
duplicated, so I had two DataSource tasks with the same location.
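
For anyone hitting the same symptom, here is a minimal sketch of how such
collisions could be spotted in the logs. It groups state-transition messages
by (task name, subtask index, parallelism) and reports every group that
carries more than one execution attempt ID. The line format, the regular
expression, and the names LINE_RE and find_collisions are assumptions made
for illustration, not the actual Flink log layout; adjust the pattern to
whatever your jobmanager log really prints.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) shape of a state-transition message:
#   "<task name> (<subtask>/<parallelism>) (<attempt id>) switched from <OLD> to <NEW>"
LINE_RE = re.compile(
    r"(?P<name>.+) \((?P<index>\d+)/(?P<total>\d+)\) "
    r"\((?P<attempt>[0-9a-f]+)\) switched from (?P<old>\w+) to (?P<new>\w+)"
)


def find_collisions(lines):
    """Return {(name, index, parallelism): attempt IDs} for keys seen with more than one attempt ID."""
    attempts = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a state-transition message in the assumed format
        key = (match["name"], int(match["index"]), int(match["total"]))
        attempts[key].add(match["attempt"])
    return {key: ids for key, ids in attempts.items() if len(ids) > 1}


# Made-up sample messages: subtask 1/4 appears under two different attempt IDs.
sample = [
    "DataSource (X) (1/4) (aaa111) switched from SCHEDULED to DEPLOYING",
    "DataSource (X) (1/4) (bbb222) switched from SCHEDULED to DEPLOYING",
    "DataSource (X) (2/4) (ccc333) switched from SCHEDULED to DEPLOYING",
    "DataSource (X) (1/4) (aaa111) switched from DEPLOYING to RUNNING",
]

for key, ids in find_collisions(sample).items():
    print(key, sorted(ids))
```

Note that a report from this sketch does not tell the causes apart: a
restarted task, a re-executed part of the graph, and two operators with the
same name would all show up as one (x/y) key with several attempt IDs.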

Problem solved.

Regards,
A.

2016-05-31 13:54 GMT+02:00 Stephan Ewen <se...@apache.org>:

> It could be that
>
> (a) The task failed and was restarted.
>
> (b) The program has multiple steps (collect(), print()), so that parts of
> the graph get re-executed.
>
> (c) You have two operators with the same name that become tasks with the
> same name.
>
> Do any of those explanations make sense in your setting?
>
> Stephan
>
>
> On Tue, May 31, 2016 at 12:48 PM, Alexander Alexandrov <
> alexander.s.alexand...@gmail.com> wrote:
>
> > Sure, you can find them attached here (both jobmanager and taskmanager,
> > the problem was observed in the jobmanager logs).
> >
> > If needed I can also share the binary to reproduce the issue.
> >
> > I think the problem is related to the fact that the input splits are
> > lazily assigned to the task slots, and it seems that in case of 8 splits
> > for 4 slots, we get each (x/y) combination twice.
> >
> > Moreover, I am currently analyzing the structure of the log files, and it
> > seems that the task ID is not reported consistently across the different
> > messages [1,2,3]. This makes the implementation of an ETL job that
> > extracts the statistics from the log and feeds them into a database
> > quite hard.
> >
> > Would it be possible to push a fix which adds the task ID consistently
> > across all messages in the 1.0.x line? If yes, I will open a JIRA and
> > work on that this week.
> > I would like to get feedback from other people who are parsing jobmanager
> > / taskmanager logs on that in order to avoid possible backwards
> > compatibility issues with job analysis tools on the release line.
> >
> > [1]
> > https://github.com/apache/flink/blob/da23ee38e5b36ddf26a6a5a807efbbbcbfe1d517/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java#L370-L371
> > [2]
> > https://github.com/apache/flink/blob/da23ee38e5b36ddf26a6a5a807efbbbcbfe1d517/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java#L991-L992
> >
> > Regards,
> > A.
> >
> >
> > 2016-05-31 12:01 GMT+02:00 Ufuk Celebi <u...@apache.org>:
> >
> >> On Tue, May 31, 2016 at 11:53 AM, Alexander Alexandrov
> >> <alexander.s.alexand...@gmail.com> wrote:
> >> > Can somebody shed light on the execution semantics of the scheduler
> >> > which will explain this behavior?
> >>
> >> The execution IDs are unique per execution attempt. Having two tasks
> >> with the same subtask index running at the same time is unexpected.
> >>
> >> Can you share the complete logs, please?
> >>
> >> – Ufuk
> >>
> >
> >
>
