There is no shuffle operation in my flow. Mine actually looks like this:

Source: Custom Source -> Flat Map -> (Filter -> Flat Map -> Map -> Map -> Map, Filter)
Maybe it’s treating this whole flow as one pipeline and assigning it to a
slot. What I really wanted was for the custom source I built to have
running instances on all nodes. I’m not really sure if that’s the right
approach, but if we could add this as a feature that’d be great, since
having more than one node running the same pipeline guarantees the
pipeline is never offline.

-Ali

On 2015-12-02, 4:39 AM, "Till Rohrmann" <trohrm...@apache.org> wrote:

>If I'm not mistaken, the scheduler already has a preference to spread
>independent pipelines out across the cluster. At least it uses a queue of
>instances from which it pops the first element when it allocates a new
>slot. This instance is then appended to the queue again if it still has
>some resources (slots) left.
>
>I would assume that you have a shuffle operation involved in your job,
>such that it makes sense for the scheduler to deploy all pipelines to the
>same machine.
>
>Cheers,
>Till
>
>On Dec 1, 2015 4:01 PM, "Stephan Ewen" <se...@apache.org> wrote:
>
>> Slots are like "resource groups" which execute entire pipelines. They
>> frequently have more than one operator.
>>
>> What you can try as a workaround is to decrease the number of slots per
>> machine, to cause the operators to be spread across more machines.
>>
>> If this is a crucial issue for your use case, it should be simple to
>> add a "preference to spread out" to the scheduler...
>>
>> On Tue, Dec 1, 2015 at 3:26 PM, Kashmar, Ali <ali.kash...@emc.com>
>> wrote:
>>
>> > Is there a way to make a task cluster-parallelizable? I.e. make sure
>> > the parallel instances of the task are distributed across the
>> > cluster. When I run my Flink job with a parallelism of 16, all the
>> > parallel tasks are assigned to the first task manager.
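[Editor's note: Till's description of the scheduler — pop the first instance off a queue of task managers, take a slot from it, and re-append the instance while it still has free slots — can be sketched as below. This is an illustrative simulation only; `TaskManagerStub` and `allocate` are made-up names, not Flink API.]

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of the queue-based slot allocation described in the
// thread. Not Flink code: names and structure are invented for illustration.
public class SlotQueueSketch {

    static final class TaskManagerStub {
        final String name;
        int freeSlots;
        TaskManagerStub(String name, int freeSlots) {
            this.name = name;
            this.freeSlots = freeSlots;
        }
    }

    /** Allocates one slot per pipeline, cycling through the instance queue. */
    static List<String> allocate(Deque<TaskManagerStub> queue, int pipelines) {
        List<String> placement = new ArrayList<>();
        for (int i = 0; i < pipelines; i++) {
            TaskManagerStub tm = queue.pollFirst();   // pop the first instance
            if (tm == null) {
                throw new IllegalStateException("no free slots left");
            }
            placement.add(tm.name);
            if (--tm.freeSlots > 0) {
                queue.addLast(tm);                    // re-append while slots remain
            }
        }
        return placement;
    }

    public static void main(String[] args) {
        Deque<TaskManagerStub> queue = new ArrayDeque<>();
        queue.add(new TaskManagerStub("tm1", 2));
        queue.add(new TaskManagerStub("tm2", 2));
        // Four independent pipelines spread across both managers:
        System.out.println(allocate(queue, 4)); // prints [tm1, tm2, tm1, tm2]
    }
}
```

Under this scheme independent pipelines do spread out; co-location on one machine is expected mainly when tasks share a slot as a local pipeline, as discussed further down the thread.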
>> >
>> > - Ali
>> >
>> > On 2015-11-30, 2:18 PM, "Ufuk Celebi" <u...@apache.org> wrote:
>> >
>> > >
>> > >> On 30 Nov 2015, at 17:47, Kashmar, Ali <ali.kash...@emc.com> wrote:
>> > >> Do the parallel instances of each task get distributed across the
>> > >> cluster, or is it possible that they all run on the same node?
>> > >
>> > >Yes, slots are requested from all nodes of the cluster. But keep in
>> > >mind that multiple tasks (forming a local pipeline) can be scheduled
>> > >to the same slot (1 slot can hold many tasks).
>> > >
>> > >Have you seen this?
>> > >
>> > >https://ci.apache.org/projects/flink/flink-docs-release-0.10/internals/job_scheduling.html
>> > >
>> > >> If they can all run on the same node, what happens when that node
>> > >> crashes? Does the job manager recreate them using the remaining
>> > >> open slots?
>> > >
>> > >What happens: the job manager tries to restart the program with the
>> > >same parallelism. Thus, if you have enough free slots available in
>> > >your cluster, this works smoothly (so yes, the remaining/available
>> > >slots are used).
>> > >
>> > >With a YARN cluster the task manager containers are restarted
>> > >automatically. In standalone mode, you have to take care of this
>> > >yourself.
>> > >
>> > >Does this help?
>> > >
>> > > Ufuk
>> > >
>> >
>> >
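[Editor's note: Stephan's workaround earlier in the thread — fewer slots per machine, so operators must spread across more task managers — is a one-line change in `flink-conf.yaml`. The value shown is an example; choose one that fits your cluster size and job parallelism.]

```yaml
# flink-conf.yaml: each TaskManager offers this many slots.
# Lowering it forces the scheduler onto more machines.
# Example value only; the default is 1, and larger clusters often raise it.
taskmanager.numberOfTaskSlots: 1
```

For instance, with 16 task managers at 1 slot each, a job with parallelism 16 necessarily runs on all 16 machines.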