Thanks Piotr.

I didn't realize the email attachment wasn't working. The example I was
referring to is this figure from the Flink website:
https://ci.apache.org/projects/flink/flink-docs-stable/fig/slot_sharing.svg

I am trying to run multiple jobs concurrently in one cluster. The jobs are
identical, and the DAG looks very similar to the one in the figure. Each
machine holds one map task from each job, and I end up with X sinks on
machine 1 (X being the number of jobs). I assumed this was caused by
operator chaining (each sink is chained to mapper 1, so they all end up on
machine 1). However, I also tried disabling chaining and I still get the
same result. Somehow, even when the sink and the map belong to different
threads, they are still placed in the same slot.

My goal is to see whether the sinks can be distributed evenly across the
cluster (instead of all on machine 1). One way would be to chain the sink
to one of the other mappers. The other would be to change the placement of
the mappers altogether (e.g. placing map 1 of job 2 on machine 2 and map 1
of job 3 on machine 3, so that the sinks end up spread evenly throughout
the cluster).
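
To make the second option concrete, here is a minimal plain-Java sketch of
the placement I have in mind. It does not use any Flink API: the class and
method names are made up for illustration, and it only assumes that each
job's sink sits on the same machine as that job's map 1. It compares
starting every job on machine 1 with rotating the starting machine per job.

```java
import java.util.Arrays;

public class SinkPlacementSketch {
    // Hypothetical helper, not a Flink API: returns the 0-based machine that
    // hosts map 1 of each job, and therefore (by assumption) that job's sink.
    public static int[] sinkMachines(int numJobs, int numMachines, boolean rotate) {
        int[] machines = new int[numJobs];
        for (int job = 0; job < numJobs; job++) {
            // Without rotation every job starts on machine 0; with rotation
            // job j starts on machine j mod numMachines.
            machines[job] = rotate ? job % numMachines : 0;
        }
        return machines;
    }

    // Counts how many sinks land on each machine.
    public static int[] sinksPerMachine(int[] sinkMachines, int numMachines) {
        int[] counts = new int[numMachines];
        for (int machine : sinkMachines) {
            counts[machine]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int jobs = 3;
        int machines = 3;
        // What I observe today: all sinks on the first machine.
        System.out.println(Arrays.toString(
                sinksPerMachine(sinkMachines(jobs, machines, false), machines)));
        // What I would like: one sink per machine.
        System.out.println(Arrays.toString(
                sinksPerMachine(sinkMachines(jobs, machines, true), machines)));
    }
}
```

With 3 jobs on 3 machines, the first line printed is [3, 0, 0] and the
second is [1, 1, 1].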

Thanks.

Le

On Mon, Mar 4, 2019 at 6:49 AM Piotr Nowojski <pi...@ververica.com> wrote:

> Hi,
>
> Are you asking whether that’s the behaviour, or have you actually
> observed this issue? I’m not entirely sure, but I would guess that the Sink
> tasks would be distributed randomly across the cluster, but maybe I’m
> confusing this issue with resource allocation for Task Managers. Maybe Till
> will know something more about this?
>
> One thing that might solve or work around the issue is to run those jobs
> in job mode (one cluster per job) rather than in cluster mode, since
> containers for Task Managers are created/requested randomly.
>
> Piotrek
>
> On 2 Mar 2019, at 23:53, Le Xu <sharonx...@gmail.com> wrote:
>
> Hello!
>
> I'm trying to find out if there is a way to force task slot sharing within
> a job. The example on the website looks like the following (as in the
> screenshot):
>
> <image.png>
> In this example, the single sink is slot-sharing with source/map (1) and
> window operator (1). If I deploy multiple identical jobs like the one shown
> above, all sink operators are placed on the first machine (which creates
> an unbalanced scenario). Is there a way to avoid this situation (i.e., to
> have the sink operators of different jobs spread evenly across the task
> slots of the entire cluster)? Specifically, I was wondering if either of
> the following options is possible:
> 1. To force Sink[1] to share a slot with a mapper from a different
> partition on another slot, such as (source[2] and window[2]).
> 2. If option 1 is not possible, is there a "hacky" way for Flink to deploy
> jobs starting from a different machine? E.g., for job 2, it could allocate
> source/map[1], window[1], sink[1] to machine 2 instead of again to machine
> 1. This way the slot-sharing groups are still the same, but we end up
> with the sinks of the two jobs on different machines.
>
>
> Thanks!