Hi Jins George, This has been asked before [1]. The bottom line is that you currently cannot pre-allocate TMs and distribute your tasks evenly. You might be able to achieve a better distribution across hosts by configuring fewer slots in your TMs.
Best, Gary [1] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-1-5-job-distribution-over-cluster-nodes-td21588.html On Wed, Feb 13, 2019 at 6:20 AM Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote: > Hi, > > I'm forwarding this question to Gary (CC'ed), who most likely would have > an answer for your question here. > > Cheers, > Gordon > > On Wed, Feb 13, 2019 at 8:33 AM Jins George <jins.geo...@aeris.net> wrote: > >> Hello community, >> >> I am trying to upgrade a Flink Yarn session cluster running BEAM >> pipelines from version 1.2.0 to 1.6.3. >> >> Here is my session start command: yarn-session.sh -d *-n 4* -jm 1024 >> -tm 3072 *-s 7* >> >> Because of the dynamic resource allocation, no taskmanager gets created >> initially. Now once I submit a job with parallelism 5, I see that 1 >> task-manager gets created and all 5 parallel instances are scheduled on the >> same taskmanager( because I have 7 slots). This can create hot spot as >> only one physical node ( out of 4 in my case) is utilized for processing. >> >> I noticed the legacy mode, which would provision all task managers at >> cluster creation, but since legacy mode is expected to go away soon, I >> didn't want to try that route. >> >> Is there a way I can configure the multiple jobs or parallel instances of >> same job spread across all the available Yarn nodes and continue using the >> 'new' mode ? >> >> Thanks, >> >> Jins George >> >