The number of TMs mainly depends on the job's parallelism and job graph: Flink requests as many TMs as it needs to provide the slots the job requires. Flink now also allows you to cap the total number of slots (slotmanager.number-of-slots.max [1]). There is also a plan to support setting a minimum number of slots [2].
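As a rough illustration of the point above, here is a minimal sketch of the arithmetic. It assumes default slot sharing, so the job needs as many slots as its highest operator parallelism, and that every TM offers the same number of slots (taskmanager.numberOfTaskSlots). The function name is hypothetical and not part of Flink.

```python
import math


def estimated_task_managers(max_parallelism: int, slots_per_tm: int) -> int:
    """Estimate how many TMs a per-job cluster would request.

    Assumption: default slot sharing, so required slots == max_parallelism.
    slots_per_tm corresponds to taskmanager.numberOfTaskSlots.
    """
    if max_parallelism < 1 or slots_per_tm < 1:
        raise ValueError("both arguments must be positive")
    # Each TM contributes slots_per_tm slots; round up to cover the remainder.
    return math.ceil(max_parallelism / slots_per_tm)


if __name__ == "__main__":
    # Parallelism 12 with 4 slots per TM needs 3 TMs under these assumptions.
    print(estimated_task_managers(12, 4))
```

Even when the estimate is 3, which machines those TMs land on is still decided by Yarn, not Flink.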

[1] https://ci.apache.org/projects/flink/flink-docs-master/ops/config.html#slotmanager-number-of-slots-max
[2] https://issues.apache.org/jira/browse/FLINK-15959

Best,
Yangze Guo

On Tue, Aug 18, 2020 at 12:21 PM 范超 <fanc...@mgtv.com> wrote:
>
> Thanks Yangze
>
> 1. Do you meet any problem when deploying on Yarn or running Flink job?
> My job works well.
>
> 2. Why do you need to start the TMs on all the three machines?
> From a cluster perspective, I wonder whether the processing load can be
> balanced across the 3 machines.
>
> 3. Flink can control how many TM to start, but where to start the TMs
> depends on Yarn.
> Yes, where the TMs start depends on Yarn.
> Could you please tell me which parameter controls how many TMs to start?
> The -yn parameter was removed in 1.10; the 1.9 doc sample [1] below still
> uses it.
>
> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/cli.html
>
> Run example program using a per-job YARN cluster with 2 TaskManagers:
>
> ./bin/flink run -m yarn-cluster -yn 2 \
>     ./examples/batch/WordCount.jar \
>     --input hdfs:///user/hamlet.txt --output hdfs:///user/wordcount_out
>
> -----Original Message-----
> From: Yangze Guo [mailto:karma...@gmail.com]
> Sent: Tuesday, August 18, 2020 11:31
> To: 范超 <fanc...@mgtv.com>
> Cc: user (user@flink.apache.org) <user@flink.apache.org>
> Subject: Re: How to specify the number of TaskManagers in Yarn Cluster
> using Per-Job Mode
>
> Hi,
>
> Flink can control how many TM to start, but where to start the TMs depends
> on Yarn.
>
> Do you meet any problem when deploying on Yarn or running Flink job?
> Why do you need to start the TMs on all the three machines?
>
> Best,
> Yangze Guo
>
> On Tue, Aug 18, 2020 at 11:25 AM 范超 <fanc...@mgtv.com> wrote:
> >
> > Thanks Yangze
> > The reason I am not deploying a standalone cluster is that Kafka, Kudu,
> > Hadoop and ZooKeeper also run on these machines, so using Yarn to manage
> > resources is probably the best choice for me at the moment.
> > If Flink cannot control how many TMs to start, could anyone share some
> > best practices for deploying on Yarn, please? I read [1] and it is still
> > not very clear to me.
> >
> > [1]
> > https://www.ververica.com/blog/how-to-size-your-apache-flink-cluster-general-guidelines
> >
> > -----Original Message-----
> > From: Yangze Guo [mailto:karma...@gmail.com]
> > Sent: Tuesday, August 18, 2020 10:50
> > To: 范超 <fanc...@mgtv.com>
> > Cc: user (user@flink.apache.org) <user@flink.apache.org>
> > Subject: Re: How to specify the number of TaskManagers in Yarn Cluster
> > using Per-Job Mode
> >
> > Hi,
> >
> > I think that is only related to the Yarn scheduling strategy. AFAIK,
> > Flink could not control it. You could check the RM log to figure out why
> > it did not schedule the containers to all the three machines. BTW, if you
> > have specific requirements to start with all the three machines, how
> > about deploying a standalone cluster instead?
> >
> > Best,
> > Yangze Guo
> >
> > On Tue, Aug 18, 2020 at 10:24 AM 范超 <fanc...@mgtv.com> wrote:
> > >
> > > Thanks Yangze
> > >
> > > The NodeManager is started on all 3 machines.
> > >
> > > I just don't know why each of the three machines is not running a
> > > Flink TaskManager, or how to achieve this.
> > >
> > > -----Original Message-----
> > > From: Yangze Guo [mailto:karma...@gmail.com]
> > > Sent: Tuesday, August 18, 2020 10:10
> > > To: 范超 <fanc...@mgtv.com>
> > > Cc: user (user@flink.apache.org) <user@flink.apache.org>
> > > Subject: Re: How to specify the number of TaskManagers in Yarn Cluster
> > > using Per-Job Mode
> > >
> > > Hi,
> > >
> > > Do you start the NodeManager in all the three machines? If so, could
> > > you check all the NMs correctly connect to the ResourceManager?
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Tue, Aug 18, 2020 at 10:01 AM 范超 <fanc...@mgtv.com> wrote:
> > > >
> > > > Hi, Dev and Users
> > > > I have 3 machines, each with 8 cores and 16GB memory.
> > > > Below is my Resource Manager screenshot; the cluster has 36GB in
> > > > total.
> > > > I set the parallelism to 3, or even up to 12, but the task managers
> > > > always run on only two nodes, not on all three machines; the third
> > > > node does not start a task manager.
> > > > I tried setting the -p, -tm and -jm parameters, but the result is
> > > > always the same. The only difference is more containers on the two
> > > > machines; still not all three machines start a task manager.
> > > > My question is how to set the CLI parameters so that task managers
> > > > start on all three of my machines.
> > > >
> > > > Thanks a lot
> > > > [cid:image001.png@01D67546.62291B70]
> > > >
> > > >
> > > > Chao fan
> > > >