Re: Utilising EMR's master node

2018-10-06 Thread Averell
Hi Gary, Thanks for the information. I didn't know that -yn is obsolete :( I am using Flink 1.6. Not sure whether that's a bug when I tried to set -yn explicitly, but I started only 1 cluster. Thanks and regards, Averell -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.na

Re: Utilising EMR's master node

2018-10-06 Thread Gary Yao
Hi Averell, It is up to the YARN scheduler to decide on which hosts the containers are started. What Flink version are you using? I assume you are using 1.4 or earlier because you are specifying a fixed number of TMs. If you launch Flink with -yn 2, you should only be seeing 2 TMs in total (not 4). Are you

Re: Utilising EMR's master node

2018-09-26 Thread Averell
Thank you Gary. Regarding your previous suggestion to change the configured number of vcores on the EMR master node, I tried it and found one funny/bad behaviour, as follows: * hardware configuration: master node: 4 vcores + 8 GB RAM, 2x executors with 16 vcores + 32 GB RAM each.

Re: Utilising EMR's master node

2018-09-26 Thread Gary Yao
Hi Averell, There is no general answer to your question. If you are running more TMs, you get better isolation between different Flink jobs because one TM is backed by one JVM [1]. However, every TM brings additional overhead (heartbeating, running more threads, etc.) [1]. It also depends on the

Re: Utilising EMR's master node

2018-09-19 Thread Averell
Hi Gary, Thanks for your help. Regarding TM configurations, in terms of performance, when my 2 servers have 16 vcores each, should I have 2 TMs with 16GB mem, 16 task slots each, or 8 TMs with 4GB mem and 4 task slots each? Thanks and regards, Averell -- Sent from: http://apache-flink-user-mail
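
For reference, the two layouts in the question can be compared with a bit of arithmetic. This is only a sketch: the `TmLayout` class and its field names are made up for illustration and are not part of Flink; the numbers are the ones given in the question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TmLayout:
    """Hypothetical helper describing one TaskManager (TM) layout."""
    task_managers: int   # number of TMs, each backed by one JVM
    mem_gb_per_tm: int   # memory given to each TM, in GB
    slots_per_tm: int    # task slots offered by each TM

    @property
    def total_slots(self) -> int:
        return self.task_managers * self.slots_per_tm

    @property
    def total_mem_gb(self) -> int:
        return self.task_managers * self.mem_gb_per_tm

    @property
    def mem_gb_per_slot(self) -> float:
        return self.mem_gb_per_tm / self.slots_per_tm


# The two options from the question: few large TMs vs. many small TMs.
few_large = TmLayout(task_managers=2, mem_gb_per_tm=16, slots_per_tm=16)
many_small = TmLayout(task_managers=8, mem_gb_per_tm=4, slots_per_tm=4)

for name, layout in (("2 x 16GB", few_large), ("8 x 4GB", many_small)):
    print(name, layout.total_slots, layout.total_mem_gb, layout.mem_gb_per_slot)
```

Both layouts come out at 32 slots, 32 GB in total and 1 GB per slot, so the totals do not separate them; what differs is the number of JVMs (2 versus 8).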

Re: Utilising EMR's master node

2018-09-18 Thread Gary Yao
Hi Averell, Flink compares the number of user-selected vcores to the vcores configured in the yarn-site.xml of the submitting node, i.e., in your case the master node. If there are not enough configured vcores, the client throws an exception. This behavior is not ideal and I found an old JIRA tick
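
The client-side check described here can be sketched roughly as follows. This is not Flink's actual code: the function names and the error text are invented for illustration. It also assumes that the relevant setting is the standard YARN property `yarn.nodemanager.resource.cpu-vcores` and that a default of 8 applies when the property is absent.

```python
import xml.etree.ElementTree as ET

# Standard YARN property; that the check reads exactly this key is an assumption.
VCORES_KEY = "yarn.nodemanager.resource.cpu-vcores"


def configured_vcores(yarn_site_xml: str, default: int = 8) -> int:
    """Return the vcores configured in a yarn-site.xml document (hypothetical helper)."""
    root = ET.fromstring(yarn_site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == VCORES_KEY:
            return int(prop.findtext("value", str(default)))
    return default  # assumed fallback when the property is not set


def check_vcores(requested: int, yarn_site_xml: str) -> None:
    """Raise if more vcores are requested than the submitting node's config allows."""
    available = configured_vcores(yarn_site_xml)
    if requested > available:
        # Illustrative message only, not the text of Flink's exception.
        raise ValueError(
            f"requested {requested} vcores, but the local yarn-site.xml allows {available}"
        )


# yarn-site.xml as it might look on a 4-vcore master node (the submitting node).
MASTER_YARN_SITE = """
<configuration>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>4</value>
  </property>
</configuration>
"""

check_vcores(4, MASTER_YARN_SITE)       # passes
try:
    check_vcores(16, MASTER_YARN_SITE)  # fails, although the workers have 16 vcores
except ValueError as err:
    print(err)
```

The point of the sketch is that only the configuration of the submitting node is consulted, so a small master node can reject a request that the worker nodes could satisfy.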

Re: Utilising EMR's master node

2018-09-17 Thread Averell
Thank you Gary. Regarding the option to use a smaller server for the master node, when starting a Flink job, I would get an error like the following: Caused by: org.apache.flink.configuration.IllegalConfigurationException: The number of virtual cores per node were configured with 16 but Yarn on

Re: Utilising EMR's master node

2018-09-17 Thread Gary Yao
Hi Averell, According to the AWS documentation [1], the master node only runs the YARN ResourceManager and the HDFS NameNode. Containers can only be launched on nodes that are running the YARN NodeManager [2]. Therefore, if you want TMs or JMs to be launched on your EMR master node, you have to st

Utilising EMR's master node

2018-09-16 Thread Averell
Hello everyone, I'm trying to run Flink on AWS EMR following the guides from Flink doc and from AWS