I'm trying to set up a 3-node Flink cluster (version 1.9) on the following machines:
Node 1 (Master): 4 GB (3.8 GB), Core2 Duo 2.80GHz, Ubuntu 16.04 LTS
Node 2 (Slave): 16 GB, Core i7-3.40GHz, Ubuntu 16.04 LTS
Node 3 (Slave): 16 GB, Core i7-3.40GHz, Ubuntu 16.04 LTS

I have followed the instructions at:
https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/cluster_setup.html

I have set "jobmanager.rpc.address" in conf/flink-conf.yaml in the following format:

master@master-node1-hostname

The slaves are listed in conf/slaves as:

slave@slave-node2-hostname
slave@slave-node3-hostname
master@master-node1-hostname (using the master machine for task execution too)

However, my problem is that when I run bin/start-cluster.sh on the master node, it fails to start the taskexecutor daemon on both slave nodes. It only starts the taskexecutor daemon and the standalonesession daemon on master@master-node1-hostname (Node 1).

I have tried both passwordless ssh and password ssh on all machines, but the result is the same. In the latter case, it does ask for the passwords of slave@slave-node2-hostname and slave@slave-node3-hostname, but fails to display any message like "starting taskexecutor daemon on xxxx" after that.

I then switched my master node to Node 2 and set Node 1 as a slave. It was able to start the taskexecutor daemons on both Node 2 and Node 3 successfully, but did nothing for Node 1.

I'd appreciate it if you could advise on what the problem could be and how I can resolve it.

Best Regards,
Komal