Hi,
I'm trying to set up a standalone cluster for the first time. My current
configuration:
4 servers (1 jobmanager and 3 taskmanagers)
a) starting the cluster
swissbib@sb-ust1:/swissbib_index/apps/flink/bin$ ./start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host sb-ust1.
Starting taskexecutor daemon on host sb-ust2.
Starting taskexecutor daemon on host sb-ust3.
Starting taskexecutor daemon on host sb-ust4.
On the taskmanager side I get the following error:
2019-05-01 21:16:32,794 WARN
akka.remote.ReliableDeliverySupervisor -
Association with remote system [akka.ssl.tcp://flink@sb-ust1:6123] has
failed, address is now gated for [50] ms. Reason: [class [B cannot be
cast to class [C ([B and [C are in module java.base of loader 'bootstrap')]
2019-05-01 21:16:41,932 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutor - Could
not resolve ResourceManager address
akka.ssl.tcp://flink@sb-ust1:6123/user/resourcemanager, retrying in
10000 ms: Ask timed out on
[ActorSelection[Anchor(akka.ssl.tcp://flink@sb-ust1:6123/),
Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent
message of type "akka.actor.Identify"..
2019-05-01 21:17:01,960 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutor - Could
not resolve ResourceManager address
akka.ssl.tcp://flink@sb-ust1:6123/user/resourcemanager, retrying in
10000 ms: Ask timed out on
[ActorSelection[Anchor(akka.ssl.tcp://flink@sb-ust1:6123/),
Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent
message of type "akka.actor.Identify"..
Port 6123 is open on the jobmanager, but I haven't created a dedicated
flink user.
- Is this necessary? If yes, is it possible to define another user for
communication purposes?
I followed the documentation to set up SSL-based communication
(https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/security-ssl.html#example-ssl-setup-standalone-and-kubernetes)
and created a keystore as described:
keytool -genkeypair -alias swissbib.internal -keystore internal.keystore \
  -dname "CN=flink.internal" -storepass verysecret -keypass verysecret \
  -keyalg RSA -keysize 4096
and deployed the flink-conf.yaml on the whole cluster
(part of flink-conf.yaml)
security.ssl.internal.enabled: true
security.ssl.internal.keystore: /swissbib_index/apps/flink/conf/internal.keystore
security.ssl.internal.truststore: /swissbib_index/apps/flink/conf/internal.keystore
security.ssl.internal.keystore-password: verysecret
security.ssl.internal.truststore-password: verysecret
security.ssl.internal.key-password: verysecret
but this doesn't solve the problem: there is still no connection between
the taskmanagers and the jobmanager.
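To rule out a simple deployment mistake, a check along the following lines could be run on each node. This is only a minimal sketch, not part of Flink: the helper names are hypothetical, it assumes flink-conf.yaml is located at /swissbib_index/apps/flink/conf/flink-conf.yaml (next to the keystore), and it only reads flat "key: value" lines and verifies that the configured keystore and truststore files exist on that host.

```python
from pathlib import Path


def parse_flat_conf(text: str) -> dict:
    """Parse flat 'key: value' lines; blank lines and '#' comments are skipped."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        conf[key.strip()] = value.strip()
    return conf


def missing_store_files(conf: dict) -> list:
    """Return the SSL store options whose configured path is not an existing file."""
    keys = (
        "security.ssl.internal.keystore",
        "security.ssl.internal.truststore",
    )
    return [k for k in keys if not Path(conf.get(k, "")).is_file()]


if __name__ == "__main__":
    # Assumed location of the config file; adjust to the actual installation.
    conf_path = Path("/swissbib_index/apps/flink/conf/flink-conf.yaml")
    if conf_path.is_file():
        conf = parse_flat_conf(conf_path.read_text())
        print("SSL enabled:", conf.get("security.ssl.internal.enabled"))
        print("Missing store files:", missing_store_files(conf))
    else:
        print(f"{conf_path} not found on this host")
```

An empty "Missing store files" list on every node would at least confirm that the keystore was copied everywhere; it says nothing about whether the passwords or the keystore contents are correct.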
- Another question: which ports have to be opened in the firewall for a
standalone cluster?
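For what it's worth, plain TCP reachability of the jobmanager port can be tested from each taskmanager host with a small script like the one below. It is a sketch under assumptions: the function name is hypothetical, the host sb-ust1 and port 6123 are taken from the log output above, and a successful connect only shows that the firewall lets the connection through, not that the SSL handshake works.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Host and port as they appear in the taskmanager log; extend the list as needed.
    for host, port in [("sb-ust1", 6123)]:
        state = "reachable" if port_is_open(host, port) else "NOT reachable"
        print(f"{host}:{port} is {state}")
```

If the port turns out to be reachable, the firewall is probably not the cause of the failed association and the problem lies elsewhere.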
Thanks for any hints!
Günter