Hi all,

I’m experimenting with my own implementation of the HA services that persists 
JobManager information on a Kubernetes volume instead of in ZooKeeper.

I’ve set the high-availability option in flink-conf.yaml to the FQN of my 
factory class and started the Docker ensemble as I usually do (i.e. with no 
special “cluster” arguments or scripts).
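
For context, the setting described above would look roughly like the fragment 
below. The class name is a hypothetical placeholder, not my actual factory; the 
only assumption is that the high-availability key accepts the fully qualified 
name of a HighAvailabilityServicesFactory implementation.

```yaml
# flink-conf.yaml (fragment)
# com.example.ha.VolumeHaServicesFactory is a hypothetical placeholder name
high-availability: com.example.ha.VolumeHaServicesFactory
```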

The TaskManager is now unable to connect to the ResourceManager, apparently 
because it is trying to use the /user/jobmanager path instead of 
/user/resourcemanager.

Here’s what I found in the logs:


jobmanager_1    | 2019-08-22 00:05:00,963 INFO  akka.remote.Remoting - Remoting started; listening on addresses :[akka.tcp://flink@jobmanager:6123]
jobmanager_1    | 2019-08-22 00:05:00,975 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Actor system started at akka.tcp://flink@jobmanager:6123
jobmanager_1    | 2019-08-22 00:05:02,380 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.runtime.resourcemanager.StandaloneResourceManager at akka://flink/user/resourcemanager .
jobmanager_1    | 2019-08-22 00:05:03,138 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.runtime.dispatcher.StandaloneDispatcher at akka://flink/user/dispatcher .
jobmanager_1    | 2019-08-22 00:05:03,211 INFO  org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - ResourceManager akka.tcp://flink@jobmanager:6123/user/resourcemanager was granted leadership with fencing token 00000000000000000000000000000000
jobmanager_1    | 2019-08-22 00:05:03,292 INFO  org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Dispatcher akka.tcp://flink@jobmanager:6123/user/dispatcher was granted leadership with fencing token 00000000-0000-0000-0000-000000000000
taskmanager_1   | 2019-08-22 00:05:03,713 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor - Connecting to ResourceManager akka.tcp://flink@jobmanager:6123/user/jobmanager(00000000000000000000000000000000).
taskmanager_1   | 2019-08-22 00:05:04,137 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor - Could not resolve ResourceManager address akka.tcp://flink@jobmanager:6123/user/jobmanager, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@jobmanager:6123/user/jobmanager..

Is this a known bug? I’d appreciate any help I can get.

Thanks,
Aleksandar Mastilovic
