I have some questions about the REST API when running a standalone cluster with ZooKeeper HA. I think they apply to other setups too, such as YARN.
Let's assume a standalone cluster with multiple JobManagers. The JobManagers elect a leader among themselves and register it in ZooKeeper. When using the Flink command line, AFAIK the client looks up the host and port of the leading JobManager in ZooKeeper and sends its HTTP requests there.

My question is: when accessing the REST API directly (e.g. with curl), does one need to call the leading JobManager, or will any up-and-running JobManager do? And if the leader needs to be called, why? Behind the scenes, the REST API connects to the leading JobManager over RPC, which would make it irrelevant which JobManager receives the HTTP request.

By experimenting, I found that the Web UI works fine when all the JobManagers are behind a load balancer, so that both leading and standby JobManagers get called. The only issue I found was that when a jar is uploaded (/jars/upload), it is stored on the local disk of the JobManager that happens to handle that request. As a consequence, creating a job from that jar only succeeds if the HTTP request hits the JobManager that has the file.

There might be a "hack" to overcome this limitation: set web.upload.dir to a location in S3 / GCS or elsewhere that is accessible to all JobManagers. I didn't try this. Alternatively, when uploading jars and creating jobs, ensure the same JobManager is called each time (bypassing the load balancer). But I wonder whether there is some other reason why the leading JM should be called.

A follow-up question arises. If the jars are stored only on the leading JobManager, doesn't that mean that if the leader changes, the new leader is not aware of the jars uploaded to the old leader? From the REST API's perspective, this means that even in a JobManager HA setup, and even when always calling the leader, a simple "upload a jar and deploy a job" cycle is not guaranteed to work if the leader happens to change between the two requests.

Did I miss something?

--
Regards,
Juha
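
P.S. To make the "ensure the same JobManager is called" workaround concrete, here is a minimal sketch of the upload-then-run cycle pinned to one JobManager. Several things in it are assumptions rather than confirmed behaviour: the host name is made up, `post` is a hypothetical transport callable the caller would supply (for example a thin wrapper around an HTTP client), and I am assuming that the upload response carries a "filename" field whose last path component is the jar id accepted by /jars/:jarid/run.

```python
import posixpath
from typing import Callable, Optional

# Hypothetical address of ONE specific JobManager, not the load balancer.
PINNED_JOBMANAGER = "http://jobmanager-1.example.com:8081"


def jar_id_from_upload(response: dict) -> str:
    """Derive the jar id from an upload response.

    Assumption: the response has a "filename" field holding the path on the
    JobManager's local disk, and the jar id is that path's last component.
    """
    return posixpath.basename(response["filename"])


def upload_and_run(
    post: Callable[[str, Optional[str]], dict],
    jar_path: str,
    base_url: str = PINNED_JOBMANAGER,
) -> dict:
    """Upload a jar and start a job, sending both requests to base_url.

    `post(url, file_or_none)` is a hypothetical transport supplied by the
    caller; it performs the HTTP POST and returns the decoded JSON body.
    """
    # Both requests use the same base URL, so the run request reaches the
    # JobManager that stored the jar on its local disk.
    uploaded = post(f"{base_url}/jars/upload", jar_path)
    jar_id = jar_id_from_upload(uploaded)
    return post(f"{base_url}/jars/{jar_id}/run", None)
```

This only covers the load balancer case. It does not help with the follow-up question: if the pinned JobManager goes away between the two requests, the second call still fails.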