You've pretty much answered the question yourself. *thumbs up*
For the vast majority of cases you can call any JobManager.
The exceptions are jar operations (because uploaded jars are persisted in
the JM-local filesystem, and other JMs don't know about them) and
triggering savepoints (because the metadata for ongoing savepoint
operations, i.e., the information returned when querying the savepoint
operation status, is also kept locally in the JM).
This does indeed imply that on JM failover all this information is lost.
There are ideas to solve this, but no concrete timeline. See
https://issues.apache.org/jira/browse/FLINK-18312
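
Until that is resolved, one workaround is to send the JM-local calls (jar
upload, jar run, savepoint trigger, savepoint status) to one fixed
JobManager address rather than through the load balancer. Below is a
minimal Python sketch of that idea, not a tested client. The class name
and the injectable opener are made up for illustration. The endpoint paths
and JSON fields follow my reading of the REST API docs, and taking the jar
id from the last path segment of the returned "filename" is an assumption
you should verify against your Flink version.

```python
import json
import urllib.request
import uuid


class PinnedJobManagerClient:
    """Hypothetical helper: sends every request to one fixed JobManager."""

    def __init__(self, base_url, opener=urllib.request.urlopen):
        # base_url must be the JobManager's own address, not the load balancer.
        self.base_url = base_url.rstrip("/")
        self._open = opener  # injectable so the sketch can be tried offline

    def _send(self, method, path, data=None, content_type="application/json"):
        req = urllib.request.Request(
            self.base_url + path,
            data=data,
            method=method,
            headers={"Content-Type": content_type},
        )
        with self._open(req) as resp:
            return json.loads(resp.read().decode("utf-8"))

    def upload_jar(self, jar_name, jar_bytes):
        # /jars/upload expects a multipart form with the jar in field "jarfile".
        boundary = uuid.uuid4().hex
        head = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="jarfile"; '
            f'filename="{jar_name}"\r\n'
            "Content-Type: application/x-java-archive\r\n\r\n"
        ).encode("utf-8")
        tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
        reply = self._send(
            "POST",
            "/jars/upload",
            head + jar_bytes + tail,
            f"multipart/form-data; boundary={boundary}",
        )
        # Assumption: the jar id is the last path segment of "filename".
        return reply["filename"].replace("\\", "/").rsplit("/", 1)[-1]

    def run_jar(self, jar_id):
        # Must reach the JobManager that stored the jar on its local disk.
        return self._send("POST", f"/jars/{jar_id}/run", b"{}")["jobid"]

    def trigger_savepoint(self, job_id, target_directory):
        body = json.dumps(
            {"target-directory": target_directory, "cancel-job": False}
        ).encode("utf-8")
        return self._send("POST", f"/jobs/{job_id}/savepoints", body)["request-id"]

    def savepoint_status(self, job_id, trigger_id):
        # Only the JobManager that accepted the trigger knows this trigger id.
        reply = self._send("GET", f"/jobs/{job_id}/savepoints/{trigger_id}")
        return reply["status"]["id"]
```

You would create one client per JobManager address, e.g.
PinnedJobManagerClient("http://jm1:8081") with a placeholder host, and use
it for the whole upload/run or trigger/poll sequence. This does not help
across a JM failover, since the locally kept state is lost in that case.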
On 18/08/2021 11:54, Juha Mynttinen wrote:
I have questions related to REST API in the case of ZooKeeper HA and a
standalone cluster. But I think the questions apply to other setups
too such as YARN.
Let's assume a standalone cluster with multiple JobManagers. The
JobManagers elect the leader among themselves and register that to
ZooKeeper. When using the Flink command line, AFAIK the code will go
to ZooKeeper to find the host and port of the leading JobManager and
send HTTP requests there.
My question is: when accessing the REST API directly (e.g. curl) does
one need to call the leading JobManager or will any up and
running JobManager do? And if the leader needs to be called, why is it so?
Behind the scenes the REST API will connect to the leading
"JobManager" over RPC, making it irrelevant which JobManager receives
the HTTP request.
By experimenting, I found the Web UI works fine if all the JobManagers
are behind a load balancer and leading and standby JobManagers are
called. The only issue I found was that when a jar is submitted
(/jars/upload), it is stored on the local disk of the JobManager that
happens to handle that request. As a consequence, creating a job from
that jar only succeeds if the HTTP request hits the JobManager that
has the file. There might be a "hack" to overcome this limitation: set
web.upload.dir to a location in S3 / GCS or elsewhere accessible by all
JobManagers. I didn't try this. Alternatively, when uploading jars and
creating jobs, ensure the same JobManager is called (bypass the load
balancer).
But I wonder if there's something else why the leading JM should be
called.
A follow-up question arises. If the jars are stored only on the
leading JobManager, doesn't that mean that if the leader changes, the
new leader is not aware of the jars uploaded to the old leader? From
the REST API's perspective this means that even in the JobManager HA
setup and when always calling the leader, a simple "upload a jar and
deploy a job" cycle is not guaranteed to work if the leader happens to
change between the requests. Did I miss something?
--
Regards,
Juha