Hello,
I think the flink-conf.yaml should only be required on the node on which
you call yarn-session.sh.
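If that is the case, one way to run several sessions with different configurations from the same client machine would be to keep one configuration directory per session and switch between them with FLINK_CONF_DIR (a documented Flink environment variable). A sketch, assuming hypothetical directory names:

```shell
# Hypothetical layout: one flink-conf.yaml per session, in separate
# directories on the submitting node only.
export FLINK_CONF_DIR=/opt/flink/conf-session-a
./bin/yarn-session.sh -n 4 -qu test-yarn-queue -d   # -d: detached

export FLINK_CONF_DIR=/opt/flink/conf-session-b
./bin/yarn-session.sh -n 2 -qu other-queue -d
```

Each invocation ships its own configuration to its own YARN application, so the two sessions do not share a flink-conf.yaml.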
For starting the session cluster programmatically you would have to look
into YarnClusterDescriptor (for starting the session cluster) and
YarnClusterClient (for submitting jobs; you obtain the client from the
cluster descriptor).
Do note, however, that these are internal APIs: they may or may not be
documented, they may rely on specific behavior of the CLI, and there are
no API stability guarantees.
The YARNSessionFIFOITCase may provide some hints on how to use it.
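To make the descriptor/client split concrete, here is a rough sketch against the internal API as of roughly Flink 1.4. Since these classes are internal and undocumented, the constructor and method signatures below are assumptions that may well differ in your version (check YARNSessionFIFOITCase for what your release actually does), and the code needs a reachable YARN cluster to run:

```java
import org.apache.flink.client.deployment.ClusterSpecification;
import org.apache.flink.client.program.ClusterClient;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.GlobalConfiguration;
import org.apache.flink.yarn.YarnClusterDescriptor;

public class StartYarnSession {
    public static void main(String[] args) throws Exception {
        // Only the submitting process needs flink-conf.yaml; load it from
        // a directory of your choosing (hypothetical path).
        String confDir = "/opt/flink/conf";
        Configuration flinkConfig = GlobalConfiguration.loadConfiguration(confDir);

        // Internal API: constructor signature assumed from ~Flink 1.4,
        // may differ in other versions.
        YarnClusterDescriptor descriptor =
                new YarnClusterDescriptor(flinkConfig, confDir);

        // Resources for the session cluster (illustrative values).
        ClusterSpecification spec = new ClusterSpecification.ClusterSpecificationBuilder()
                .setMasterMemoryMB(1024)
                .setTaskManagerMemoryMB(4096)
                .setNumberTaskManagers(4)
                .setSlotsPerTaskManager(2)
                .createClusterSpecification();

        // Deploys the session cluster on YARN; the returned client is what
        // you would use to submit jobs and eventually shut the session down.
        ClusterClient client = descriptor.deploySessionCluster(spec);
        try {
            // ... submit jobs via the client here ...
        } finally {
            client.shutdown();
        }
    }
}
```

A management UI could hold on to the ClusterClient per session to submit jobs to it and to tear the session down later, but again, none of this is stable API.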
On 27.03.2018 01:32, kedar mhaswade wrote:
Typically, when one wants to run a Flink job on a Hadoop YARN
installation, one creates a YARN session (e.g. ./bin/yarn-session.sh
-n 4 -qu test-yarn-queue) and runs the intended Flink job(s) (e.g.
./bin/flink run -c MyFlinkApp -m job-manager-host:job-manager-port
<overriding app config params> myapp.jar) on the Flink cluster whose
job manager URL is returned by the previous command.
My questions are:
- Does yarn-session.sh need conf/flink-conf.yaml to be available in the
Flink installation on every container in YARN? If this file is needed,
how can one run different YARN sessions (with potentially very
different configurations) on the same Hadoop YARN installation
simultaneously?
- Is it possible to start a YARN session programmatically? If yes, I
believe I should look at classes like YarnClusterClient
<https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/yarn/YarnClusterClient.html>.
Is that right? Is there any other guidance on how to do this
programmatically (e.g. I have a management UI that wants to start/stop
YARN sessions and deploy Flink jobs to them)?
Regards,
Kedar