Hi Till,

What I was ideally looking for was a completely managed service for
Flink via AWS EMR, in which the YARN cluster would be dedicated to a
single Flink session and, as EMR scales in and out, EMR/YARN would
add or remove TMs accordingly. I could then get the total number of task
slots across all running TMs from the Flink REST API and change my job
parallelism accordingly.
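
A minimal sketch of reading that slot count, assuming the Flink REST
endpoint GET /overview and its "slots-total" field; the base URL and the
helper names are placeholders, not something from our setup:

```python
import json
from urllib.request import urlopen


def fetch_overview(base_url: str, timeout: float = 5.0) -> dict:
    """Fetch the cluster overview from the Flink REST API.

    base_url is a placeholder such as "http://<jobmanager-host>:8081".
    """
    with urlopen(f"{base_url}/overview", timeout=timeout) as resp:
        return json.load(resp)


def total_task_slots(overview: dict) -> int:
    """Return the total number of task slots across all registered TMs."""
    return int(overview["slots-total"])
```

Usage would be along the lines of
total_task_slots(fetch_overview("http://<jobmanager-host>:8081")).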

As I understand it, this kind of feature is not currently available, and new
TMs will not be started as EMR scales out. The only way to get new TMs would
be to scale the EMR cluster to the required size, change the job parallelism,
and expect Flink and YARN to take care of spawning new TMs through Flink's
resource elasticity.

So we are now planning to poll the YARN REST API for the current number of
active nodes, which will equal the number of TMs the cluster is capable of
running (Flink will be configured to run 1 TM per EC2 instance), and then
modify the job parallelism accordingly. Can you validate this strategy and/or
suggest something better?
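
Roughly, the polling step could look like the sketch below. It assumes the
YARN ResourceManager endpoint GET /ws/v1/cluster/metrics and its
"activeNodes" field; the ResourceManager URL, slots_per_tm and
max_parallelism are illustrative parameters, and the actual rescaling of
the job is left out:

```python
import json
from urllib.request import urlopen


def fetch_cluster_metrics(rm_url: str, timeout: float = 5.0) -> dict:
    """Fetch cluster metrics from the YARN ResourceManager REST API.

    rm_url is a placeholder such as "http://<resourcemanager-host>:8088".
    """
    with urlopen(f"{rm_url}/ws/v1/cluster/metrics", timeout=timeout) as resp:
        return json.load(resp)


def active_nodes(metrics: dict) -> int:
    """Return the number of active NodeManagers reported by YARN."""
    return int(metrics["clusterMetrics"]["activeNodes"])


def target_parallelism(nodes: int, slots_per_tm: int, max_parallelism: int) -> int:
    """Derive a job parallelism from the node count.

    Assumes 1 TM per node with slots_per_tm slots each, capped at the
    job's max parallelism and never below 1.
    """
    return max(1, min(nodes * slots_per_tm, max_parallelism))
```

One caveat with this sketch: it treats every active node as able to host a
full TM, so the container used by the JobManager is not accounted for.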

Thanks,
Suraj


