Hi Anil,
A typical YARN ResourceManager setup consists of two RM nodes [1] in an
active/standby configuration.
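As a rough illustration of what active/standby means for a client, here is a minimal Python sketch that asks each RM for its HA state and returns the active one. It assumes the RM web endpoints are reachable and that the Cluster Information REST resource (`/ws/v1/cluster/info`) reports a `haState` field; the host names in the usage note are hypothetical, and this is not taken from the talk.

```python
import json
from urllib.request import urlopen


def fetch_ha_state(rm_url, timeout=5):
    """Return the haState reported by one ResourceManager, or None if it cannot be read."""
    try:
        with urlopen(f"{rm_url}/ws/v1/cluster/info", timeout=timeout) as resp:
            return json.load(resp)["clusterInfo"].get("haState")
    except (OSError, ValueError, KeyError):
        # Unreachable host, malformed JSON, or unexpected payload shape.
        return None


def find_active_rm(rm_urls, fetch=fetch_ha_state):
    """Return the first RM URL whose HA state is ACTIVE, or None if none is."""
    for url in rm_urls:
        if fetch(url) == "ACTIVE":
            return url
    return None
```

For example, `find_active_rm(["http://rm1.example:8088", "http://rm2.example:8088"])` would return whichever of the two (hypothetical) RMs currently reports itself as active. The `fetch` parameter exists only so the lookup can be exercised without a live cluster.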
FYI: We also shared some practical experience with the limitations of
this setup, and potential redundant fail-safe mechanisms, in our latest
talk [2] at this year's FlinkForward.
Thanks,
Rong
[1
Thanks for the clarification Rong!
As I understand it, the Docker containers monitor the Flink jobs that are
running in the YARN cluster. The Flink JMs have HA enabled, so there's a
standby JM in case the JM fails, and in case of a TM failure, that TM will be
re-deployed. All good. My concern is wh
Hi Anil,
The reason we use Docker is that internally we support Dockerized
containers for microservices.
Ideally, this can be any external service running on something other than
the actual YARN cluster your Flink application resides on. Basically, the
watchdog runs outside of the Flin
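To make the idea of an external watchdog concrete, here is a minimal Python sketch of a single health check that could run on any machine outside the YARN cluster. It is only an illustration under assumptions, not the implementation described in the talk: it polls the ResourceManager's Cluster Application REST resource (`/ws/v1/cluster/apps/{appid}`), and `on_down` is a hypothetical callback standing in for whatever alerting or resubmission logic you would use.

```python
import json
from urllib.request import urlopen

# YARN application states after which the application makes no further progress.
# For a long-running streaming job, FINISHED is also treated as "down".
TERMINAL_STATES = {"FINISHED", "FAILED", "KILLED"}


def fetch_app(rm_url, app_id, timeout=5):
    """Fetch the application report for app_id from the ResourceManager REST API."""
    with urlopen(f"{rm_url}/ws/v1/cluster/apps/{app_id}", timeout=timeout) as resp:
        return json.load(resp)["app"]


def check_once(rm_url, app_id, on_down, fetch=fetch_app):
    """Run one health check; call on_down(app_id, reason) if the job looks down.

    Returns True when the application is still alive, False otherwise.
    """
    try:
        app = fetch(rm_url, app_id)
    except (OSError, ValueError, KeyError):
        # The RM could not be reached or returned an unexpected payload.
        on_down(app_id, "UNREACHABLE")
        return False
    state = app.get("state")
    if state in TERMINAL_STATES:
        on_down(app_id, state)
        return False
    return True
```

A scheduler (cron, a loop with `time.sleep`, or a container orchestrator) would call `check_once` periodically. Because the check runs outside the cluster, it can still react when the cluster itself is the thing that failed.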
Thanks Rong. The FlinkForward talk was insightful.
One more question: the talk mentions that the jobs run on YARN and are
monitored by containers running on Docker. Can you explain why Docker is
needed here? When we deploy a job to YARN, one YARN container is
already dedicated for Job M
Hi Anil,
We have a presentation [1] from FlinkForward 2018 that briefly discusses the
high-level approach (via a watchdog).
We are also restructuring the approach of our open-source AthenaX:
Our internal implementation has diverged from the open-source version for
too long; it has been a prob
Thanks for the reply, Rong. Could you share the design for the
auto-scaling part, if possible?
Or point me in the right direction so that I can build this feature myself.
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Hi Anil,
Thanks for reporting the issue. I went through the code and I believe the
auto-scaling functionality is still in our internal branch and has not been
merged to the open-source branch yet.
I will change the documentation accordingly.
Thanks,
Rong
On Mon, May 6, 2019 at 9:54 PM Anil wrot
I'm using Uber's open-source project AthenaX. As mentioned in its docs [1], it
supports `Auto scaling for AthenaX jobs`. I went through the source code on
GitHub but didn't find the auto-scaling part. Can someone familiar with this
project please point me in the right direction?
I'm using Flink's