Hi Antonio,

In our production experience, Flink can comfortably handle 150
TaskManagers per cluster. In fact, we have run much larger jobs, where a
single job demands thousands of TaskManagers. However, as the job scale
increases, it gets harder to achieve good stability: the more tasks there
are, the higher the chance that a single task failure triggers a job
failover (or a region failover, if possible). So as long as your jobs are
not at that scale of thousands of TaskManagers, I think 150 TaskManagers
per cluster would be a good choice.
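
To put a rough number on the stability point, here is a minimal sketch. It
is my own simplification, not something Flink computes: it assumes every
task fails independently with the same probability p within some time
window, so the chance that at least one task fails (and triggers a
failover) is 1 - (1 - p)^n for n tasks. The function name and the value of
p are hypothetical, chosen only for illustration.

```python
def prob_any_task_fails(num_tasks: int, per_task_failure_prob: float) -> float:
    """Chance that at least one of num_tasks independent tasks fails.

    Assumes identical, independent per-task failure probabilities,
    which is a simplification of real cluster behaviour.
    """
    if num_tasks < 0:
        raise ValueError("num_tasks must be non-negative")
    if not 0.0 <= per_task_failure_prob <= 1.0:
        raise ValueError("per_task_failure_prob must be in [0, 1]")
    # P(no task fails) = (1 - p)^n, so P(at least one fails) = 1 - (1 - p)^n
    return 1.0 - (1.0 - per_task_failure_prob) ** num_tasks


if __name__ == "__main__":
    p = 0.0001  # hypothetical per-task failure probability in one time window
    for n in (150, 1500, 15000):
        print(f"{n:>6} tasks -> P(at least one failure) = {prob_any_task_fails(n, p):.4f}")
```

Under these assumptions the failover probability grows quickly with the
number of tasks, which is why very large jobs are harder to keep stable.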

If you do encounter a JobManager performance bottleneck, it can usually
be solved by increasing the JobManager's resources with the '-jm'
argument.

Thank you~

Xintong Song



On Fri, May 24, 2019 at 2:33 AM Antonio Verardi <anto...@yelp.com> wrote:

> Hello Flink users,
>
> How many task managers one can expect a Flink cluster to be able to
> reasonably handle?
>
> I want to move a pretty big cluster from a setup on AWS EMR to one based
> on Kubernetes. I was wondering whether it makes sense to break up the beefy
> task managers the cluster had in something like 150 task manager containers
> of a slot each. This is a pattern that a couple different people I met at
> meetups told me they are using in production, but I don't know if they
> tried something similar at this scale. Would the jobmanager be able to
> manage so many task managers in your opinion?
>
> Cheers,
> Antonio
>
