Thanks for the info, Xintong Song!

Cheers,
Antonio

On Fri, May 24, 2019 at 3:38 AM Xintong Song <tonysong...@gmail.com> wrote:
> Hi Antonio,
>
> According to our production experience, Flink can certainly handle 150
> TaskManagers per cluster. In fact, we have encountered much larger jobs,
> where a single job demands thousands of TaskManagers. However, as the job
> scale increases, it gets harder to achieve good stability: with more
> tasks, there is a higher chance of a job failover (or a region failover,
> if possible) caused by a single task failure. So if you don't have jobs
> at that scale, I think 150 TaskManagers per cluster would be a good
> choice.
>
> In case you do encounter a JobManager performance bottleneck, it can
> usually be solved by increasing the JobManager's resources with the '-jm'
> argument.
>
> Thank you~
>
> Xintong Song
>
> On Fri, May 24, 2019 at 2:33 AM Antonio Verardi <anto...@yelp.com> wrote:
>
>> Hello Flink users,
>>
>> How many task managers can one reasonably expect a Flink cluster to
>> handle?
>>
>> I want to move a pretty big cluster from a setup on AWS EMR to one based
>> on Kubernetes. I was wondering whether it makes sense to break up the
>> beefy task managers the cluster had into something like 150 task manager
>> containers of one slot each. This is a pattern that a couple of different
>> people I met at meetups told me they are using in production, but I don't
>> know if they tried something similar at this scale. Would the jobmanager
>> be able to manage so many task managers, in your opinion?
>>
>> Cheers,
>> Antonio