Hi Joey,

Quick question: by *nodes*, do you mean Flink TaskManager processes, or
physical/virtual machines (e.g., ECS instances, YARN NodeManagers)?

In our production, we run Flink workloads on several YARN/Kubernetes
clusters, where each cluster typically has 2k~5k machines. Most Flink
workloads are deployed in single-job mode, giving us thousands (sometimes
more than 10k) of Flink instances running concurrently on each cluster.
This way, the scale of each individual Flink instance is usually not
extremely large (fewer than 1000 TaskManagers), and we rely on
YARN/Kubernetes to deal with the large number of instances.
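
For context, by single-job mode I mean the per-job style of deployment,
where each submission brings up its own dedicated JobManager and
TaskManagers. A rough sketch of what such a submission looks like on YARN
(the jar path, parallelism, and memory sizes below are made-up placeholders,
not our actual settings):

    # per-job deployment on YARN: a dedicated cluster is spun up for this one job
    ./bin/flink run \
      -t yarn-per-job \
      -p 512 \
      -Djobmanager.memory.process.size=4g \
      -Dtaskmanager.memory.process.size=8g \
      -Dtaskmanager.numberOfTaskSlots=2 \
      ./path/to/my-job.jar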

There are also cases where a single Flink job is extremely large. We had a
batch workload from last year's Double 11 event, with 8k max per-stage
parallelism and up to 30k TaskManagers running at the same time. At that
scale, we ran into problems with the single JobManager process, such as
tremendous memory consumption and a busy RPC main thread. To make that case
work, we applied many optimizations to our internal Flink version, which we
are trying to contribute back to the community. See FLINK-21110 [1] for the
details.
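
For a sense of scale: before touching the scheduler itself, these are the
kinds of standard knobs one typically raises for such a job. The values
below are illustrative only, not our production settings, and they are not
the optimizations tracked in FLINK-21110:

    # flink-conf.yaml, illustrative values for a job with tens of thousands of tasks
    jobmanager.memory.process.size: 32g   # JM memory grows with the number of tasks/TMs
    heartbeat.timeout: 120000             # ms; avoid losing TMs while the JM main thread is busy
    akka.ask.timeout: 120 s               # RPCs against a busy JM take longer to answer
    akka.framesize: 104857600b            # larger RPC payloads, e.g. task deployment descriptors

Past a certain point, though, the bottleneck is the JobManager's own
scheduling and memory data structures, which is what FLINK-21110 addresses.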

Thank you~

Xintong Song


[1] https://issues.apache.org/jira/browse/FLINK-21110

On Thu, Mar 4, 2021 at 3:39 PM Piotr Nowojski <pnowoj...@apache.org> wrote:

> Hi Joey,
>
> Sorry for not responding to your question sooner. As you can imagine, there
> are not many users running Flink at such a scale. As far as I know, Alibaba
> runs the largest, or one of the largest, clusters; I'm asking someone who is
> familiar with those deployments to take a look at this conversation.
> I hope someone will respond here soon :)
>
> Best,
> Piotrek
>
> On Mon, Mar 1, 2021 at 2:43 PM Joey Tran <joey.t...@schrodinger.com> wrote:
>
>> Hi, I was looking at Apache Beam/Flink for some of our data processing
>> needs, but when reading about the resource managers
>> (YARN/Mesos/Kubernetes), it seems like they all top out at around 10k
>> nodes. What are the recommended solutions for scaling higher than this?
>>
>> Thanks in advance,
>> Joey
>>
>
