Re: Regarding the size of Flink cluster

Timo Walther Fri, 10 Dec 2021 05:30:37 -0800

Hi Jessy,

let me try to answer some of your questions.


> 16 Task Managers with 1 task slot and 1 CPU each

Every additional task manager also involves management overhead, so I would suggest option 1. But in the end you need to perform some benchmarks yourself. I could also imagine that a mixture could be beneficial depending on the isolation level of your pipelines.


> there will be a single copy in the HEAP

From the JavaDocs of BroadcastState:

"Each operator instance individually maintains and stores elements in the broadcast state."


I would assume that the HEAP contains n copies where n is the parallelism.
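
As a rough back-of-the-envelope check of what n copies would mean for memory, the following plain-Java sketch multiplies the size of one copy by the parallelism. The class name, the method name, and the 50 MiB rule-set size are hypothetical and only for illustration; the real footprint depends on how the rules are serialized and stored.

```java
final class BroadcastStateFootprint {
    /** Estimated total bytes if every parallel instance holds its own copy. */
    static long totalBytes(int parallelism, long bytesPerCopy) {
        if (parallelism <= 0 || bytesPerCopy < 0) {
            throw new IllegalArgumentException("parallelism must be > 0 and bytesPerCopy >= 0");
        }
        // multiplyExact fails loudly instead of silently overflowing
        return Math.multiplyExact((long) parallelism, bytesPerCopy);
    }

    public static void main(String[] args) {
        long perCopy = 50L * 1024 * 1024; // hypothetical 50 MiB rule set
        long total = totalBytes(16, perCopy);
        System.out.println(total / (1024 * 1024) + " MiB"); // 800 MiB
    }
}
```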

> I can process only n events/second (if the latency of the pipeline is 1s)

Latency is not necessarily throughput; this depends on the pipeline. For example, if the pipeline contains only map functions without any keyBy, the operators are "chained" together. You can also influence the chaining [1] for better resource utilization. If your pipeline contains I/O to external systems, you can use async I/O to increase the throughput [2].
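
To make the difference concrete, here is a small plain-Java sketch based on Little's law (throughput is roughly the number of records in flight divided by the latency). It does not use any Flink API; the class, method, and parameter names are hypothetical, and real pipelines will not behave this cleanly.

```java
final class ThroughputEstimate {
    /**
     * Little's law: throughput = records in flight / latency.
     * inFlightPerSubtask is 1 when each subtask blocks on one record at a
     * time, and larger when several requests are outstanding at once
     * (for example with asynchronous I/O).
     */
    static double eventsPerSecond(int parallelism, int inFlightPerSubtask, double latencySeconds) {
        if (parallelism <= 0 || inFlightPerSubtask <= 0 || latencySeconds <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        return parallelism * inFlightPerSubtask / latencySeconds;
    }

    public static void main(String[] args) {
        // Blocking processing: 16 subtasks, 1 s per record
        System.out.println(eventsPerSecond(16, 1, 1.0));   // 16.0
        // Same latency, but 100 requests in flight per subtask
        System.out.println(eventsPerSecond(16, 100, 1.0)); // 1600.0
    }
}
```

Under this model, "n events per second" only holds when each subtask handles exactly one record at a time and every record really takes the full second.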


I would also recommend this section here:

https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/flink-architecture/#task-slots-and-resources

Maybe you can elaborate on the operations in your pipeline.

> Can Flink process multiple events from the same key at the same time?

No, every key is routed to and executed by the same thread. However, you can create an artificial key to spread the load more evenly.
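
One common way to build such an artificial key is "salting": append a bounded suffix to a hot key so that its records are spread over several sub-keys. The sketch below is plain Java without any Flink API; the class and method names, the "#" separator, and the bucket count are hypothetical. The salt should be derived from the record itself (here a sequence number) so that the key stays deterministic. Keep in mind that keyed state is then split per salted key, so results usually need a second aggregation on the original key.

```java
import java.util.Map;
import java.util.TreeMap;

final class KeySalting {
    /** Builds an artificial key "originalKey#bucket" with bucket in [0, buckets). */
    static String saltedKey(String originalKey, long sequenceNumber, int buckets) {
        if (buckets <= 0) {
            throw new IllegalArgumentException("buckets must be positive");
        }
        // floorMod keeps the bucket non-negative even for negative inputs
        int bucket = (int) Math.floorMod(sequenceNumber, (long) buckets);
        return originalKey + "#" + bucket;
    }

    /** Recovers the original key, e.g. for a second aggregation step. */
    static String originalKey(String saltedKey) {
        int idx = saltedKey.lastIndexOf('#');
        return idx < 0 ? saltedKey : saltedKey.substring(0, idx);
    }

    public static void main(String[] args) {
        // Count how 8 records of one hot key are spread over 4 salted keys
        Map<String, Integer> counts = new TreeMap<>();
        for (long seq = 0; seq < 8; seq++) {
            counts.merge(saltedKey("hot-user", seq, 4), 1, Integer::sum);
        }
        System.out.println(counts);
    }
}
```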


> any blogs regarding the results of Flink's load testing

I would also recommend the FlinkForward YouTube channel. A lot of user stories, including actual numbers and configurations, are shown there.


Regards,
Timo

[1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/overview/#task-chaining-and-resource-groups
[2] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/asyncio/



On 10.12.21 11:46, Jessy Ping wrote:

Hi All,

I have the following questions regarding the sizing of a Flink cluster doing stateful computation using the DataStream API. It would be great if the community could answer the questions below.




Suppose we have a pipeline as follows,

*Kafka real-time events source 1 & Kafka rules source 2 -> KeyedBroadcastProcessFunction -> Kafka Sink*

As you can see, we will be processing the real-time events from the Kafka source using the rules broadcast from the rule source with the help of a keyed broadcast function.



_Questions_


  * I have a machine with 16 CPUs and 32 GB Ram. Which configuration is
    efficient for achieving the target parallelism of 16?

 1. A single task manager with 16 task slots
 2. 16 Task Managers with 1 task slot and 1 CPU each.



  * If I have a broadcast state in my pipeline and I have a single task
    manager with 16 task slots for achieving the target parallelism of
    16. Does Flink keep 16 copies of broadcast state in the single task
    manager or there will be a single copy in the HEAP for the entire
    task slots?


  * If a parallelism of n means I can process only n events/second (if
    the latency of the pipeline is 1s), how many requests can a single
    task slot (containing a single task) execute at a time?


  * Can Flink process multiple events from the same key at the same time?


  * I have found the following blog regarding the Flink cluster size:
    https://www.ververica.com/blog/how-to-size-your-apache-flink-cluster-general-guidelines
    Do we have some other blogs, testimonials, or books regarding the
    sample production setup/configuration of a Flink cluster for
    achieving different ranges of throughput?


  * Are there any blogs regarding the results of Flink's load testing?


Thanks

Jessy
