Hi
I am referring to the article by Mr. Jun Rao about choosing the number of 
partitions in a Kafka cluster.
https://www.confluent.io/blog/how-choose-number-topics-partitions-kafka-cluster/

"A rough formula for picking the number of partitions is based on throughput. 
You measure the throughput that you can achieve on a single partition for 
production (call it p) and consumption (call it c). Let's say your target 
throughput is t. Then you need to have at least max(t/p, t/c) partitions."

I have the data pipeline as below.

Filebeat-->Kafka-->Logstash-->Elasticsearch
There are many Filebeat agents sending data to Kafka. I want to understand 
how I can measure the events per second getting written to Kafka. This will 
help me determine the 'p' in the above formula.
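
One way I was thinking of measuring this (a rough sketch in Python using the 
kafka-python client; the topic name and broker address below are placeholders 
for my setup) is to sample the topic's end offsets twice and divide the 
difference by the interval:

# Rough sketch: estimate events/s written to a Kafka topic by sampling
# its end offsets over a time window. Assumes kafka-python is installed;
# the topic name and broker address are hypothetical placeholders.
import time
from kafka import KafkaConsumer, TopicPartition

TOPIC = "filebeat-logs"          # placeholder topic name
BOOTSTRAP = "localhost:9092"     # placeholder broker address
WINDOW_SECONDS = 60

consumer = KafkaConsumer(bootstrap_servers=BOOTSTRAP)
partitions = [TopicPartition(TOPIC, p)
              for p in consumer.partitions_for_topic(TOPIC)]

def total_end_offset():
    # Sum of the latest offsets across all partitions of the topic.
    return sum(consumer.end_offsets(partitions).values())

start = total_end_offset()
time.sleep(WINDOW_SECONDS)
end = total_end_offset()

print("Events/s written to %s: %.1f" % (TOPIC, (end - start) / WINDOW_SECONDS))

If I read the formula correctly, 'p' is the throughput of a single partition, 
so I would probably run this against a one-partition test topic rather than 
the full topic.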
I can measure the consumer throughput by monitoring the Logstash pipelines in 
Kibana, which gives me the 'c' in the above formula.
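
As a cross-check to Kibana, I could also pull the same numbers from the 
Logstash node stats API (a rough sketch, assuming the default monitoring port 
9600 and the default pipeline id 'main', which may differ in my setup):

# Rough sketch: derive Logstash events/s from the node stats API by
# sampling the pipeline's "events out" counter twice. Assumes the
# Logstash monitoring API on its default port 9600; the pipeline id
# and host below are placeholders.
import time
import requests

LOGSTASH_API = "http://localhost:9600"   # placeholder host
PIPELINE = "main"                         # default pipeline id
WINDOW_SECONDS = 60

def events_out():
    stats = requests.get(LOGSTASH_API + "/_node/stats/pipelines").json()
    return stats["pipelines"][PIPELINE]["events"]["out"]

start = events_out()
time.sleep(WINDOW_SECONDS)
end = events_out()

print("Logstash events/s (pipeline '%s'): %.1f"
      % (PIPELINE, (end - start) / WINDOW_SECONDS))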

I know the target throughput for my cluster, i.e. 't': 30k events/s.
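
Just to confirm I am applying the formula correctly, here is a small worked 
example (the per-partition values p and c below are hypothetical placeholders 
until I have real measurements):

# Worked example of the formula from the article: partitions >= max(t/p, t/c).
# p and c are hypothetical placeholders, not measured values.
import math

t = 30_000   # target throughput, events/s (known for my cluster)
p = 5_000    # hypothetical per-partition producer throughput, events/s
c = 10_000   # hypothetical per-partition consumer throughput, events/s

partitions = math.ceil(max(t / p, t / c))
print("At least", partitions, "partitions")   # -> At least 6 partitions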

Please let me know if I am going wrong anywhere.

Regards,
Sunil.
