Hi, I was referring to the article by Mr. Jun Rao about choosing the number of partitions in a Kafka cluster: https://www.confluent.io/blog/how-choose-number-topics-partitions-kafka-cluster/
"A rough formula for picking the number of partitions is based on throughput. You measure the throughout that you can achieve on a single partition for production (call it p) and consumption (call it c). Let's say your target throughput is t. Then you need to have at least max(t/p, t/c) partitions." I have the data pipeline as below. Filebeat-->Kafka-->Logstash-->Elasticsearch There are many filebeat agents sending data to kafka. I want to understand , how can I measure the events per seconds getting written to Kafka? This will help me to know 'p' in above formula. I can measure the consumer throughput by monitoring logsatsh pipelines on Kibana. So it will give me 'c' in above formula. I know target throughput in my cluster, that is 't'. 30k events/s. Please let me know if I am going wrong? Regards, Sunil. CONFIDENTIAL NOTE: The information contained in this email is intended only for the use of the individual or entity named above and may contain information that is privileged, confidential and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this message in error, please immediately notify the sender and delete the mail. Thank you.