Team,

*Use-case:* *IMAP*. I have an application in which an org has users who use IMAP to send mail, and the mail contents are produced to Kafka.
Here the scaling factors are:
1. The number of orgs can grow from 1 to a million.
2. The number of users can grow from 1 to a million.

For this use-case, I need to calculate the producer rate and the broker response rate for a single machine.

So far we have identified the following factors that affect the producer rate:
1. Message size
2. Request size
3. Request rate overhead
4. Request latency
5. Round trip time
6. Number of sender threads
7. Number of processor threads at the broker
8. Replication factor

Variables identified at the network layer, kernel and NIC:
1. sysctl_wmem
2. Tx queues
3. Ring buffer
4. Driver queue
5. NAPI polling

Observations made so far:
1. SocketChannel is the entry point for sending data at the application level.
2. The sendfile() system call is used to transfer the data.

*Questions:*
1. How is data transferred from SocketChannel to the NIC, i.e. what is the data flow through the network (protocol) layer, the kernel, the network device driver and the NIC?
2. Since each KafkaProducer instance creates a SocketChannel, what is the maximum number of producer instances a machine can run while still using the network efficiently?
3. In addition to the variables listed above:
   1. What variables are involved in sending data at the network layer?
   2. What variables are involved in sending data in the kernel?
   3. What variables are involved in sending data to the NIC?
4. How can the producer rate be expressed in terms of the variables identified in each layer?
5. *With the given machine hardware, how can the producer rate be expressed precisely in a single formula covering both the hardware and software levels?*

Could anyone please help me identify these variables and incorporate them into a single formula for the producer rate of a machine in terms of producer instances?

Thanks in advance.
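
To make question 5 concrete, here is a minimal sketch of the kind of formula I am after. It is only a first-order guess, not a validated model: it assumes each producer connection is limited by how many requests it can keep in flight per request latency, and that the machine as a whole is capped by the NIC line rate. All names (`ProducerModel`, `wire_efficiency`, and so on) are hypothetical, the 0.9 efficiency factor is an arbitrary placeholder, and replication, compression, CPU and kernel buffer limits are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class ProducerModel:
    """Hypothetical inputs for a back-of-the-envelope producer rate estimate."""
    message_size_bytes: int        # average serialized message size
    request_size_bytes: int        # bytes carried by one produce request
    request_latency_s: float       # send-to-ack time, round trip time included
    in_flight_requests: int        # outstanding requests per connection
    producer_instances: int        # producer instances on this machine
    nic_bits_per_s: float          # NIC line rate in bits per second
    wire_efficiency: float = 0.9   # ASSUMED share of line rate left after protocol overhead


def per_producer_bytes_per_s(m: ProducerModel) -> float:
    # Latency bound for one connection: bytes in flight divided by time to ack them.
    return m.in_flight_requests * m.request_size_bytes / m.request_latency_s


def machine_bytes_per_s(m: ProducerModel) -> float:
    # The machine is limited by whichever is smaller: the sum of the
    # per-connection latency bounds, or the usable NIC bandwidth.
    latency_bound = m.producer_instances * per_producer_bytes_per_s(m)
    nic_bound = m.nic_bits_per_s / 8 * m.wire_efficiency
    return min(latency_bound, nic_bound)


def machine_messages_per_s(m: ProducerModel) -> float:
    # Convert the byte rate into a message rate using the average message size.
    return machine_bytes_per_s(m) / m.message_size_bytes


if __name__ == "__main__":
    # Illustrative numbers only; none of them are measurements.
    model = ProducerModel(
        message_size_bytes=1024,
        request_size_bytes=16384,
        request_latency_s=0.005,
        in_flight_requests=5,
        producer_instances=4,
        nic_bits_per_s=1e9,
    )
    print("bytes/s:", machine_bytes_per_s(model))
    print("messages/s:", machine_messages_per_s(model))
```

In this sketch, adding producer instances raises the estimate only until the NIC bound takes over, which is the saturation point question 2 asks about. What I am missing is how the kernel and driver variables listed above (socket write buffer, Tx queues, ring buffer) should enter such a formula.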
PS: I have already come across the following documents:
- https://www.confluent.io/blog/how-choose-number-topics-partitions-kafka-cluster/
- https://cwiki.apache.org/confluence/display/KAFKA/Performance+testing
- https://www.slideshare.net/JiangjieQin/producer-performance-tuning-for-apache-kafka-63147600

Regards,
Girija A.