Hi, We are in the process of deploying Kafka in our service. To decide the machine capacity plan, we arrived at the formula below for deriving the total machine capacity.
Total Broker machine size = Message size per second * Retention period * Replication Factor

Do I need to consider the topic and index files in the calculation? Please help/guide me if I am missing any parameter required in the formula.

Index file calculation (Reference <https://issues.apache.org/jira/browse/KAFKA-3300>)

Currently, the initial/max size of the offset index file is configured by log.index.max.bytes. This will be the offset index file size for the active log segment until it rolls out. Theoretically, we can calculate the upper bound of the offset index size using the following formula:

log.segment.bytes / index.interval.bytes * 8

With the default settings, the bytes needed for an offset index are 1GB / 4K * 8 = 2MB, and the default log.index.max.bytes is 10MB.

Retention Period (effective) = Retention period + (log.retention.check.interval.ms + log.segment.delete.delay.ms) / 1000
= 86400 + (300000 + 60000) / 1000
= 86400 + 360
= 86760 Seconds

Total Broker machine size = Message size per second * Retention period * Replication Factor
= 90 MB/Sec * 86760 * 3
= 23425200 MB
= 23.4252 TB

*Machine Configuration*
6 Brokers with 3 ZK

*Kafka (per machine)*
Disk Space - 2 * 2TB
RAM - 128 GB
CPU - 40 core

Thanks and Regards
Gowtham.S
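
P.S. To make the arithmetic above easy to re-check with other inputs, here is a rough Python sketch of the same formulas. The function names are my own and are not part of Kafka. The default arguments are assumptions: 300000 ms and 60000 ms for log.retention.check.interval.ms and log.segment.delete.delay.ms, 1GB segments, a 4K index interval, and 8 bytes per index entry. It only reproduces the estimate and does not account for filesystem overhead or free-space headroom.

```python
# Rough capacity estimate. All function names are illustrative, not Kafka APIs.

def effective_retention_seconds(retention_s,
                                check_interval_ms=300_000,
                                delete_delay_ms=60_000):
    """Retention plus the worst-case delay before a segment is removed from disk."""
    return retention_s + (check_interval_ms + delete_delay_ms) // 1000


def total_log_storage_mb(ingest_mb_per_s, retention_s, replication_factor):
    """Message size per second * retention period * replication factor."""
    return ingest_mb_per_s * retention_s * replication_factor


def offset_index_upper_bound_bytes(segment_bytes=1024 ** 3,
                                   index_interval_bytes=4096,
                                   entry_bytes=8):
    """Upper bound of one offset index file: segment / interval * entry size."""
    return segment_bytes // index_interval_bytes * entry_bytes


retention = effective_retention_seconds(86_400)   # one day of retention
total_mb = total_log_storage_mb(90, retention, 3)  # 90 MB/s, replication factor 3
index_bytes = offset_index_upper_bound_bytes()

print("effective retention (s):", retention)      # 86760
print("total log storage (MB):", total_mb)         # 23425200
print("offset index bound (bytes):", index_bytes)  # 2097152
```

With these inputs the index bound is about 2MB per 1GB segment, so the index files add only a small fraction on top of the log data in this estimate.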