We are testing Kafka's performance with real production data and plan to test the items listed below. Producers will publish and consumers will process production data on a separate non-prod Kafka cluster.
* Impact of the number of partitions per topic on producer and consumer throughput and latency
* Impact of scaling up brokers on throughput and latency
* Adding more brokers vs. adding more disk on existing brokers. How does the network interface usage differ?
* Cost of replication on throughput and latency
* Impact of broker vm.swappiness = 60 vs. vm.swappiness = 1
* Partitions on a broker pointing to a single disk vs. multiple disks
* EXT4 vs. XFS filesystem on the broker
* Behavior when broker "num.io.threads" is increased from 8 to a higher value
* Behavior when broker "num.network.threads" is increased from 3 to a higher value
* Behavior when the data segment size is increased from 1 GB (current setting)
* Producer "acks = 1" vs. "acks = all" (current setting): impact on throughput and latency
* Producer sending with compression enabled (snappy?) vs. sending without compression
* Setting producer batch size (memory based) vs. record count (current setting) per batch sent to Kafka
* Impact of message size on throughput
* Consumers fetching records from page cache vs. fetching records from disk

Ideally, the metrics we would like to compare for each test are the following (please let me know if anything else should be compared):

* Producer write throughput
* Producer write latency (ms)
* Consumption throughput
* Consumption latency (ms)
* End-to-end latency

What would be the right tools to collect these metrics and compare them across tests? I have set up kafka-monitor but couldn't find how to track throughput and latency with it. Kafka-web-console seems to have some of these available? Kafka-Manager? Burrow? Anything else?

Since we are going to use our own producers and consumers, I do not think it makes sense to use tools like kafka-consumer-perf-test.sh or kafka-producer-perf-test.sh, but please correct me if I am wrong.

Thanks,
Revin
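
P.S. To make the question more concrete: since we use our own producers and consumers, one fallback would be to have them log timestamps and compute the metrics ourselves for each test run. Below is a rough sketch of what I have in mind, not something we have running. The `Sample` fields and the `summarize` function are hypothetical names of my own, not part of any Kafka tool, and the sketch assumes the producer and consumer host clocks are synchronized (for example via NTP) so that the timestamp difference is meaningful.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass
class Sample:
    # Hypothetical fields: our producer would stamp send_ts_ms just before
    # sending, and our consumer would stamp recv_ts_ms after processing.
    send_ts_ms: float
    recv_ts_ms: float
    size_bytes: int


def summarize(samples):
    """Return throughput and end-to-end latency percentiles for one test run."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")

    # End-to-end latency per record (assumes synchronized clocks).
    latencies = sorted(s.recv_ts_ms - s.send_ts_ms for s in samples)

    # Wall-clock span of the run: first send to last receive.
    first_send = min(s.send_ts_ms for s in samples)
    last_recv = max(s.recv_ts_ms for s in samples)
    span_s = (last_recv - first_send) / 1000.0
    if span_s <= 0:
        raise ValueError("run span must be positive")

    total_bytes = sum(s.size_bytes for s in samples)

    # 99 cut points; index 49 is the median, index 98 the 99th percentile.
    cuts = quantiles(latencies, n=100, method="inclusive")

    return {
        "records_per_s": len(samples) / span_s,
        "mb_per_s": total_bytes / span_s / 1e6,
        "p50_ms": cuts[49],
        "p99_ms": cuts[98],
        "max_ms": latencies[-1],
    }
```

Running `summarize` once per test configuration (for example once with acks = 1 and once with acks = all) would give one row per test that could be compared side by side. If an existing tool already reports these numbers, I would much rather use that.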