Hi,

This isn't a big issue, but I'm wondering if anyone knows what is going on. I have been running performance benchmarks for a Kafka 0.9 consumer against a Kafka 0.9.0.0 (also tried 0.9.0.1) broker. At a message size of 5 KB the broker becomes the bottleneck and throughput becomes highly variable, with drops lasting 30-60 seconds. The problem appears to be related to disk I/O, since moving the message logs to /dev/shm/ (ramdisk) produced steady throughput. There is no apparent correlation between message log deletion and the throughput drops. Does anyone know what the broker might be doing that causes these short drops? The setup is described below, along with a sketch of the consumer loop and some result charts. Thanks!
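In case it helps, here is a minimal sketch of the kind of consumer throughput loop being measured (not the exact harness; the broker host, topic name, and group id are placeholders, and it simply reports throughput over 15-second windows like the charts below):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerThroughputTest {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker-host:9092");   // placeholder broker
        props.put("group.id", "perf-test");                   // placeholder group id
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("perf-topic"));  // placeholder topic

        long messages = 0;
        long bytes = 0;
        long windowStart = System.currentTimeMillis();
        while (true) {
            ConsumerRecords<byte[], byte[]> records = consumer.poll(100);
            for (ConsumerRecord<byte[], byte[]> record : records) {
                messages++;
                bytes += record.value().length;
            }
            long now = System.currentTimeMillis();
            if (now - windowStart >= 15000) {   // report every 15 s, matching the charts
                System.out.printf("%d msg/s, %.2f MB/s%n",
                        messages * 1000 / (now - windowStart),
                        bytes * 1000.0 / (now - windowStart) / (1024 * 1024));
                messages = 0;
                bytes = 0;
                windowStart = now;
            }
        }
    }
}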
Setup:
- Dedicated 8-core (Intel(R) Xeon(R) CPU X5570 @ 2.93GHz), RHEL 6, x86 hosts: 1 Kafka 0.9.0 broker server, 1 Kafka consumer host, 1 Kafka producer host. /tmp disk: Seagate Constellation ST9500530NS (500 GB, 7200 RPM, 32 MB cache, SATA 3.0 Gb/s, 2.5")
- Non-dedicated XIV Fibre Channel storage server for Kafka ZooKeeper. Its disks were commodity 7200 RPM drives with a non-volatile cache for fast storage.
- 1 Gb shared network switch
- Default server configurations, except that log retention was set to 1,073,741,824 bytes (see the server.properties sketch at the end of this message).

Chart 1: Throughput is highly variable when writing messages to /tmp.
Chart 2: Throughput is steady when writing to the ramdisk.
Chart 3: The drop, followed by the "catch-up" spike, recorded at 15-second increments with the producer throttled to 6k messages/sec; so the drops don't only happen when the broker is pushed to its limit.
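For reference, the non-default broker settings amount to roughly the following in server.properties (the log.dirs paths are placeholders; the commented-out line is what the ramdisk run pointed at):

# Everything else left at defaults
log.retention.bytes=1073741824

# Baseline run: message logs on the local /tmp disk
log.dirs=/tmp/kafka-logs

# Ramdisk run: message logs on /dev/shm, which gave steady throughput
#log.dirs=/dev/shm/kafka-logs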