Hi,

I am trying to understand how fast Kafka 0.7 is compared to what I can get from the hard drive. In essence, I have 3 questions.
In all tests below, I'm using a single broker with a single one-partition topic. The Kafka perf tests have been run in 2 deployment configs:
- broker, perf-test on same host
- broker, perf-test on different hosts (the results are practically the same, so I won't post them here)

I'm using FIO (http://freecode.com/projects/fio) to benchmark the speed of the hard drives.

Hardware I'm using:
1) m1.xlarge with ephemeral storage, 4 core cpu, 16 GB ram
2) hi1.4xlarge with SSD, 16 core cpu, 64 GB ram
3) desktop machine with 7200 rpm sata, 4 core cpu, 8 GB ram

Kafka broker config: Oracle jdk 1.6.0_38, -Xmx2048

    socket.send.buffer=16777216
    socket.receive.buffer=16777216
    max.socket.request.bytes=104857600
    log.flush.interval=10000
    log.default.flush.interval.ms=1000
    log.default.flush.scheduler.interval.ms=1000
    num.threads=[num of cores]

For kafka-producer-perf-test I'm assuming that the IO access pattern is sequential write. Here is the test I ran with FIO:

    [sequential-write]
    rw=write
    size=50G
    ioengine=sync
    numjobs=1
    directory=/tmp/fio
    filename=redo01.log

Here is the Kafka performance test:

    ./bin/kafka-producer-perf-test.sh -topic "perf" --batch-size 3000 --messages 50000000 --message-size 1300 --brokerinfo broker.list=0:host:9092 --threads [number-of-cores]

---------------------------------------------
|       | m1.xlarge | hi1.4xlarge | desktop |
---------------------------------------------
| kafka | 41 MB/s   | 217 MB/s    | 42 MB/s |
---------------------------------------------
| fio   | 106 MB/s  | 377 MB/s    | 74 MB/s |
---------------------------------------------

Question 1: The proportion (~1/2) is pretty stable across the different kinds of hardware I've tried. Is this as expected? Can something be done to improve it?
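To make the "~1/2" proportion concrete, here is a small sketch that computes the kafka/fio write ratio per machine from the table above. The numbers are copied from the table; the dictionary and function names are only illustrative.

```python
# Write throughput in MB/s, copied from the producer table above.
kafka_write = {"m1.xlarge": 41, "hi1.4xlarge": 217, "desktop": 42}
fio_write = {"m1.xlarge": 106, "hi1.4xlarge": 377, "desktop": 74}


def ratios(kafka, fio):
    """Return Kafka throughput as a fraction of fio throughput, per host."""
    return {host: kafka[host] / fio[host] for host in kafka}


write_ratio = ratios(kafka_write, fio_write)
for host, ratio in write_ratio.items():
    print(f"{host}: {ratio:.2f}")
# m1.xlarge: 0.39
# hi1.4xlarge: 0.58
# desktop: 0.57
```

So the producer reaches somewhere between about 0.4 and 0.6 of the raw sequential-write speed, depending on the machine.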
I've tried to play with:

    log.flush.interval=10000
    log.default.flush.interval.ms=1000
    log.default.flush.scheduler.interval.ms=1000

increasing them 10 times or decreasing them 10 times, but I haven't seen much of a difference in IO throughput.

The other thing that bugs me much more is that Kafka consumer speed on a cold IO cache is roughly 3-45 times slower than what I can get with the "sequential read" fio test (see the table below).

For kafka-consumer-perf-test I'm assuming that the IO access pattern is sequential read. Here is the FIO test:

    [sequential-read]
    rw=read
    size=50G
    ioengine=sync # I know that kafka uses sendfile, but sync should be slower, right?
    numjobs=1
    directory=/tmp/fio
    filename=redo01.log

Here is what I'm doing with kafka-consumer-perf-test:

    kafka-consumer-perf-test.sh -topic "perf" --messages 50000000 --zookeeper host:2181 --threads 1 --socket-buffer-size 16777216 --fetch-size 16777216

The broker config is the same. I'm dropping the IO cache before running the tests:

    echo 3 > /proc/sys/vm/drop_caches

-----------------------------------------------
|       | m1.xlarge | hi1.4xlarge   | desktop |
-----------------------------------------------
| kafka | 25 MB/s   | 10 MB/s (???) | 20 MB/s |
-----------------------------------------------
| fio   | 130 MB/s  | 450 MB/s      | 67 MB/s |
-----------------------------------------------

Question 2: Can something be done to improve consumer performance?

Question 3 (most important for me): What might be the reasons for the consumer to behave so badly on the fastest hardware available?
I see in iostat that the consumer really issues very few read requests to the hard drive:

    Device:  rrqm/s  wrqm/s     r/s   w/s    rkB/s  wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
    xvdb       0.00    0.00  144.00  0.00  6144.00   0.00     85.33      0.06   0.42     0.42     0.00   0.08   1.20

And the CPUs are idling:

    avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
               2.16   0.00     0.09     0.06    0.03  97.66

Besides that, even if the whole topic is in the IO cache, the consumer speed is about 45 MB/s, which is still quite a bit below my expectations. And the picture doesn't change between deployment configs (broker and test on the same node or on 2 different nodes).

Any ideas why this might happen?

Rafael Bagmanov,
Grid Dynamics.
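
P.S. A quick sanity check of the iostat line above, as a sketch. It only assumes that avgrq-sz is reported in 512-byte sectors, which is how iostat defines it; the variable names are illustrative.

```python
SECTOR_BYTES = 512  # iostat reports avgrq-sz in 512-byte sectors

# Values copied from the xvdb line of the iostat output above.
reads_per_sec = 144.00        # r/s
read_kb_per_sec = 6144.00     # rkB/s
avg_request_sectors = 85.33   # avgrq-sz

# Average size of one read request in kB.
avg_request_kb = avg_request_sectors * SECTOR_BYTES / 1024

# Read rate implied by request count and request size; it should
# agree with the rkB/s column if the line is internally consistent.
derived_kb_per_sec = reads_per_sec * avg_request_kb

read_mb_per_sec = read_kb_per_sec / 1024

print(f"avg request size: {avg_request_kb:.1f} kB")    # 42.7 kB
print(f"derived read rate: {derived_kb_per_sec:.0f} kB/s")  # 6144 kB/s
print(f"disk read rate: {read_mb_per_sec:.1f} MB/s")   # 6.0 MB/s
```

So during the consumer test the device is serving only about 6 MB/s in requests of roughly 43 kB each, at 1.2% utilization.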