Hi,
I am building a Kafka cluster and running the producer perf test to measure
Kafka's latency.
From the test results, I notice that the tail latency is very high and
increases over time, although the 99.9th percentile looks very good.
The worst latency can exceed 1 second. Disk utilization
is always very low, never more than 1%. I also tried lowering
log.flush.interval.ms from 1000 ms to 200 ms, but it did not help much.
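
To illustrate why the 99.9th percentile can look fine while the maximum does
not: with 200000 messages, the 99.9th percentile ignores the 200 slowest sends.
The Python sketch below uses synthetic numbers, not my measurements. The helper
name and the sample distribution are assumptions made up for illustration.

```python
def nearest_rank_per_mille(samples, per_mille):
    """Nearest-rank percentile, with the percentile given in per-mille
    (999 means 99.9%). Integer arithmetic avoids float rounding issues."""
    ordered = sorted(samples)
    n = len(ordered)
    rank = max(1, -(-per_mille * n // 1000))  # integer ceiling of per_mille*n/1000
    return ordered[rank - 1]

# Synthetic data (an assumption, not real results): 200000 sends,
# of which 100 are slow outliers at 1095 ms and the rest take 10 ms.
latencies_ms = [10] * 199_900 + [1095] * 100

p999 = nearest_rank_per_mille(latencies_ms, 999)
worst = max(latencies_ms)

# Number of samples that lie above the 99.9th percentile rank.
outside_p999 = len(latencies_ms) - (-(-999 * len(latencies_ms) // 1000))

print("p99.9 (ms):", p999)
print("max (ms):", worst)
print("samples ignored by p99.9:", outside_p999)
```

So a handful of very slow sends is enough to drive the max chart up without
moving the 99.9% figure at all.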

Below is the max latency chart. The Y axis shows the max latency in
milliseconds and the X axis shows the elapsed time in milliseconds. The
chart shows the latency gradually increasing from about 10 ms to 1095 ms.

[image: Inline image]

The Kafka cluster consists of 4 hosts. The version is 2.9.2-0.8.2-beta
(Scala 2.9.2, Kafka 0.8.2-beta). The PerfTopic15 topic was created with
3 partitions and a replication factor of 3.

Here is my perf script usage:
-bash-4.1$ bin/kafka-producer-perf-test.sh --broker-list <broker list> \
    --topics PerfTopic15 --sync --initial-message-id 1 --messages 200000 \
    --csv-reporter-enabled --metrics-dir /tmp/PerfTopic15_1 \
    --message-send-gap-ms 20 --request-num-acks -1 --batch-size 1

-bash-4.1$ bin/kafka-topics.sh --zookeeper <zkHost>:2181 --describe \
    --topic PerfTopic15
Topic:PerfTopic15 PartitionCount:3 ReplicationFactor:3 Configs:
Topic: PerfTopic15 Partition: 0 Leader: 3 Replicas: 3,4,1 Isr: 3,4,1
Topic: PerfTopic15 Partition: 1 Leader: 4 Replicas: 4,1,2 Isr: 4,1,2
Topic: PerfTopic15 Partition: 2 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3

I expected the worst-case latency not to exceed 100 milliseconds, so the test
result is very discouraging. Do you have any pointers on Kafka's long tail
latency issue?

I look forward to your reply. Thanks in advance!
