Yohan Sanchez created KAFKA-6827:
------------------------------------

             Summary: Messages stuck after broker's multiple restart in a row
                 Key: KAFKA-6827
                 URL: https://issues.apache.org/jira/browse/KAFKA-6827
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 1.1.0, 0.10.2.0
            Reporter: Yohan Sanchez
         Attachments: kafka1.log, kafka2.log, producer.prop
Hello :)

Tried with v0.10.2 and 1.1.0. I start with brand-new brokers containing no old data.

I created the topic test:
{code:java}
/usr/share/kafka/bin/kafka-topics.sh --zookeeper $ZOOKEEPER --create --topic test --partitions 1 --replication-factor 2 --config retention.ms=604800000 && /usr/share/kafka/bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test
Created topic "test".
Topic:test    PartitionCount:1    ReplicationFactor:2    Configs:retention.ms=604800000
    Topic: test    Partition: 0    Leader: 2    Replicas: 2,1    Isr: 2,1
{code}
Logs from the brokers are available in the attachments.

I start producing with a verifiable producer:
{code:java}
/usr/share/kafka/bin/kafka-verifiable-producer.sh --topic test --broker-list localhost:9091 --max-messages 1000000 --throughput 10000 --producer.config producer.prop --value-prefix 1 > /tmp/produce_result.txt
{code}
While production is running, I stop, start, stop and start broker 1:
{code:java}
/etc/init.d/kafka1 stop && /etc/init.d/kafka1 start && /etc/init.d/kafka1 stop && /etc/init.d/kafka1 start
{code}
There is no data loss on the producer side:
{code:java}
{"timestamp":1524670799034,"name":"producer_send_success","key":null,"value":"1.999999","topic":"test","partition":0,"offset":999999}
{"timestamp":1524670799040,"name":"shutdown_complete"}
{"timestamp":1524670799042,"name":"tool_data","sent":1000000,"acked":1000000,"target_throughput":10000,"avg_throughput":9988.413440409126}
{code}
I consume the messages with the simple consumer shell:
{code:java}
/usr/share/kafka/bin/kafka-simple-consumer-shell.sh --broker-list localhost:9091 --topic test --offset -2 2> /dev/null | grep -v "Reached" > /tmp/kafka_data_back.txt
{code}
I grep for values "1." in /tmp/kafka_data_back.txt:
{code:java}
Every 0.1s: grep "1\." /tmp/kafka_data_back.txt | wc -l        Wed Apr 25 17:48:46 2018

999937
{code}
I get only 999,937 messages instead of 1,000,000.
 * I can restart the consumer at any time; I still get 999,937.
 * Depending on the run, more or fewer messages get stuck.
 * I can restart kafka1, wait, then restart kafka2; the messages are still stuck.
 * I can produce more messages; this does not unblock the stuck messages until roughly 700 additional messages have been produced.
 * Disabling compression did not solve the problem.
 * acks=1 and acks=-1 give the same result.
 * Every run reproduces the problem, whether starting from a brand-new broker or not.

Can you help me understand why the messages are stuck?
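For what it's worth, below is a minimal sketch of how I would try to narrow down where consumption stalls, using the plain Java consumer instead of the simple consumer shell. It assumes the same broker at localhost:9091 and topic test, partition 0; the group id is arbitrary and auto-commit is disabled so the check does not interfere with anything else. It reads the partition from the beginning and prints the number of records consumed, the last offset it actually saw, and the log end offset reported by the broker; a gap between the last two should correspond to the messages missing from /tmp/kafka_data_back.txt.
{code:java}
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class StuckOffsetChecker {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9091");   // same broker as above (assumption)
        props.put("group.id", "stuck-offset-checker");       // arbitrary group id for this check
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        props.put("enable.auto.commit", "false");

        TopicPartition tp = new TopicPartition("test", 0);

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.assign(Collections.singletonList(tp));
            consumer.seekToBeginning(Collections.singletonList(tp));

            // Log end offset as reported by the broker.
            long endOffset = consumer.endOffsets(Collections.singletonList(tp)).get(tp);

            long count = 0;
            long lastOffset = -1;
            int idlePolls = 0;

            // Stop once the consumer makes no progress for 10 consecutive polls.
            while (idlePolls < 10) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                if (records.isEmpty()) {
                    idlePolls++;
                    continue;
                }
                idlePolls = 0;
                for (ConsumerRecord<String, String> record : records) {
                    count++;
                    lastOffset = record.offset();
                }
            }

            System.out.printf("records consumed: %d, last offset seen: %d, log end offset: %d%n",
                    count, lastOffset, endOffset);
        }
    }
}
{code}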