[ https://issues.apache.org/jira/browse/KAFKA-5678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106331#comment-16106331 ]
ASF GitHub Bot commented on KAFKA-5678:
---------------------------------------

GitHub user xiguantiaozhan opened a pull request:

    https://github.com/apache/kafka/pull/3597

    KAFKA-5678: When the broker graceful shutdown occurs, the producer side sends timeout.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xiguantiaozhan/kafka client-timeout-when-shuttingdwon-broker

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/3597.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3597

----
commit e4b45178693d77ecb783524b87029cc671bf8afc
Author: tuyang <tuy...@meituan.com>
Date:   2017-07-30T05:36:16Z

    improvement:When the broker graceful shutdown occurs, the producer side sends timeout.

----

> When the broker graceful shutdown occurs, the producer side sends timeout.
> --------------------------------------------------------------------------
>
>                 Key: KAFKA-5678
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5678
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 0.9.0.0, 0.10.0.0, 0.11.0.0
>            Reporter: tuyang
>
> Test environment:
> 1. Kafka version: 0.9.0.1
> 2. A cluster of 3 brokers with broker ids A, B, and C.
> 3. A topic with 6 partitions and 2 replicas each, with 2 leader partitions
> on each broker.
>
> The problem can be reproduced as follows:
> 1. Send messages as quickly as possible with acks=-1.
> 2. Suppose the leader of partition p0 is on broker A and we gracefully shut
> down broker A. If a message is sent to p0 before a new leader is elected,
> it is appended to the leader replica successfully. If the follower replica
> has not yet fetched it, the shutting-down broker creates a DelayedProduce
> for the request, which waits for completion until request.timeout.ms
> expires.
> 3. Because of the ControlledShutdown request from broker A, a new leader is
> elected for p0, and the replica on broker A becomes a follower before the
> broker has completely shut down. The DelayedProduce is then never triggered
> to complete; it only finishes when it expires.
> 4. If broker A takes too long to shut down, the producer receives its
> response only after request.timeout.ms, which increases producer send
> latency when brokers are restarted one by one.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)