[ 
https://issues.apache.org/jira/browse/KAFKA-3885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wateray updated KAFKA-3885:
---------------------------
    Description: 
This bug can be reproduced with the following steps.
The cluster has 2 brokers.
 a) Start a new producer and send messages; it works well.
 b) Kill one broker; the producer still works well.
 c) Restart that broker; the producer still works well.
 d) Kill the other broker; the producer cannot fail over.

The producer then prints the following log message indefinitely:
org.apache.kafka.common.errors.TimeoutException: Batch containing 1 record(s) 
expired due to timeout while requesting metadata from brokers for 
lwb_test_p50_r2-29
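
For anyone trying to reproduce this, the producer setup might look roughly like the sketch below. The report does not include the actual configuration, so the broker addresses, the serializer choice, and the class name FailoverReproConfig are assumptions for illustration only. The topic name lwb_test_p50_r2 is inferred from the partition name in the log above.

```java
import java.util.Properties;

/** Hypothetical helper; the reporter's real producer configuration is not shown in the issue. */
final class FailoverReproConfig {

    /** Builds producer properties that list every broker of the cluster as a bootstrap server. */
    static Properties producerProps(String bootstrapServers) {
        Properties props = new Properties();
        // Assumed addresses are passed in, e.g. "broker1:9092,broker2:9092".
        props.put("bootstrap.servers", bootstrapServers);
        // String serializers are an assumption; any serializer pair would do.
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    // With kafka-clients on the classpath, these properties would be passed to
    // new KafkaProducer<String, String>(props), and records would be sent in a
    // loop to the topic while the brokers are killed and restarted as in a) to d).
}
```

Listing both brokers in bootstrap.servers rules out the simplest explanation, namely that the producer only ever knew about one broker.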


================
When the producer sends a message, it detects that the metadata needs to be updated.
But in class NetworkClient, method leastLoadedNode, at this line:
List<Node> nodes = this.metadataUpdater.fetchNodes();

nodes contains only one entry, and that entry is the killed broker, so the 
producer cannot fail over!
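
The effect described above can be illustrated with a small, self-contained model. This is a simplified sketch, not the actual NetworkClient code: the class LeastLoadedNodeSketch, the method pickNode, and the broker names are hypothetical, and the real method also takes in-flight requests and connection state into account. The list knownNodes stands in for the result of metadataUpdater.fetchNodes().

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

/** Simplified, hypothetical model of node selection; not the actual Kafka source. */
final class LeastLoadedNodeSketch {

    /**
     * Returns the first known node that is currently reachable,
     * or null if no known node is reachable.
     */
    static String pickNode(List<String> knownNodes, Set<String> liveBrokers) {
        for (String node : knownNodes) {
            if (liveBrokers.contains(node)) {
                return node;
            }
        }
        return null; // no usable node, so no metadata request can be sent
    }

    public static void main(String[] args) {
        // After step d) only broker-2 is alive (names are placeholders).
        Set<String> live = Collections.singleton("broker-2:9092");

        // Case from the report: the cached node list holds only the killed broker.
        List<String> stale = Collections.singletonList("broker-1:9092");
        System.out.println("stale node list    -> " + pickNode(stale, live));

        // If the list still held both brokers, a live node could be chosen.
        List<String> full = Arrays.asList("broker-1:9092", "broker-2:9092");
        System.out.println("complete node list -> " + pickNode(full, live));
    }
}
```

In this model, selection can only succeed if the cached node list contains at least one live broker. That matches the reported symptom: with only the killed node in the list, every metadata request times out and the batch expires.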

  was:
This bug can be reproduced with the following steps.
The cluster has 2 brokers.
 a) Start a new producer and send a message; it works well.
 b) Kill one broker; the producer still works well.
 c) Restart that broker; the producer still works well.
 d) Kill the other broker; the producer cannot fail over.

The producer then prints the following log message indefinitely:
org.apache.kafka.common.errors.TimeoutException: Batch containing 1 record(s) 
expired due to timeout while requesting metadata from brokers for 
lwb_test_p50_r2-29


================
When the producer sends a message, it detects that the metadata needs to be updated.
But in class NetworkClient, method leastLoadedNode, at this line:
List<Node> nodes = this.metadataUpdater.fetchNodes();

nodes contains only one entry, and that entry is the killed broker, so the 
producer cannot fail over!

> Kafka new producer cannot failover
> ----------------------------------
>
>                 Key: KAFKA-3885
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3885
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 0.9.0.0, 0.8.2.2, 0.9.0.1, 0.10.0.0
>            Reporter: wateray
>
> This bug can be reproduced with the following steps.
> The cluster has 2 brokers.
>  a) Start a new producer and send messages; it works well.
>  b) Kill one broker; the producer still works well.
>  c) Restart that broker; the producer still works well.
>  d) Kill the other broker; the producer cannot fail over.
> The producer then prints the following log message indefinitely:
> org.apache.kafka.common.errors.TimeoutException: Batch containing 1 record(s) 
> expired due to timeout while requesting metadata from brokers for 
> lwb_test_p50_r2-29
> ================
> When the producer sends a message, it detects that the metadata needs to be updated.
> But in class NetworkClient, method leastLoadedNode, at this line:
> List<Node> nodes = this.metadataUpdater.fetchNodes();
> nodes contains only one entry, and that entry is the killed broker, so 
> the producer cannot fail over!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
