[ 
https://issues.apache.org/jira/browse/KAFKA-4089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sumant Tambe updated KAFKA-4089:
--------------------------------
    Description: 
The batch expiration logic ({{RecordAccumulator.abortExpiredBatches}}) ejects 
batches when the cluster metadata needs an update 
({{Metadata.timeToNextUpdate==0}}). In this case, no nodes are "ready" to 
receive data ({{result.readyNodes}} is empty). As a consequence, {{Sender.drain}} 
does not drain any batches, and therefore no new topic-partitions are 
muted. 

The batch expiration logic ({{RecordAccumulator.abortExpiredBatches}}) 
bypasses muted partitions only. As there are no new muted partitions, all 
batches, regardless of topic-partition, are subject to expiration. As a result, 
a group of batches expires if they linger in the queue for longer than 
{{requestTimeout}}.

Expiring batches unconditionally is a bug. It's too greedy. 

The current condition in {{abortExpiredBatches}} that bypasses muted partitions 
is necessary but not sufficient. It should additionally bypass partitions for 
which leader information is known and fresh. 

Conversely, it should expire batches only when both of the following are true:
# !muted AND
# metadata is fresh but the leader is not available 
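
To make the two conditions concrete, here is a minimal sketch of the current check next to the proposed one. It is a toy model and not the actual {{RecordAccumulator}} code: the class {{ExpirySketch}}, both method names, and the boolean parameters are hypothetical, and reading "metadata is fresh but leader not available" as {{metadataFresh && !leaderKnown}} is an assumption, as is keeping the existing timeout check in the proposed version.

```java
// Toy model of the expiration decision for a single queued batch.
// All names here are hypothetical; this is not Kafka's real API.
class ExpirySketch {

    // Behavior as described in this issue: only muted partitions are bypassed,
    // so any unmuted batch older than the request timeout expires.
    static boolean expiresCurrent(boolean muted, long ageMs, long requestTimeoutMs) {
        return !muted && ageMs > requestTimeoutMs;
    }

    // Proposed behavior (assumed reading): additionally require that the
    // metadata is fresh and still reports no leader for the partition.
    static boolean expiresProposed(boolean muted, boolean metadataFresh,
                                   boolean leaderKnown, long ageMs,
                                   long requestTimeoutMs) {
        boolean timedOut = ageMs > requestTimeoutMs;
        boolean leaderUnavailable = metadataFresh && !leaderKnown;
        return !muted && timedOut && leaderUnavailable;
    }

    public static void main(String[] args) {
        // Scenario from the description: metadata needs an update (not fresh),
        // nothing is muted, the leader is known, and the batch has waited
        // 40s against a 30s request timeout.
        System.out.println(expiresCurrent(false, 40_000L, 30_000L));
        System.out.println(expiresProposed(false, false, true, 40_000L, 30_000L));
    }
}
```

In the scenario in {{main}}, the current check expires the batch, while the proposed check keeps it because the metadata is stale rather than fresh with a missing leader.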

  was:
The batch expiration logic ({{RecordAccumualator.abortExpiredBatches}})  ejects 
batches out the cluster metadata needed an update 
({{Metadata.timeToNextUpdate==0}}). In this case, no nodes are "ready" to send 
data to ({{result.readyNodes}} is empty). As a consequence, {{Sender.drain}} 
does not drain any batch at all and therefore no new topic-partitions are 
muted. 

The batch expiration logic ({{RecordAccumualator.abortExpiredBatches}})  
bypasses muted partitions only. As there are no new muted partitions, all 
batches, regardless of topic-partition, are subject to expiration. As a result, 
a group of batches expire if they linger in the queue for longer than 
{{requestTimeout}}.

Expiring batches unconditionally is a bug. It's too greedy. 

The current condition in {{abortExpiredBatches}} that bypasses muted partitions 
is necessary but not sufficient. It should additionally bypass partitions for 
which leader information is known and fresh. 

Conversely, it should expire batches only when the following is true
# !muted AND
# meta-data is fresh but leader not available 


> KafkaProducer raises Batch Expired exception 
> ---------------------------------------------
>
>                 Key: KAFKA-4089
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4089
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 0.10.0.1
>            Reporter: Sumant Tambe
>            Assignee: Dong Lin
>
> The batch expiration logic ({{RecordAccumulator.abortExpiredBatches}}) 
> ejects batches when the cluster metadata needs an update 
> ({{Metadata.timeToNextUpdate==0}}). In this case, no nodes are "ready" to 
> receive data ({{result.readyNodes}} is empty). As a consequence, 
> {{Sender.drain}} does not drain any batches, and therefore no new 
> topic-partitions are muted. 
> The batch expiration logic ({{RecordAccumulator.abortExpiredBatches}}) 
> bypasses muted partitions only. As there are no new muted partitions, all 
> batches, regardless of topic-partition, are subject to expiration. As a 
> result, a group of batches expires if they linger in the queue for longer than 
> {{requestTimeout}}.
> Expiring batches unconditionally is a bug. It's too greedy. 
> The current condition in {{abortExpiredBatches}} that bypasses muted 
> partitions is necessary but not sufficient. It should additionally bypass 
> partitions for which leader information is known and fresh. 
> Conversely, it should expire batches only when both of the following are true:
> # !muted AND
> # metadata is fresh but the leader is not available 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
