[ 
https://issues.apache.org/jira/browse/KAFKA-901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654667#comment-13654667
 ] 

Joel Koshy commented on KAFKA-901:
----------------------------------

I haven't looked at the patch yet, but I went through the overview. An
alternative approach that we may want to consider is to maintain a metadata
cache at every broker. The cache can be kept consistent by having the
controller send a (new) update-metadata request to all brokers whenever it
sends out a leaderAndIsr request. A new request type would avoid the need to
"overload" the leaderAndIsr request.

Serving metadata from the per-broker caches would help avoid the herd effect
of multiple clients flooding the controller with metadata requests (although
these requests should return quickly with your patch).
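
To make the cache idea concrete, here is a rough sketch of how it might look.
It is illustrative only and is not taken from the patch or from Kafka's code:
the names MetadataCache, PartitionState, Controller and elect_leader are all
hypothetical, and the epoch check that discards stale updates is an assumption
added for the example. Brokers are modeled as in-process objects rather than
as network peers.

```python
import threading
from dataclasses import dataclass
from typing import Dict, List, Tuple

# (topic, partition) pair; a hypothetical key type for this sketch.
TopicPartition = Tuple[str, int]


@dataclass(frozen=True)
class PartitionState:
    """Leader and ISR for one partition, stamped with the controller epoch."""
    leader: int
    isr: Tuple[int, ...]
    controller_epoch: int


class MetadataCache:
    """Per-broker cache, written only by update-metadata requests."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._states: Dict[TopicPartition, PartitionState] = {}

    def apply_update(self, updates: Dict[TopicPartition, PartitionState]) -> None:
        # Assumption: an update carrying an older controller epoch is stale
        # and is ignored, so reordered requests cannot roll the cache back.
        with self._lock:
            for tp, new in updates.items():
                cur = self._states.get(tp)
                if cur is None or new.controller_epoch >= cur.controller_epoch:
                    self._states[tp] = new

    def lookup(self, topic: str) -> Dict[int, PartitionState]:
        # Metadata requests are answered from memory on the local broker.
        with self._lock:
            return {tp[1]: s for tp, s in self._states.items() if tp[0] == topic}


class Controller:
    """Pushes an update to every broker cache on each leadership change."""

    def __init__(self, broker_caches: List[MetadataCache]) -> None:
        self.broker_caches = broker_caches
        self.epoch = 0

    def elect_leader(self, tp: TopicPartition, leader: int, isr: List[int]) -> None:
        self.epoch += 1
        state = PartitionState(leader, tuple(isr), self.epoch)
        # The leaderAndIsr request to the partition's replicas would be sent
        # here; the separate update-metadata request goes to all brokers.
        for cache in self.broker_caches:
            cache.apply_update({tp: state})
```

With something along these lines, any broker can answer a metadata request
from its local cache, so a burst of client requests after a leader change does
not have to be served by a single node.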

                
> Kafka server can become unavailable if clients send several metadata requests
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-901
>                 URL: https://issues.apache.org/jira/browse/KAFKA-901
>             Project: Kafka
>          Issue Type: Bug
>          Components: replication
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>            Priority: Blocker
>         Attachments: metadata-request-improvement.patch
>
>
> Currently, if a broker is bounced without controlled shutdown and there are 
> several clients talking to the Kafka cluster, each of the clients realizes 
> that the leaders for some partitions are unavailable. This leads to several 
> metadata requests being sent to the Kafka brokers. Since metadata requests 
> are pretty slow, all the I/O threads quickly become busy serving them. This 
> leads to a full request queue, which stalls handling of finished responses 
> since the same network thread handles requests as well as responses. In this 
> situation, clients time out on metadata requests and send more metadata 
> requests. This quickly makes the Kafka cluster unavailable. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
