[ 
https://issues.apache.org/jira/browse/KAFKA-901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660835#comment-13660835
 ] 

Jun Rao commented on KAFKA-901:
-------------------------------

Thanks for patch v4. A few more comments:

40. KafkaController.ControllerContext: This is not introduced in this patch, 
but serveOrShuttingDownBrokerIds should just be liveBrokerIdUnderlying.

41.ControllerBrokerRequestBatch: Instead of maintaining aliveBrokers, could we 
just get it from controllerContext?

42. KafkaApis.handleTopicMetadataRequest:
42.1 We can rewrite the following statement
      val partitionMetadata = sortedPartitions.map { partitionReplicaMap => 
   to 
      val partitionMetadata = sortedPartitions.map { case(topicAndPartition, 
partitionState) =>
 Then, we don't have to redefine topicAndPartition and partitionState.
42.2 The val partitionReplicaAssignment seems unintuitive. Should we rename it 
to partitionStateInfo?
42.3 The outermost try/catch is unnecessary since it should be handled by the 
caller handle().
42.4 Not sure if we need to log the following error since it's either due to 
LeaderNotAvailable or ReplicaNotAvailable, both are expected.
  error("Error while fetching topic metadata for topic %s due to %s 
".format(topicMetadata.topic,

43. UpdateMetadataRequest: This class needs to define handleError(). This 
method is actually required to be defined in every request. So we should remove 
the empty body of handleError() in RequestOrResponse.

44. UpdateMetadataResponse: Do we really need the per partition level error 
code? It seems that a global error code is enough.

45. ConsumerFetcherManager: We should put the following statement under 
logger.isDebugEnabled().
          topicsMetadata.foreach(topicMetadata => 
debug(topicMetadata.toString()))

46. TopicMetadata.toString(): It only prints the leader. We need to print other 
fields in PartitionMetadata too.

47. BrokerPartitionInfo: If the metadata response has no error, it seems that 
we show throw an UnknownTopicOrPartitionException, instead of a KafkaException. 
Alternatively, should we not throw exception at all in this case since the 
caller already has to deal with the case when there is no metadata?

48. AsyncProducerTest.testInvalidPartition(): The message in the following 
statement is a bit missing leading. It's probably better to say sth like 
"Should not thrown any exception". Actually, instead of catching just 
UnknownTopicOrPartitionException, we should catch and fail any exception.
        case e: UnknownTopicOrPartitionException => fail("Should fail with 
UnknownTopicOrPartitionException")
                
> Kafka server can become unavailable if clients send several metadata requests
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-901
>                 URL: https://issues.apache.org/jira/browse/KAFKA-901
>             Project: Kafka
>          Issue Type: Bug
>          Components: replication
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>            Priority: Blocker
>         Attachments: kafka-901.patch, kafka-901-v2.patch, kafka-901-v4.patch, 
> metadata-request-improvement.patch
>
>
> Currently, if a broker is bounced without controlled shutdown and there are 
> several clients talking to the Kafka cluster, each of the clients realize the 
> unavailability of leaders for some partitions. This leads to several metadata 
> requests sent to the Kafka brokers. Since metadata requests are pretty slow, 
> all the I/O threads quickly become busy serving the metadata requests. This 
> leads to a full request queue, that stalls handling of finished responses 
> since the same network thread handles requests as well as responses. In this 
> situation, clients timeout on metadata requests and send more metadata 
> requests. This quickly makes the Kafka cluster unavailable. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to