[ https://issues.apache.org/jira/browse/KAFKA-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277807#comment-14277807 ]
Dana Powers commented on KAFKA-1649:
------------------------------------

I am testing only at the wire-protocol level, running a broker failure test with 2 brokers and 1 topic, with num.partitions=2 and default.replication.factor=2. The test sends 100 random messages directly to partition 0, kills the leader for partition 0, and then attempts to write messages to partition 0, with retries and metadata reloads.

Running the test against 0.8.2.0 returns the ReplicaNotAvailable error code in the PartitionMetadata, whereas 0.8.1.1 does not.

This is the metadata with both brokers up:

Topic metadata: [TopicMetadata(topic='test_switch_leader-qkUBJTZGLA', error=0, partitions=[
    PartitionMetadata(topic='test_switch_leader-qkUBJTZGLA', partition=1, leader=1, replicas=(1, 0), isr=(1, 0), error=0),
    PartitionMetadata(topic='test_switch_leader-qkUBJTZGLA', partition=0, leader=0, replicas=(0, 1), isr=(0, 1), error=0)])]

And this is the metadata after killing one broker (0.8.2.0):

Topic metadata: [TopicMetadata(topic='test_switch_leader-qkUBJTZGLA', error=0, partitions=[
    PartitionMetadata(topic='test_switch_leader-qkUBJTZGLA', partition=1, leader=1, replicas=(1,), isr=(1,), error=9),
    PartitionMetadata(topic='test_switch_leader-qkUBJTZGLA', partition=0, leader=1, replicas=(1,), isr=(1,), error=9)])]

The 0.8.1.1 output is slightly different and, significantly, shows no error in the PartitionMetadata.

Before killing partition 0 leader (broker 1):

Topic metadata: [TopicMetadata(topic='test_switch_leader-eMbCMlVrOC', error=0, partitions=[
    PartitionMetadata(topic='test_switch_leader-eMbCMlVrOC', partition=0, leader=1, replicas=(1, 0), isr=(1, 0), error=0),
    PartitionMetadata(topic='test_switch_leader-eMbCMlVrOC', partition=1, leader=0, replicas=(0, 1), isr=(0, 1), error=0)])]

After killing partition 0 leader:

Topic metadata: [TopicMetadata(topic='test_switch_leader-eMbCMlVrOC', error=0, partitions=[
    PartitionMetadata(topic='test_switch_leader-eMbCMlVrOC', partition=0, leader=0, replicas=(1, 0), isr=(0,), error=0),
    PartitionMetadata(topic='test_switch_leader-eMbCMlVrOC', partition=1, leader=0, replicas=(0, 1), isr=(0, 1), error=0)])]

> Protocol documentation does not indicate that ReplicaNotAvailable can be ignored
> --------------------------------------------------------------------------------
>
>                 Key: KAFKA-1649
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1649
>             Project: Kafka
>          Issue Type: Improvement
>          Components: website
>    Affects Versions: 0.8.1.1
>            Reporter: Hernan Rivas Inaka
>            Priority: Minor
>              Labels: protocol-documentation
>   Original Estimate: 10m
>  Remaining Estimate: 10m
>
> The protocol documentation here
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-ErrorCodes
> should indicate that error 9 (ReplicaNotAvailable) can be safely ignored on
> producers.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
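
For illustration only, the handling suggested in the issue description (a producer treating error 9 as ignorable when choosing a partition leader) might be sketched as follows. The namedtuples are hypothetical stand-ins shaped like the tuples printed in the comment, not any client's actual classes, and `producer_leaders` is an invented helper name. The `leader >= 0` check assumes the protocol's convention of reporting -1 when no leader is available.

```python
from collections import namedtuple

# Hypothetical stand-ins mirroring the field names printed in the comment.
TopicMetadata = namedtuple("TopicMetadata", ["topic", "error", "partitions"])
PartitionMetadata = namedtuple(
    "PartitionMetadata",
    ["topic", "partition", "leader", "replicas", "isr", "error"],
)

NO_ERROR = 0
REPLICA_NOT_AVAILABLE = 9  # error code 9, as named in the issue description


def producer_leaders(topic_metadata):
    """Map partition id -> leader broker id, tolerating ReplicaNotAvailable.

    Assumption: a producer only needs a live leader, so a partition that
    reports error 9 but still names a leader is treated as writable.
    """
    leaders = {}
    for p in topic_metadata.partitions:
        if p.error in (NO_ERROR, REPLICA_NOT_AVAILABLE) and p.leader >= 0:
            leaders[p.partition] = p.leader
    return leaders


# Metadata reported by 0.8.2.0 after one broker was killed (from the comment).
topic = "test_switch_leader-qkUBJTZGLA"
after_kill = TopicMetadata(topic, 0, [
    PartitionMetadata(topic, 1, 1, (1,), (1,), 9),
    PartitionMetadata(topic, 0, 1, (1,), (1,), 9),
])

print(producer_leaders(after_kill))
```

With the 0.8.2.0 metadata above, both partitions still resolve to broker 1 even though each carries error=9, which is the behavior a producer would need in order to keep writing after the failover.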