[ https://issues.apache.org/jira/browse/KAFKA-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15326383#comment-15326383 ]

Stevo Slavic edited comment on KAFKA-1452 at 6/12/16 11:32 AM:
---------------------------------------------------------------

This bug is still present in 0.10.0.0. I reproduced it by starting a clean 
cluster with 1 ZooKeeper node and 3 brokers, creating a single topic with 2 
partitions and a replication factor of 2, then stopping a non-controller broker 
and finally stopping the controller broker. The only remaining broker would 
become controller, but the partition that lost all of its replicas would still 
be labeled as having one remaining in-sync replica - the dead initial 
controller broker.
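
For reference, the stale metadata is also visible from a client. Below is a 
minimal sketch using the 0.10 Java producer's partitionsFor() call; the broker 
address and topic name are placeholders for this setup:

    import java.util.Arrays;
    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.common.PartitionInfo;

    public class PartitionMetadataCheck {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Point at the one broker that is still running.
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.ByteArraySerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.ByteArraySerializer");

            try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
                // "test-topic" stands in for the 2-partition topic described above.
                for (PartitionInfo info : producer.partitionsFor("test-topic")) {
                    System.out.printf("partition %d leader=%s isr=%s%n",
                            info.partition(), info.leader(),
                            Arrays.toString(info.inSyncReplicas()));
                }
            }
        }
    }

In the broken state described above, one partition keeps reporting the dead 
former controller as its leader and sole in-sync replica.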

Things are fine if the non-controller broker that is stopped is one that is in 
the partition's replica assignment together with another non-controller broker. 
So the problem affects only partitions whose replica set includes the 
controller, and only when the controller is the last replica to be stopped. One 
does not even need to kill brokers to reproduce the issue; a controlled stop is 
enough.

I wish one could choose which brokers in the cluster are controller-only and 
which are data-only (see related KAFKA-2310).


> Killing last replica for partition doesn't change ISR/Leadership if replica 
> is running controller
> -------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1452
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1452
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.8.1.1
>            Reporter: Alexander Demidko
>            Assignee: Neha Narkhede
>
> Kafka version is 0.8.1.1. We have three machines: A, B, C. Let's say there is 
> a topic with replication factor 2 and one of its partitions - partition 1 - is 
> placed on brokers A and B. If broker A is already down, then for partition 1 
> we have: Leader: B, ISR: [B]. If the current controller is node C, then 
> killing broker B will turn partition 1 into the state: Leader: -1, ISR: []. 
> But if the current controller is node B, then killing it won't update the 
> leadership/ISR for partition 1 even after the controller is restarted on node 
> C, so partition 1 will forever think its leader is node B, which is dead.
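>
> For illustration, the stuck leader/ISR can also be read straight from the 
> partition state znode in ZooKeeper. A minimal sketch - the ZooKeeper address, 
> topic name and partition number are placeholders for this scenario:
>
>     import org.apache.zookeeper.ZooKeeper;
>
>     public class PartitionStateCheck {
>         public static void main(String[] args) throws Exception {
>             // Connect to the ZooKeeper ensemble used by the cluster.
>             ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, null);
>             try {
>                 // Kafka keeps per-partition leader/ISR state as JSON here.
>                 String path = "/brokers/topics/my-topic/partitions/1/state";
>                 byte[] data = zk.getData(path, false, null);
>                 // Fields include controller_epoch, leader, leader_epoch, isr.
>                 System.out.println(new String(data, "UTF-8"));
>             } finally {
>                 zk.close();
>             }
>         }
>     }
>
> With controller B killed last, the "leader" and "isr" fields keep pointing at 
> the dead node B.
>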
> It looks like KafkaController.onBrokerFailure handles the situation when the 
> broker that went down is the partition leader - it sets the new leader value 
> to -1. In contrast, KafkaController.onControllerFailover never removes the 
> leader from a partition whose replicas are all offline - allegedly because the 
> partition gets into the ReplicaDeletionIneligible state. Is this intended 
> behavior?
> This behavior affects DefaultEventHandler.getPartition in the null key case - 
> it can't determine that partition 1 has no leader, and this results in event 
> send failures.
> What we are trying to achieve is to be able to write data even if some 
> partitions have lost all replicas, which is a rare yet still possible 
> scenario. Using a null key looked suitable with minor DefaultEventHandler 
> modifications (like getting rid of DefaultEventHandler.sendPartitionPerTopicCache 
> to avoid caching and uneven event distribution), as we neither use log 
> compaction nor rely on partitioning of the data. We had such behavior with 
> Kafka 0.7 - if a node is down, simply produce to a different one.
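>
> To make the intended usage concrete, this is roughly how we produce without a 
> key through the old producer, so that partitioning falls through to 
> DefaultEventHandler.getPartition. A minimal sketch - the broker list and topic 
> name are placeholders:
>
>     import java.util.Properties;
>
>     import kafka.javaapi.producer.Producer;
>     import kafka.producer.KeyedMessage;
>     import kafka.producer.ProducerConfig;
>
>     public class NullKeyProducer {
>         public static void main(String[] args) {
>             Properties props = new Properties();
>             props.put("metadata.broker.list", "a:9092,b:9092,c:9092");
>             props.put("serializer.class", "kafka.serializer.StringEncoder");
>
>             Producer<String, String> producer =
>                     new Producer<>(new ProducerConfig(props));
>             try {
>                 // No key: DefaultEventHandler.getPartition picks the partition,
>                 // which is where the missing-leader information matters.
>                 producer.send(new KeyedMessage<String, String>("my-topic", "some event"));
>             } finally {
>                 producer.close();
>             }
>         }
>     }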


