[ https://issues.apache.org/jira/browse/KAFKA-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15105587#comment-15105587 ]
Flavio Junqueira commented on KAFKA-3083:
-----------------------------------------

[~junrao]

bq. does controller A just resume from where it's left off? Or does it ignore all outstanding events and re-read all subscribed ZK paths (since there could be missing events between the connection loss event and the SyncConnected event)?

I don't see a reason for ignoring outstanding events and re-reading ZK state. If the session hasn't expired, then the broker is still the controller, and I'd say it is safe to assume that no other controller work happened in parallel.

bq. ZkClient actually hides the ZK ConnectionLoss event and only informs the application when the ZK session expires. To pursue this, we will have to access ZK directly.

I think further down you noted that ZkClient actually exposes the connection loss event, but does put a thread in the middle.

> a soft failure in controller may leave a topic partition in an inconsistent state
> ---------------------------------------------------------------------------------
>
>                 Key: KAFKA-3083
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3083
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.9.0.0
>            Reporter: Jun Rao
>            Assignee: Mayuresh Gharat
>
> The following sequence can happen.
> 1. Broker A is the controller and is in the middle of processing a broker
> change event. As part of this process, let's say it's about to shrink the isr
> of a partition.
> 2. Then broker A's session expires and broker B takes over as the new
> controller. Broker B sends the initial leaderAndIsr request to all brokers.
> 3. Broker A continues by shrinking the isr of the partition in ZK and sends
> the new leaderAndIsr request to the broker (say C) that leads the partition.
> Broker C will reject this leaderAndIsr since the request comes from a
> controller with an older epoch. Now we could be in a situation that Broker C
> thinks the isr has all replicas, but the isr stored in ZK is different.
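
To make steps 2 and 3 of the issue description concrete, here is a minimal Java sketch of the epoch check and of how the two views of the isr can diverge. It is a toy model, not Kafka code: the class `EpochFencingSketch`, the nested `Broker` type, the method `handleLeaderAndIsr`, and the local variable `zkIsr` (standing in for the isr stored in ZK) are all hypothetical names introduced for illustration. The sketch also assumes the stale controller's ZK write succeeds, as the description says.

```java
import java.util.List;

public class EpochFencingSketch {

    /** Hypothetical stand-in for broker C's view of a single partition. */
    static final class Broker {
        private int lastSeenControllerEpoch = 0;
        private List<Integer> isr = List.of();

        /** Applies the request unless it comes from an older controller epoch. */
        boolean handleLeaderAndIsr(int controllerEpoch, List<Integer> newIsr) {
            if (controllerEpoch < lastSeenControllerEpoch) {
                return false; // stale controller: reject, keep current isr
            }
            lastSeenControllerEpoch = controllerEpoch;
            isr = List.copyOf(newIsr);
            return true;
        }

        List<Integer> isr() {
            return isr;
        }
    }

    public static void main(String[] args) {
        Broker c = new Broker();

        // Step 2: new controller B (epoch 2) sends the initial leaderAndIsr.
        List<Integer> zkIsr = List.of(1, 2, 3); // assumed isr stored in ZK
        c.handleLeaderAndIsr(2, zkIsr);

        // Step 3: old controller A (epoch 1) shrinks the isr in ZK
        // (assumed to succeed here), then sends its leaderAndIsr to C.
        zkIsr = List.of(1, 2);
        boolean accepted = c.handleLeaderAndIsr(1, zkIsr);

        // C rejected the request, so its isr no longer matches ZK.
        System.out.println("accepted=" + accepted
                + " brokerIsr=" + c.isr() + " zkIsr=" + zkIsr);
    }
}
```

In this model the epoch check on the broker behaves as intended; the inconsistency comes from the ZK write in step 3, which is not guarded by the same check.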
-- This message was sent by Atlassian JIRA (v6.3.4#6332)