[ 
https://issues.apache.org/jira/browse/KAFKA-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110748#comment-15110748
 ] 

Flavio Junqueira commented on KAFKA-3083:
-----------------------------------------

Sure, we need to transform all operations to look like what we currently have 
in ZKCheckedEphemeral. That particular class is a bit special because it 
performs additional checks, but essentially we need to change the current 
calls in ZkUtils into asynchronous calls made directly on the ZK handle, each 
paired with a callback class.
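
To make the call/callback pairing concrete, here is a rough sketch. It does 
not use the real ZooKeeper client: FakeHandle, CreateCallback and CreateResult 
are hypothetical names, and the in-memory map only stands in for the ZK handle 
so the example runs on its own. The two result code values are chosen to match 
ZooKeeper's OK and NODEEXISTS codes.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Result codes used by this sketch (values match ZooKeeper's OK / NODEEXISTS).
final class Rc {
    static final int OK = 0;
    static final int NODE_EXISTS = -110;
    private Rc() {}
}

// Hypothetical callback shape for an asynchronous create.
interface CreateCallback {
    void processResult(int rc, String path);
}

// Hypothetical in-memory stand-in for the ZK handle (assumption, not the real API).
final class FakeHandle {
    private final Map<String, byte[]> nodes = new ConcurrentHashMap<>();

    // Returns immediately; the callback fires later on another thread.
    void create(String path, byte[] data, CreateCallback cb) {
        CompletableFuture.runAsync(() -> {
            boolean created = nodes.putIfAbsent(path, data) == null;
            cb.processResult(created ? Rc.OK : Rc.NODE_EXISTS, path);
        });
    }
}

// The callback class that pairs up with the create call and holds its outcome.
final class CreateResult implements CreateCallback {
    final CompletableFuture<Integer> outcome = new CompletableFuture<>();

    @Override
    public void processResult(int rc, String path) {
        outcome.complete(rc);
    }
}
```

A caller would issue zk.create(path, data, result) and later inspect 
result.outcome, either by blocking on it or by chaining further work.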

Related to this issue, we will also need to implement session management, but 
this time it can't try to be transparent the way ZkClient is. It is good to 
have a central point to get the current ZK handle from, but we need to give 
the broker the ability to signal when to create a new session. As part of 
this signaling, we will need to implement some kind of listener to propagate 
events. Another option is to let the broker implement a Watcher directly to 
process event notifications.
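
A minimal sketch of the listener option, under stated assumptions: 
SessionManager and SessionListener are hypothetical names, and the handle is 
produced by a caller-supplied factory rather than the real ZooKeeper 
constructor. The point it illustrates is that expiry is only reported to 
listeners, and a new session is created only when the broker asks for one.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Supplier;

// Hypothetical listener the broker registers to hear about session events.
interface SessionListener {
    void onExpired();
    void onNewSession();
}

// Hypothetical central point for the current handle; H stands in for the ZK handle type.
final class SessionManager<H> {
    private final Supplier<H> handleFactory;
    private final List<SessionListener> listeners = new CopyOnWriteArrayList<>();
    private volatile H handle;

    SessionManager(Supplier<H> handleFactory) {
        this.handleFactory = handleFactory;
        this.handle = handleFactory.get();
    }

    H currentHandle() {
        return handle;
    }

    void addListener(SessionListener listener) {
        listeners.add(listener);
    }

    // Invoked when expiry is observed. Deliberately does NOT reconnect:
    // it only propagates the event so the broker can decide what to do.
    void sessionExpired() {
        for (SessionListener l : listeners) {
            l.onExpired();
        }
    }

    // Invoked by the broker once it is ready to start a new session.
    synchronized void renewSession() {
        handle = handleFactory.get();
        for (SessionListener l : listeners) {
            l.onNewSession();
        }
    }
}
```

With this shape, a controller could stop acting as controller in onExpired() 
before anything creates a new session on its behalf.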

One simple way to start is to gradually replace the calls in ZkUtils with 
asynchronous calls, still using the handle ZkUtils provides. The calls would 
block to maintain the current behavior outside ZkUtils. Once that's done, we 
can make the calls non-blocking and do the necessary changes across 
broker/controller. Finally, we can replace the session management with our 
own.
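
The first step (asynchronous call inside, blocking behavior outside) could 
look roughly like the sketch below. AsyncReader and BlockingReads are 
hypothetical names standing in for the ZK handle and a ZkUtils-style helper; 
the callback signature and the convention that a result code of 0 means 
success are assumptions for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BiConsumer;

// Hypothetical asynchronous read: the callback receives (result code, data).
interface AsyncReader {
    void getData(String path, BiConsumer<Integer, byte[]> callback);
}

// Hypothetical helper that keeps the old blocking contract for callers.
final class BlockingReads {
    private BlockingReads() {}

    static byte[] readBlocking(AsyncReader reader, String path, long timeoutMs) {
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<Integer> rc = new AtomicReference<>();
        AtomicReference<byte[]> data = new AtomicReference<>();

        // Issue the asynchronous call; the callback records the result.
        reader.getData(path, (code, bytes) -> {
            rc.set(code);
            data.set(bytes);
            done.countDown();
        });

        // Block here so behavior outside the helper is unchanged.
        try {
            if (!done.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("timed out reading " + path);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted reading " + path, e);
        }
        if (rc.get() != 0) {
            throw new IllegalStateException("read of " + path + " failed, rc=" + rc.get());
        }
        return data.get();
    }
}
```

Making the calls non-blocking later would then mostly mean removing the latch 
and letting callers supply the callback themselves.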

If you guys want to do this, then we should probably create an umbrella JIRA.

> a soft failure in controller may leave a topic partition in an inconsistent 
> state
> ---------------------------------------------------------------------------------
>
>                 Key: KAFKA-3083
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3083
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.9.0.0
>            Reporter: Jun Rao
>            Assignee: Mayuresh Gharat
>
> The following sequence can happen.
> 1. Broker A is the controller and is in the middle of processing a broker 
> change event. As part of this process, let's say it's about to shrink the isr 
> of a partition.
> 2. Then broker A's session expires and broker B takes over as the new 
> controller. Broker B sends the initial leaderAndIsr request to all brokers.
> 3. Broker A continues by shrinking the isr of the partition in ZK and sends 
> the new leaderAndIsr request to the broker (say C) that leads the partition. 
> Broker C will reject this leaderAndIsr since the request comes from a 
> controller with an older epoch. Now we could be in a situation that Broker C 
> thinks the isr has all replicas, but the isr stored in ZK is different.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
