[ 
https://issues.apache.org/jira/browse/KAFKA-911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-911:
--------------------------------

    Attachment: kafka-911-v1.patch

The root cause of this bug was a race condition while using the 
ControllerBrokerRequestBatch that assumes synchronization at the caller. This 
patch synchronizes access to the ControllerBrokerRequestBatch while sending out 
the StopReplicaRequest. 

While working on this and testing the patch, I noticed some inefficiency with 
the controller shutdown API. When we move the leader for a partition one at a 
time, we also remove the broker being shutdown from the ISR. This involves at 
least one read and one write per partition to zookeeper, sometimes more. 
Besides being slow, this operation is not effective. This is because the broker 
being shutdown still has alive fetchers to the new leader before it receives 
and acts on the StopReplicaRequest. So the new leader adds it back to the ISR 
anyways. 

Then I thought I could move it to the end of the controlled shutdown API where 
we check if all partitions are successfully moved, then send StopReplicaRequest 
to the broker being shut down. Right after this, we could move the replica to 
Offline and remove it from ISR as part of that operation. Even if we can invoke 
this operation as a batch, it will still be slow since the zookeeper 
reads/writes will happen serially. Also, I realized this is not required as 
well for 2 reasons -

1. Since the shutting broker is sent the StopReplicaRequest, it will stop the 
fetcher and fall out of ISR. Until then, as long as the controller doesn't 
failover, it will not be elected as the leader, even if it is in the ISR, since 
it is one of the shuttingDownBrokers.

2. Even if the controller fails over, by the time the new controller has 
initialized and is ready to serve, the StopReplicaRequest would've ensured that 
the shutdown broker is no longer in the ISR. And until the controller fails 
over, there cannot be any leader election anyway.

I've tested this patch on a 7 node cluster that continuously gets rolling 
bounced and has ~100 producers sending production data to it with ~1000 
consumers consuming data from it. 
                
> Bug in controlled shutdown logic in controller leads to controller not 
> sending out some state change request 
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-911
>                 URL: https://issues.apache.org/jira/browse/KAFKA-911
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>            Priority: Blocker
>              Labels: kafka-0.8, p1
>         Attachments: kafka-911-v1.patch
>
>
> The controlled shutdown logic in the controller first tries to move the 
> leaders from the broker being shutdown. Then it tries to remove the broker 
> from the isr list. During that operation, it does not synchronize on the 
> controllerLock. This causes a race condition while dispatching data using the 
> controller's channel manager.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to