[ 
https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736988#comment-13736988
 ] 

Jun Rao commented on KAFKA-992:
-------------------------------

Thanks for patch v8. I think the code can still be made cleaner.

80. If you look at the code in ZkUtils.registerBrokerInZk(), 
ZookeeperConsumerConnector.registerConsumerInZK() and 
ZookeeperLeaderElector.elect(), they all have logic for handling the ZK 
bug. They differ only slightly, in how they check whether the existing 
registration is from the same client. I was thinking that we can 
write a new util function called something like 
createEphemeralPathExpectConflictHandleZKBug(). This function will take a 
function that checks whether the value in a ZK path is from the caller. It 
will then keep trying to create the path until either it detects that the 
value was put in by a different caller or the creation succeeds. We will get 
several benefits from doing that: (1) there is a centralized place to handle 
the ZK bug and therefore we avoid code duplication; (2) this separates the 
logic of handling the ZK bug from the rest of the logic in the caller, which 
will make the latter easier to understand; (3) it makes it easier to remove 
the logic in the future when the ZK bug is fixed.

81. In ZookeeperLeaderElector.elect(), we also have logic to handle 
different formats of the value of the controller path. It seems that this can 
probably be simplified a bit too. Basically, if we read the old format (in the 
new code), we can treat it as if someone else already did the registration.

82. There is code duplication in ZkUtils.getController() and 
ZookeeperLeaderElector.LeaderChangeListener.handleDataChange(). Could we share 
the logic in a separate util?
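
A minimal sketch of the helper proposed in 80, written in Python for brevity rather than in Kafka's Scala. FakeZk, NodeExistsError, the parameter names and the backoff hook are all hypothetical stand-ins chosen for illustration; none of them is the ZkClient or ZkUtils API. The only thing the sketch tries to show is the control flow: retry while the existing value is the caller's own, fail as soon as the value belongs to someone else.

```python
import time
from typing import Callable, Dict, Optional


class NodeExistsError(Exception):
    """Raised by the stand-in store when the path is already present."""


class FakeZk:
    """Hypothetical in-memory stand-in for a ZooKeeper client (not a real API)."""

    def __init__(self) -> None:
        self.nodes: Dict[str, str] = {}

    def create_ephemeral(self, path: str, data: str) -> None:
        if path in self.nodes:
            raise NodeExistsError(path)
        self.nodes[path] = data

    def read(self, path: str) -> Optional[str]:
        return self.nodes.get(path)


def create_ephemeral_path_expect_conflict_handle_zk_bug(
    zk: FakeZk,
    path: str,
    data: str,
    is_written_by_caller: Callable[[str], bool],
    backoff: Callable[[], None] = lambda: time.sleep(1.0),
) -> None:
    """Create `path`, retrying while a stale node written by the caller lingers.

    Returns once the create succeeds. Re-raises NodeExistsError if the
    existing value was written by a different caller (a genuine conflict).
    """
    while True:
        try:
            zk.create_ephemeral(path, data)
            return
        except NodeExistsError:
            stored = zk.read(path)
            if stored is None:
                # The node disappeared between the create and the read; retry now.
                continue
            if not is_written_by_caller(stored):
                # Someone else owns the path: surface the conflict to the caller.
                raise
            # The value is our own, presumably left over from the expired
            # session. Wait for ZooKeeper to delete it, then try again.
            backoff()
```

Under this shape, each of the three call sites would pass only its own ownership check (for the broker, something like comparing the stored host and port with its own), and the retry loop would live in one place.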
                
> Double Check on Broker Registration to Avoid False NodeExist Exception
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-992
>                 URL: https://issues.apache.org/jira/browse/KAFKA-992
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Neha Narkhede
>            Assignee: Guozhang Wang
>         Attachments: KAFKA-992.v1.patch, KAFKA-992.v2.patch, 
> KAFKA-992.v3.patch, KAFKA-992.v4.patch, KAFKA-992.v5.patch, 
> KAFKA-992.v6.patch, KAFKA-992.v7.patch, KAFKA-992.v8.patch
>
>
> The current behavior of zookeeper for ephemeral nodes is that session 
> expiration and ephemeral node deletion are not an atomic operation. 
> The side-effect of the above zookeeper behavior in Kafka, for certain corner 
> cases, is that ephemeral nodes can be lost even if the session is not 
> expired. The sequence of events that can lead to lost ephemeral nodes is as 
> follows:
> 1. The session expires on the client. The client assumes the ephemeral nodes 
> are deleted, so it establishes a new session with zookeeper and tries to 
> re-create the ephemeral nodes. 
> 2. However, when it tries to re-create the ephemeral node, zookeeper throws 
> back a NodeExists error code. This is legitimate during a session 
> disconnect event (since zkclient automatically retries the operation and 
> raises a NodeExists error). Also, by design, the Kafka server doesn't have 
> multiple zookeeper clients create the same ephemeral node, so it assumes the 
> NodeExists error is normal. 
> 3. However, after a few seconds zookeeper deletes that ephemeral node. So 
> from the client's perspective, even though the client has a new valid 
> session, its ephemeral node is gone.
> This behavior is triggered by very long fsync operations on the zookeeper 
> leader. When the leader wakes up from such a long fsync operation, it has 
> several sessions to expire, and the time between the session expiration and 
> the ephemeral node deletion is magnified. Between these 2 operations, a 
> zookeeper client can issue an ephemeral node creation operation that appears 
> to have succeeded, but the leader later deletes the ephemeral node, 
> leading to permanent ephemeral node loss from the client's perspective. 
> Thread from zookeeper mailing list: 
> http://zookeeper.markmail.org/search/?q=Zookeeper+3.3.4#query:Zookeeper%203.3.4%20date%3A201307%20+page:1+mid:zma242a2qgp6gxvx+state:results

