[ https://issues.apache.org/jira/browse/KAFKA-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Neha Narkhede updated KAFKA-1155:
---------------------------------
    Affects Version/s: 0.8.2.0

> Kafka server can miss zookeeper watches during long zkclient callbacks
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-1155
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1155
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.8.0, 0.8.1, 0.8.2.0
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>            Priority: Critical
>              Labels: newbie++
>
> On getting a zookeeper watch, zkclient invokes the blocking user callback and
> only re-registers the watch after the callback returns. This leaves a
> possibly large window of time when Kafka has not registered for watches on
> the desired zookeeper paths and hence can miss important state changes (on
> the controller). In any case, it is worth noting that even though zookeeper
> has a read-and-set-watch API, there can always be a window of time between
> the watch being fired, the callback and the read-and-set-watch API call. Due
> to the zkclient wrapper, it is difficult to handle this properly in the Kafka
> code unless we directly use the zookeeper client. One way of getting around
> this issue is to use timestamps on the paths and when a watch fires, check if
> the timestamp in zk is different from the one in the callback handler.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
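
The timestamp check suggested at the end of the description can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Kafka or zkclient code: `FakeZk`, `Handler`, `run` and the `/controller` path are hypothetical names, the "timestamp" is a simple write counter, and everything runs single-threaded in memory. It models a one-shot watch that is re-registered only after the callback returns, then compares the timestamp read at re-registration with the last one the handler processed.

```python
from typing import Callable, Dict, List, Optional, Tuple

Watcher = Callable[[str], None]


class FakeZk:
    """Hypothetical in-memory stand-in for a store with one-shot watches."""

    def __init__(self) -> None:
        self._nodes: Dict[str, Tuple[str, int]] = {}  # path -> (data, timestamp)
        self._watches: Dict[str, Watcher] = {}
        self._clock = 0  # write counter used as the "timestamp"

    def write(self, path: str, data: str) -> None:
        self._clock += 1
        self._nodes[path] = (data, self._clock)
        # One-shot semantics: the watch is removed when it fires.
        watcher = self._watches.pop(path, None)
        if watcher is not None:
            watcher(path)

    def read(self, path: str, watcher: Optional[Watcher] = None) -> Tuple[str, int]:
        # Passing a watcher models a read-and-set-watch call.
        if watcher is not None:
            self._watches[path] = watcher
        return self._nodes[path]


class Handler:
    """Callback handler that remembers the last timestamp it processed."""

    def __init__(self, zk: FakeZk, path: str, check_timestamp: bool) -> None:
        self.zk, self.path, self.check_timestamp = zk, path, check_timestamp
        self.slow_work: Optional[Callable[[], None]] = None
        data, self.last_ts = zk.read(path, self.on_watch)
        self.seen: List[str] = [data]

    def on_watch(self, path: str) -> None:
        # No watch is registered while this callback runs.
        data, ts = self.zk.read(path)
        self.seen.append(data)
        self.last_ts = ts
        work, self.slow_work = self.slow_work, None
        if work is not None:
            work()  # long callback: writes made here fire no watch
        # Re-register only after the callback body, as described above.
        data, ts = self.zk.read(path, self.on_watch)
        if self.check_timestamp and ts != self.last_ts:
            # The node changed while no watch was set; catch up now.
            self.seen.append(data)
            self.last_ts = ts


def run(check_timestamp: bool) -> List[str]:
    zk = FakeZk()
    zk.write("/controller", "v0")
    handler = Handler(zk, "/controller", check_timestamp)
    # Simulate another writer updating the node during the long callback.
    handler.slow_work = lambda: zk.write("/controller", "v2")
    zk.write("/controller", "v1")
    return handler.seen
```

Without the check, `run(False)` never observes the write made during the callback; with it, `run(True)` picks that write up when the watch is re-registered. Several writes inside the window would still collapse into the latest value, so this approach suits state that can be re-read in full rather than a log of individual changes.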