[ https://issues.apache.org/jira/browse/KAFKA-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404901#comment-15404901 ]
Jun Rao commented on KAFKA-3042:
--------------------------------

[~onurkaraman], there are a couple of things.

1. Currently, when a broker starts up, it expects the very first LeaderAndIsrRequest to contain all the partitions hosted on this broker. After that, we read the last checkpointed high watermark and start the high watermark checkpoint thread. If we combine UpdateMetadataRequest and LeaderAndIsrRequest, the very first request that a broker receives could be an UpdateMetadataRequest that includes partitions not hosted on this broker. Then, we may checkpoint high watermarks for the wrong partitions.

2. Currently, LeaderAndIsrRequest is used to inform replicas about the new leader and is only sent to brokers storing the partition. UpdateMetadataRequest is used for updating the metadata cache for the clients and is sent to every broker. Technically, they serve different purposes, so using separate requests makes logical sense.

We could use a single request to do both. I am not sure whether this would make things clearer or more confusing from a debugging perspective. In any case, doing this will require significant code changes. I am not opposed to that. I just think that if we want to do that, we probably want to think through how to improve the controller logic holistically, since there are other known pain points in the controller.

> updateIsr should stop after failed several times due to zkVersion issue
> -----------------------------------------------------------------------
>
> Key: KAFKA-3042
> URL: https://issues.apache.org/jira/browse/KAFKA-3042
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.8.2.1
> Environment: jdk 1.7
>              centos 6.4
> Reporter: Jiahongchao
> Fix For: 0.10.1.0
>
> Attachments: controller.log, server.log.2016-03-23-01, state-change.log
>
>
> Sometimes one broker may repeatedly log
> "Cached zkVersion 54 not equal to that in zookeeper, skip updating ISR"
> I think this is because the broker considers itself the leader when in fact it is a follower.
> So after several failed tries, it needs to find out who the leader is.
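
The behaviour requested in the issue description can be illustrated with a toy model. This is only a sketch under stated assumptions, not Kafka's actual code: `FakeZnode`, `update_isr`, and `MAX_FAILED_ATTEMPTS` are hypothetical names, and the in-memory object stands in for the partition state stored in ZooKeeper. It shows a versioned (compare-and-set) ISR write that stops retrying after a few version mismatches and re-reads who the leader is.

```python
from dataclasses import dataclass


@dataclass
class FakeZnode:
    """Toy stand-in for the partition state in ZooKeeper (hypothetical, not a real client)."""
    leader: int
    isr: list
    version: int = 0

    def conditional_update(self, new_isr, expected_version):
        """Write only if the caller's cached version matches the stored one."""
        if expected_version != self.version:
            return False
        self.isr = list(new_isr)
        self.version += 1
        return True


MAX_FAILED_ATTEMPTS = 3  # hypothetical threshold, not a Kafka setting


def update_isr(znode, broker_id, cached_version, new_isr):
    """Try a versioned ISR write; after repeated mismatches, re-read the state."""
    for attempt in range(1, MAX_FAILED_ATTEMPTS + 1):
        if znode.conditional_update(new_isr, cached_version):
            return "updated"
        print(f"Cached zkVersion {cached_version} not equal to that in zookeeper, "
              f"skip updating ISR (attempt {attempt})")
    # Stop retrying with a stale version and find out who the leader is.
    if znode.leader != broker_id:
        return f"resigned: leader is broker {znode.leader}"
    return "refreshed"
```

For example, with `FakeZnode(leader=2, isr=[1, 2], version=55)`, calling `update_isr(znode, broker_id=1, cached_version=54, new_isr=[1])` logs the mismatch three times, leaves the stored ISR untouched, and reports that broker 2 is the leader instead of logging forever.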