[ 
https://issues.apache.org/jira/browse/KAFKA-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232535#comment-15232535
 ] 

Flavio Junqueira edited comment on KAFKA-3042 at 4/9/16 1:09 PM:
-----------------------------------------------------------------

Here is what I've been able to find out so far based on the logs that 
[~wushujames] posted: 

# The "Cached zkVersion..." messages are printed in {{Partition.updateIsr}}. 
The {{updateIsr}} calls in the logs seem to come from 
{{Partition.maybeShrinkIsr}}.
# {{Partition.maybeShrinkIsr}} is called from 
{{ReplicaManager.maybeShrinkIsr}}, which is called periodically according to 
the schedule call in {{ReplicaManager.startup}}. This is the main reason we see 
those messages coming up periodically.
# In {{Partition.maybeShrinkIsr}}, the ISR is only updated if the leader 
replica is the broker itself, which is determined by the variable 
{{leaderReplicaIdOpt}}.
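
The three points above can be put together in a small model. This is only a simplified Python sketch, not Kafka's Scala code: {{FakeZk}}, its {{conditional_update}} method and the field layout of {{Partition}} are hypothetical, and only the names {{maybe_shrink_isr}}, {{update_isr}} and {{leader_replica_id_opt}} mirror the identifiers mentioned in this comment. It assumes the ISR write is a compare-and-set on a version number, which is what the "Cached zkVersion..." message suggests.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class FakeZk:
    """Hypothetical stand-in for the partition state znode in ZooKeeper."""
    isr: Set[int]
    version: int = 0

    def conditional_update(self, new_isr: Set[int], expected_version: int) -> Tuple[bool, int]:
        # The write succeeds only if the caller's version matches the stored one.
        if expected_version != self.version:
            return False, self.version
        self.isr = set(new_isr)
        self.version += 1
        return True, self.version


@dataclass
class Partition:
    """Simplified model of one broker's view of a partition (an assumption)."""
    local_broker_id: int
    zk: FakeZk
    leader_replica_id_opt: Optional[int]
    isr: Set[int]
    zk_version: int  # version cached by the broker
    log: List[str] = field(default_factory=list)

    def maybe_shrink_isr(self, out_of_sync: Set[int]) -> None:
        # Only a broker that believes it is the leader tries to change the ISR.
        if self.leader_replica_id_opt != self.local_broker_id:
            return
        new_isr = self.isr - out_of_sync
        if new_isr != self.isr:
            self.update_isr(new_isr)

    def update_isr(self, new_isr: Set[int]) -> None:
        ok, new_version = self.zk.conditional_update(new_isr, self.zk_version)
        if ok:
            self.isr = set(new_isr)
            self.zk_version = new_version
        else:
            self.log.append(
                f"Cached zkVersion {self.zk_version} not equal to that in zookeeper, skip updating ISR"
            )


zk = FakeZk(isr={1, 2, 3}, version=54)
leader = Partition(local_broker_id=1, zk=zk, leader_replica_id_opt=1,
                   isr={1, 2, 3}, zk_version=54)
leader.maybe_shrink_isr(out_of_sync={3})    # leader with a matching version: write goes through

follower = Partition(local_broker_id=2, zk=zk, leader_replica_id_opt=1,
                     isr={1, 2, 3}, zk_version=54)
follower.maybe_shrink_isr(out_of_sync={3})  # not the leader: returns without touching ZooKeeper
```

In this model the message can only be logged by a broker whose {{leader_replica_id_opt}} equals its own id, which is why that variable is the interesting one.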

It looks like {{leaderReplicaIdOpt}} isn't being updated correctly, and it is 
possible that it is due to a race with either the controllers or the execution 
of {{LeaderAndIsr}} requests.
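
To illustrate the suspected failure mode, here is a minimal Python sketch. It is built on assumptions, not on the Kafka code: the dictionaries standing in for the broker and the znode, and the helpers {{conditional_write}} and {{run_ticks}}, are all hypothetical. It also assumes the worst case in which the broker never applies the leadership change at all, while the comment above only says that a race is possible.

```python
def conditional_write(znode: dict, new_isr: set, expected_version: int) -> bool:
    """Hypothetical compare-and-set on a znode modeled as a dict."""
    if znode["version"] != expected_version:
        return False
    znode["isr"] = set(new_isr)
    znode["version"] += 1
    return True


def run_ticks(broker: dict, znode: dict, ticks: int) -> list:
    """Run the periodic ISR-shrink check `ticks` times and return the logged messages."""
    messages = []
    for _ in range(ticks):
        if broker["leader_replica_id_opt"] != broker["id"]:
            continue  # a broker that sees itself as a follower never touches the ISR
        new_isr = broker["isr"] - broker["out_of_sync"]
        if new_isr == broker["isr"]:
            continue
        if conditional_write(znode, new_isr, broker["zk_version"]):
            broker["isr"] = new_isr
            broker["zk_version"] = znode["version"]
        else:
            messages.append(
                f"Cached zkVersion {broker['zk_version']} not equal to that in zookeeper, skip updating ISR"
            )
    return messages


znode = {"leader": 1, "isr": {1, 2, 3}, "version": 54}
broker1 = {"id": 1, "leader_replica_id_opt": 1, "isr": {1, 2, 3},
           "zk_version": 54, "out_of_sync": {2, 3}}

# The controller moves leadership to broker 2, which bumps the znode version.
# Assumption: broker 1 never applies the corresponding LeaderAndIsr request.
znode.update(leader=2, isr={2, 3})
znode["version"] += 1

stale_messages = run_ticks(broker1, znode, ticks=3)  # one failure per tick, no recovery

# Once broker 1 learns that it is a follower, the periodic check becomes a no-op.
broker1["leader_replica_id_opt"] = znode["leader"]
after_messages = run_ticks(broker1, znode, ticks=3)
```

Nothing in the failing path refreshes the cached version or the leader id, so in this model the same message repeats on every tick until the broker's view of the leader is corrected.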




> updateIsr should stop after failed several times due to zkVersion issue
> -----------------------------------------------------------------------
>
>                 Key: KAFKA-3042
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3042
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8.2.1
>         Environment: jdk 1.7
> centos 6.4
>            Reporter: Jiahongchao
>         Attachments: controller.log, server.log.2016-03-23-01, 
> state-change.log
>
>
> Sometimes a broker may repeatedly log
> "Cached zkVersion 54 not equal to that in zookeeper, skip updating ISR"
> I think this is because the broker considers itself the leader when in fact it is 
> a follower.
> So after several failed tries, it needs to find out who the leader is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
