I think we can certainly improve the logging, but this is normal when the leader tries to shrink or expand the ISR while the controller is in the middle of shrinking the ISR or electing a new leader for the same partition.
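For context on why the write fails: the leader's ISR update goes through ZkUtils.conditionalUpdatePersistentPath, which is effectively a compare-and-set on the partition's state znode keyed off the zkVersion the broker cached when it last read it. If the controller writes the same znode first, the version in ZooKeeper moves ahead and the broker's conditional write is rejected with BadVersion. A minimal sketch of that primitive using the plain ZooKeeper client API (the connection string and payload are made up for illustration; this is not Kafka's actual code):

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class ConditionalIsrUpdateSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection string; the path is the one from the log below.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> { });
        String path = "/brokers/topics/our.own.topic.two/partitions/2/state";
        try {
            // Step 1: the leader reads the partition state and caches the znode version.
            Stat stat = new Stat();
            zk.getData(path, false, stat);
            int cachedVersion = stat.getVersion();  // e.g. the "expected version 10" in the error

            // Step 2: later it writes the shrunken ISR, conditioned on that cached version.
            byte[] newState =
                "{\"controller_epoch\":81,\"isr\":[1],\"leader\":1,\"leader_epoch\":0,\"version\":1}"
                    .getBytes("UTF-8");
            zk.setData(path, newState, cachedVersion);
        } catch (KeeperException.BadVersionException e) {
            // If the controller updated the same znode between steps 1 and 2 (ISR shrink or
            // leader election), its version in ZooKeeper is now higher than the cached one,
            // so the conditional write is rejected -- the BadVersionException in the stack
            // trace below.
        } finally {
            zk.close();
        }
    }
}

The broker handles this by skipping the update and re-reading the znode to refresh its cached zkVersion, which is consistent with the "skip updating ISR" INFO line in your log.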
Could you please file a JIRA to improve the quality of logging in this case?

Thanks,
Neha

On Fri, Aug 23, 2013 at 10:28 AM, Yu, Libo <libo...@citi.com> wrote:
> Hi team,
>
> During normal operation, all of a sudden, we found many exceptions in the
> log like this:
>
> It seems one thread's ZooKeeper data is written unexpectedly by some
> other thread.
> Any expertise will be appreciated.
>
> [2013-08-23 13:17:00,622] INFO Partition [our.own.topic.one.default,0] on broker 1: Cached zkVersion [4] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
> [2013-08-23 13:17:00,622] INFO Partition [our.own.topic.two.default,2] on broker 1: Shrinking ISR for partition [our.own.topic.two,2] from 1,2,3 to 1 (kafka.cluster.Partition)
> [2013-08-23 13:17:00,623] ERROR Conditional update of path /brokers/topics/our.own.topic.two/partitions/2/state with data { "controller_epoch":81, "isr":[ 1 ], "leader":1, "leader_epoch": 0, "version":1 } and expected version 10 failed (kafka.utils.ZkUtils$)
> org.I0Itec.zkclient.exception.ZkBadVersionException: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/our.own.topic.two/partitions/2/state
>         at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:51)
>         at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
>         at org.I0Itec.zkclient.ZkClient.writeData(ZkClient.java:809)
>         at kafka.utils.ZkUtils$.conditionalUpdatePersistentPath(ZkUtils.scala:330)
>         at kafka.cluster.Partition.updateIsr(Partition.scala:347)
>         at kafka.cluster.Partition.maybeShrinkIsr(Partition.scala:291)
>         at kafka.server.ReplicaManager$$anonfun$kafka$server$ReplicaManager$$maybeShrinkIsr$2.apply(ReplicaManager.scala:285)
>         at kafka.server.ReplicaManager$$anonfun$kafka$server$ReplicaManager$$maybeShrinkIsr$2.apply(ReplicaManager.scala:285)
>         at scala.collection.mutable.HashSet.foreach(HashSet.scala:61)
>         at kafka.server.ReplicaManager.kafka$server$ReplicaManager$$maybeShrinkIsr(ReplicaManager.scala:285)
>         at kafka.server.ReplicaManager$$anonfun$startup$1.apply$mcV$sp(ReplicaManager.scala:108)
>         at kafka.utils.Utils$$anon$2.run(Utils.scala:67)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:722)
> Caused by: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/our.own.topic.two/partitions/2/state
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:106)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1044)
>         at org.I0Itec.zkclient.ZkConnection.writeData(ZkConnection.java:111)
>         at org.I0Itec.zkclient.ZkClient$10.call(ZkClient.java:813)
>         at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
>         ... 18 more
>
> Regards,
>
> Libo