Hi team,

During normal operation, we suddenly started seeing many exceptions like the ones below in the log.
It seems that one thread's ZooKeeper data is being overwritten unexpectedly by another thread. Any insight would be appreciated.

[2013-08-23 13:17:00,622] INFO Partition [our.own.topic.one.default,0] on broker 1: Cached zkVersion [4] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2013-08-23 13:17:00,622] INFO Partition [our.own.topic.two.default,2] on broker 1: Shrinking ISR for partition [our.own.topic.two,2] from 1,2,3 to 1 (kafka.cluster.Partition)
[2013-08-23 13:17:00,623] ERROR Conditional update of path /brokers/topics/our.own.topic.two/partitions/2/state with data { "controller_epoch":81, "isr":[ 1 ], "leader":1, "leader_epoch": 0, "version":1 } and expected version 10 failed (kafka.utils.ZkUtils$)
org.I0Itec.zkclient.exception.ZkBadVersionException: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/our.own.topic.two/partitions/2/state
	at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:51)
	at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
	at org.I0Itec.zkclient.ZkClient.writeData(ZkClient.java:809)
	at kafka.utils.ZkUtils$.conditionalUpdatePersistentPath(ZkUtils.scala:330)
	at kafka.cluster.Partition.updateIsr(Partition.scala:347)
	at kafka.cluster.Partition.maybeShrinkIsr(Partition.scala:291)
	at kafka.server.ReplicaManager$$anonfun$kafka$server$ReplicaManager$$maybeShrinkIsr$2.apply(ReplicaManager.scala:285)
	at kafka.server.ReplicaManager$$anonfun$kafka$server$ReplicaManager$$maybeShrinkIsr$2.apply(ReplicaManager.scala:285)
	at scala.collection.mutable.HashSet.foreach(HashSet.scala:61)
	at kafka.server.ReplicaManager.kafka$server$ReplicaManager$$maybeShrinkIsr(ReplicaManager.scala:285)
	at kafka.server.ReplicaManager$$anonfun$startup$1.apply$mcV$sp(ReplicaManager.scala:108)
	at kafka.utils.Utils$$anon$2.run(Utils.scala:67)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/our.own.topic.two/partitions/2/state
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:106)
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
	at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1044)
	at org.I0Itec.zkclient.ZkConnection.writeData(ZkConnection.java:111)
	at org.I0Itec.zkclient.ZkClient$10.call(ZkClient.java:813)
	at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
	... 18 more

Regards,
Libo