@tao xiao and Jiangjie Qin,

Thank you very much. I tried running kafka-reassign-partitions.sh, but the issue still exists…
This is the log info:

[2015-03-11 11:00:40,086] ERROR Conditional update of path /brokers/topics/ad_click_sts/partitions/4/state with data {"controller_epoch":23,"leader":1,"version":1,"leader_epoch":35,"isr":[1,0]} and expected version 564 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/ad_click_sts/partitions/4/state (kafka.utils.ZkUtils$)

[2015-03-11 11:00:40,086] INFO Partition [ad_click_sts,4] on broker 1: Cached zkVersion [564] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)

In the end I had to restart the Kafka node, which fixed the ISR problem. Is there a better way?

Regards
sy.pan

> On March 11, 2015, at 03:34, Jiangjie Qin <j...@linkedin.com.INVALID> wrote:
>
> This looks like a leader broker somehow did not respond to a fetch request
> from the follower. It may be because the broker was too busy. If that is
> the case, Xiao's approach could help - reassign partitions or reelect
> leaders to balance the traffic among brokers.
>
> Jiangjie (Becket) Qin
>
> On 3/9/15, 8:31 PM, "sy.pan" <shengyi....@gmail.com> wrote:
>
>> Hi, tao xiao and Jiangjie Qin
>>
>> I have run into the same issue. My node had recovered from a high-load
>> problem (caused by another application).
>>
>> This is the kafka-topics output:
>>
>> Topic:ad_click_sts  PartitionCount:6  ReplicationFactor:2  Configs:
>>   Topic: ad_click_sts  Partition: 0  Leader: 1  Replicas: 1,0  Isr: 1
>>   Topic: ad_click_sts  Partition: 1  Leader: 0  Replicas: 0,1  Isr: 0
>>   Topic: ad_click_sts  Partition: 2  Leader: 1  Replicas: 1,0  Isr: 1
>>   Topic: ad_click_sts  Partition: 3  Leader: 0  Replicas: 0,1  Isr: 0
>>   Topic: ad_click_sts  Partition: 4  Leader: 1  Replicas: 1,0  Isr: 1
>>   Topic: ad_click_sts  Partition: 5  Leader: 0  Replicas: 0,1  Isr: 0
>>
>> ReplicaFetcherThread info extracted from the Kafka server.log:
>>
>> [2015-03-09 21:06:05,450] ERROR [ReplicaFetcherThread-0-0], Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 7331; ClientId: ReplicaFetcherThread-0-0; ReplicaId: 1; MaxWait: 500 ms; MinBytes: 1 bytes; RequestInfo: [ad_click_sts,5] -> PartitionFetchInfo(6149699,1048576),[ad_click_sts,3] -> PartitionFetchInfo(6147835,1048576),[ad_click_sts,1] -> PartitionFetchInfo(6235071,1048576) (kafka.server.ReplicaFetcherThread)
>> java.net.SocketTimeoutException
>>     at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:201)
>>     at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:86)
>>     ...
>>     at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:108)
>>     at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)
>>     at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)
>>     at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>>     at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:107)
>>     at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:96)
>>     at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:88)
>>     at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
>>
>> [2015-03-09 21:06:05,450] WARN Reconnect due to socket error: null (kafka.consumer.SimpleConsumer)
>>
>> [2015-03-09 21:05:57,116] INFO Partition [ad_click_sts,4] on broker 1: Cached zkVersion [556] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
>>
>> [2015-03-09 21:06:05,772] INFO Partition [ad_click_sts,2] on broker 1: Shrinking ISR for partition [ad_click_sts,2] from 1,0 to 1 (kafka.cluster.Partition)
>>
>>
>> How can I fix this ISR problem? Is there a command I can run?
>>
>> Regards
>> sy.pan
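
For anyone checking a cluster for the same symptom, here is a minimal Python sketch that scans describe output laid out like the listing quoted above and reports each partition whose Isr is missing one of its replicas. The function name `under_replicated` and the regular expression are illustrative assumptions, not part of any Kafka tool, and the parsing assumes the exact "Topic: ... Partition: ... Leader: ... Replicas: ... Isr: ..." layout shown in this thread.

```python
import re

# Sample input copied from the describe output quoted in this thread.
DESCRIBE_OUTPUT = """\
Topic:ad_click_sts PartitionCount:6 ReplicationFactor:2 Configs:
Topic: ad_click_sts Partition: 0 Leader: 1 Replicas: 1,0 Isr: 1
Topic: ad_click_sts Partition: 1 Leader: 0 Replicas: 0,1 Isr: 0
Topic: ad_click_sts Partition: 2 Leader: 1 Replicas: 1,0 Isr: 1
Topic: ad_click_sts Partition: 3 Leader: 0 Replicas: 0,1 Isr: 0
Topic: ad_click_sts Partition: 4 Leader: 1 Replicas: 1,0 Isr: 1
Topic: ad_click_sts Partition: 5 Leader: 0 Replicas: 0,1 Isr: 0
"""

# Matches per-partition lines only; the summary line ("PartitionCount:")
# does not contain the literal "Partition:" field and is skipped.
LINE_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def under_replicated(describe_output):
    """Return (topic, partition, missing_broker_ids) for every partition
    whose Isr does not contain all of its assigned replicas."""
    result = []
    for line in describe_output.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue
        replicas = {int(x) for x in match.group("replicas").split(",")}
        isr = {int(x) for x in match.group("isr").split(",") if x}
        missing = sorted(replicas - isr)
        if missing:
            result.append(
                (match.group("topic"), int(match.group("partition")), missing)
            )
    return result


if __name__ == "__main__":
    for topic, partition, missing in under_replicated(DESCRIBE_OUTPUT):
        print(f"{topic}-{partition}: brokers missing from Isr: {missing}")
```

On the listing above this flags all six partitions: each leader has only itself in the Isr, which matches the "Shrinking ISR ... from 1,0 to 1" log entry.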