It looks like the leader broker did not respond to a fetch request from the follower, possibly because the broker was too busy. If that is the case, Xiao's approach could help: reassign partitions or re-elect leaders to balance the traffic among brokers.
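To see which partitions are affected before rebalancing, one option is to compare the Replicas and Isr columns of the topic description. The sketch below is only an illustration, assuming the output format quoted in this thread; the helper name `under_replicated` and the regular expression are my own and not part of Kafka.

```python
import re

# Matches one partition line of the topic description, e.g.
#   Topic: ad_click_sts Partition: 0 Leader: 1 Replicas: 1,0 Isr: 1
# The summary line ("PartitionCount:...") does not match and is skipped.
PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def under_replicated(describe_output):
    """Return (topic, partition, missing_broker_ids) for every partition
    whose ISR is smaller than its assigned replica list."""
    results = []
    for match in PARTITION_LINE.finditer(describe_output):
        replicas = [int(b) for b in match.group("replicas").split(",")]
        isr = {int(b) for b in match.group("isr").split(",") if b}
        missing = [b for b in replicas if b not in isr]
        if missing:
            results.append(
                (match.group("topic"), int(match.group("partition")), missing)
            )
    return results


# Two partitions taken from the output quoted in this thread.
SAMPLE = """Topic:ad_click_sts PartitionCount:6 ReplicationFactor:2 Configs:
 Topic: ad_click_sts Partition: 0 Leader: 1 Replicas: 1,0 Isr: 1
 Topic: ad_click_sts Partition: 1 Leader: 0 Replicas: 0,1 Isr: 0
"""

if __name__ == "__main__":
    for topic, partition, missing in under_replicated(SAMPLE):
        print(f"{topic}-{partition}: brokers {missing} are out of the ISR")
```

In the quoted output every partition is missing exactly one follower, which fits the picture of each broker's fetcher timing out against the other.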

Jiangjie (Becket) Qin

On 3/9/15, 8:31 PM, "sy.pan" <shengyi....@gmail.com> wrote:

>Hi, tao xiao and Jiangjie Qin
>
>I encounter with the same issue, my node had recovered from high load
>problem (caused by other application)
>
>this is the kafka-topic show:
>
>Topic:ad_click_sts PartitionCount:6 ReplicationFactor:2 Configs:
>    Topic: ad_click_sts Partition: 0 Leader: 1 Replicas: 1,0 Isr: 1
>    Topic: ad_click_sts Partition: 1 Leader: 0 Replicas: 0,1 Isr: 0
>    Topic: ad_click_sts Partition: 2 Leader: 1 Replicas: 1,0 Isr: 1
>    Topic: ad_click_sts Partition: 3 Leader: 0 Replicas: 0,1 Isr: 0
>    Topic: ad_click_sts Partition: 4 Leader: 1 Replicas: 1,0 Isr: 1
>    Topic: ad_click_sts Partition: 5 Leader: 0 Replicas: 0,1 Isr: 0
>
>ReplicaFetcherThread info extracted from kafka server.log :
>
>[2015-03-09 21:06:05,450] ERROR [ReplicaFetcherThread-0-0], Error in
>fetch Name: FetchRequest; Version: 0; CorrelationId: 7331; ClientId:
>ReplicaFetcherThread-0-0; ReplicaId: 1; MaxWait: 500 ms; MinBytes: 1
>bytes; RequestInfo: [ad_click_sts,5] ->
>PartitionFetchInfo(6149699,1048576),[ad_click_sts,3] ->
>PartitionFetchInfo(6147835,1048576),[ad_click_sts,1] ->
>PartitionFetchInfo(6235071,1048576) (kafka.server.ReplicaFetcherThread)
>java.net.SocketTimeoutException
>    at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:201)
>    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:86)
>    ...
>    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:108)
>    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)
>    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)
>    at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>    at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:107)
>    at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:96)
>    at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:88)
>    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
>
>[2015-03-09 21:06:05,450] WARN Reconnect due to socket error: null
>(kafka.consumer.SimpleConsumer)
>
>[2015-03-09 21:05:57,116] INFO Partition [ad_click_sts,4] on broker 1:
>Cached zkVersion [556] not equal to that in zookeeper, skip updating ISR
>(kafka.cluster.Partition)
>
>[2015-03-09 21:06:05,772] INFO Partition [ad_click_sts,2] on broker 1:
>Shrinking ISR for partition [ad_click_sts,2] from 1,0 to 1
>(kafka.cluster.Partition)
>
>
>How to fix this Isr problem ? Is there some command can be run ?
>
>Regards
>sy.pan