Could you check the state-change log on the follower replica and see whether it received the corresponding LeaderAndIsr request? If so, could you also check the max-lag JMX metric (see http://kafka.apache.org/documentation.html) on the follower to see what the lag is?
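If it helps, here is a minimal sketch for reading that metric over JMX. It assumes the broker was started with a JMX port (e.g. JMX_PORT=9999) and that the MBean name below is the 0.8.x one; the class name and port are just examples, so please verify the exact ObjectName against your broker's MBean list first.

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class MaxLagCheck {
    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "localhost";
        String port = args.length > 1 ? args[1] : "9999";
        // Connect to the broker's JMX endpoint (broker started with JMX_PORT=<port>).
        JMXServiceURL url = new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();
            // Assumed 0.8.x MBean for the follower's max lag in messages;
            // verify the name against the broker's actual MBean list.
            ObjectName maxLag = new ObjectName(
                "kafka.server:type=ReplicaFetcherManager,name=Replica-MaxLag");
            System.out.println("Replica max lag (messages): "
                + mbsc.getAttribute(maxLag, "Value"));
        } finally {
            connector.close();
        }
    }
}

If the reported lag stays within replica.lag.max.messages and the follower still never rejoins the ISR, that would point at the controller/ZooKeeper update path rather than actual replication lag.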
Thanks,

Jun

On Thu, Nov 27, 2014 at 4:03 AM, Shangan Chen <chenshangan...@gmail.com> wrote:

> my kafka version is kafka_2.10-0.8.1.1.jar
>
> *state-change log:*
>
> [2014-11-25 02:30:19,290] TRACE Controller 29 epoch 7 sending UpdateMetadata request
> (Leader:29,ISR:29,24,LeaderEpoch:10,ControllerEpoch:4) with correlationId 1803
> to broker 20 for partition [org.nginx,32] (state.change.logger)
>
> *controller log:*
>
> [2014-11-22 09:17:02,327] [org.nginx,32] ->
> (Leader:29,ISR:29,24,LeaderEpoch:10,ControllerEpoch:4)
>
> *partition state in zookeeper:*
>
> [zk: localhost:2181(CONNECTED) 4] get /kafka08/brokers/topics/org.nginx/partitions/32/state
> {"controller_epoch":6,"leader":29,"version":1,"leader_epoch":11,"isr":[29]}
> cZxid = 0x5641824ee
> ctime = Fri Oct 10 12:53:47 CST 2014
> mZxid = 0x5a4c870b8
> mtime = Sat Nov 22 06:20:27 CST 2014
> pZxid = 0x5641824ee
> cversion = 0
> dataVersion = 19
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 75
> numChildren = 0
>
> Based on the above information, the controller and state-change logs have the right
> information, but the partition state in ZooKeeper was not updated and no update is
> ever attempted.
>
> On Tue, Nov 25, 2014 at 1:28 PM, Jun Rao <jun...@gmail.com> wrote:
>
> > Which version of Kafka are you using? Any errors in the controller and the
> > state-change log?
> >
> > Thanks,
> >
> > Jun
> >
> > On Fri, Nov 21, 2014 at 5:59 PM, Shangan Chen <chenshangan...@gmail.com> wrote:
> >
> > > In the initial state all replicas are in the isr list, but sometimes when I
> > > check the topic state, a replica can never become isr again even though it is
> > > actually synchronized. I saw in the log that the leader printed an expand-isr
> > > request, but it did not work. I found an interesting thing: the shrink and
> > > expand requests happened just after the controller switch. I don't know whether
> > > that is related, and the controller log has been overwritten, so I cannot
> > > verify. Is there anything I can do to trigger the isr update? Currently I alter
> > > the zookeeper partition state by hand, and it works, but it requires a lot of
> > > manual work as I have quite a lot of topics in my cluster. Some useful
> > > information is as follows.
> > >
> > > *my replica lag config for default:*
> > >
> > > replica.lag.time.max.ms=10000
> > > replica.lag.max.messages=4000
> > >
> > > *controller info:*
> > >
> > > [zk: localhost:2181(CONNECTED) 4] get /kafka08/controller
> > > {"version":1,"brokerid":29,"timestamp":"1416608404008"}
> > > cZxid = 0x5a4c85923
> > > ctime = Sat Nov 22 06:20:04 CST 2014
> > > mZxid = 0x5a4c85923
> > > mtime = Sat Nov 22 06:20:04 CST 2014
> > > pZxid = 0x5a4c85923
> > > cversion = 0
> > > dataVersion = 0
> > > aclVersion = 0
> > > ephemeralOwner = 0x5477ba622cb6c7d
> > > dataLength = 55
> > > numChildren = 0
> > >
> > > *topic info:*
> > >
> > > Topic:org.nginx  PartitionCount:48  ReplicationFactor:2  Configs:
> > >     Topic: org.nginx  Partition: 0   Leader: 17  Replicas: 17,32  Isr: 17,32
> > >     Topic: org.nginx  Partition: 1   Leader: 18  Replicas: 18,33  Isr: 18,33
> > >     Topic: org.nginx  Partition: 2   Leader: 19  Replicas: 19,34  Isr: 34,19
> > >     Topic: org.nginx  Partition: 3   Leader: 20  Replicas: 20,35  Isr: 35,20
> > >     Topic: org.nginx  Partition: 4   Leader: 21  Replicas: 21,36  Isr: 21,36
> > >     Topic: org.nginx  Partition: 5   Leader: 22  Replicas: 22,17  Isr: 17,22
> > >     Topic: org.nginx  Partition: 6   Leader: 23  Replicas: 23,18  Isr: 18,23
> > >     Topic: org.nginx  Partition: 7   Leader: 24  Replicas: 24,19  Isr: 24,19
> > >     Topic: org.nginx  Partition: 8   Leader: 25  Replicas: 25,20  Isr: 25,20
> > >     Topic: org.nginx  Partition: 9   Leader: 26  Replicas: 26,21  Isr: 26,21
> > >     Topic: org.nginx  Partition: 10  Leader: 27  Replicas: 27,22  Isr: 27,22
> > >     Topic: org.nginx  Partition: 11  Leader: 28  Replicas: 28,23  Isr: 28,23
> > >     Topic: org.nginx  Partition: 12  Leader: 29  Replicas: 29,24  Isr: 29
> > >     Topic: org.nginx  Partition: 13  Leader: 30  Replicas: 30,25  Isr: 30,25
> > >     Topic: org.nginx  Partition: 14  Leader: 31  Replicas: 31,26  Isr: 26,31
> > >     Topic: org.nginx  Partition: 15  Leader: 32  Replicas: 32,27  Isr: 27,32
> > >     Topic: org.nginx  Partition: 16  Leader: 33  Replicas: 33,28  Isr: 33,28
> > >     Topic: org.nginx  Partition: 17  Leader: 34  Replicas: 34,29  Isr: 29,34
> > >     Topic: org.nginx  Partition: 18  Leader: 35  Replicas: 35,30  Isr: 30,35
> > >     Topic: org.nginx  Partition: 19  Leader: 36  Replicas: 36,31  Isr: 31,36
> > >     Topic: org.nginx  Partition: 20  Leader: 17  Replicas: 17,32  Isr: 17,32
> > >     Topic: org.nginx  Partition: 21  Leader: 18  Replicas: 18,33  Isr: 18,33
> > >     Topic: org.nginx  Partition: 22  Leader: 19  Replicas: 19,34  Isr: 34,19
> > >     Topic: org.nginx  Partition: 23  Leader: 20  Replicas: 20,35  Isr: 35,20
> > >     Topic: org.nginx  Partition: 24  Leader: 21  Replicas: 21,36  Isr: 21,36
> > >     Topic: org.nginx  Partition: 25  Leader: 22  Replicas: 22,17  Isr: 17,22
> > >     Topic: org.nginx  Partition: 26  Leader: 23  Replicas: 23,18  Isr: 18,23
> > >     Topic: org.nginx  Partition: 27  Leader: 24  Replicas: 24,19  Isr: 24,19
> > >     Topic: org.nginx  Partition: 28  Leader: 25  Replicas: 25,20  Isr: 25,20
> > >     Topic: org.nginx  Partition: 29  Leader: 26  Replicas: 26,21  Isr: 26,21
> > >     Topic: org.nginx  Partition: 30  Leader: 27  Replicas: 27,22  Isr: 27,22
> > >     Topic: org.nginx  Partition: 31  Leader: 28  Replicas: 28,23  Isr: 28,23
> > >     Topic: org.nginx  Partition: 32  Leader: 29  Replicas: 29,24  Isr: 29
> > >     Topic: org.nginx  Partition: 33  Leader: 30  Replicas: 30,25  Isr: 30,25
> > >     Topic: org.nginx  Partition: 34  Leader: 31  Replicas: 31,26  Isr: 26,31
> > >     Topic: org.nginx  Partition: 35  Leader: 32  Replicas: 32,27  Isr: 27,32
> > >     Topic: org.nginx  Partition: 36  Leader: 33  Replicas: 33,28  Isr: 33,28
> > >     Topic: org.nginx  Partition: 37  Leader: 34  Replicas: 34,29  Isr: 29,34
> > >     Topic: org.nginx  Partition: 38  Leader: 35  Replicas: 35,30  Isr: 30,35
> > >     Topic: org.nginx  Partition: 39  Leader: 36  Replicas: 36,31  Isr: 31,36
> > >     Topic: org.nginx  Partition: 40  Leader: 17  Replicas: 17,32  Isr: 17,32
> > >     Topic: org.nginx  Partition: 41  Leader: 18  Replicas: 18,33  Isr: 33,18
> > >     Topic: org.nginx  Partition: 42  Leader: 19  Replicas: 19,34  Isr: 34,19
> > >     Topic: org.nginx  Partition: 43  Leader: 20  Replicas: 20,35  Isr: 35,20
> > >     Topic: org.nginx  Partition: 44  Leader: 21  Replicas: 21,36  Isr: 21,36
> > >     Topic: org.nginx  Partition: 45  Leader: 22  Replicas: 22,17  Isr: 17,22
> > >     Topic: org.nginx  Partition: 46  Leader: 23  Replicas: 23,18  Isr: 18,23
> > >     Topic: org.nginx  Partition: 47  Leader: 24  Replicas: 24,19  Isr: 24,19
> > >
> > > --
> > > have a good day!
> > > chenshang'an
> > >
> >
>
>
> --
> have a good day!
> chenshang'an
>
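P.S. You mentioned editing the partition-state znode by hand for many topics. If you end up repeating that, a rough sketch of scripting the same read-modify-write is below. The path and JSON layout are taken from your zkCli output above, the new ISR value is only an example for partition 32, and everything here is an assumption on my side; this bypasses the controller, so please try it on a single partition first.

import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class FixIsr {
    public static void main(String[] args) throws Exception {
        // Path and JSON layout copied from the zkCli output earlier in this thread.
        String path = "/kafka08/brokers/topics/org.nginx/partitions/32/state";
        // Example only: put broker 24 back into the ISR for partition 32.
        String newState = "{\"controller_epoch\":6,\"leader\":29,\"version\":1,"
            + "\"leader_epoch\":11,\"isr\":[29,24]}";

        CountDownLatch connected = new CountDownLatch(1);
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> {
            if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connected.countDown();
            }
        });
        connected.await();
        try {
            Stat stat = new Stat();
            byte[] current = zk.getData(path, false, stat);
            System.out.println("current: " + new String(current, StandardCharsets.UTF_8));
            // Conditional write: fails if the controller changed the znode in the meantime.
            zk.setData(path, newState.getBytes(StandardCharsets.UTF_8), stat.getVersion());
            System.out.println("written: " + newState);
        } finally {
            zk.close();
        }
    }
}

Passing the version read from getData to setData makes the write conditional, so the script fails instead of silently overwriting a concurrent update from the controller.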