FYI, basically, the ISR returned in the topic metadata response is unreliable.
This is discussed in KAFKA-1367.

Thanks,

Mayuresh

On Mon, May 4, 2015 at 8:09 AM, Steve Miller <st...@idrathernotsay.com> wrote:

> [ BTW, after some more research, I think what might be happening here is
> that we had some de-facto network partitioning happen as a side-effect of
> us renaming some network interfaces, though if that's the case, I'd like to
> know how to get everything back into sync. ]
>
> Hi. I'm seeing something weird, where if I do a MetadataRequest, what
> I get back says I have out-of-sync replicas... but if I use kafka-topics.sh,
> it says I don't. I'm running Kafka 0.8.1.1, still, for the moment, on Java
> 1.7.0_55.
>
> The code I have to do this uses kafka-python:
>
> ======
> #!/usr/bin/python
>
> import logging
> import signal
> import sys
>
> # Should use argparse, but we shouldn't use python 2.6, either...
> from optparse import OptionParser
>
> import simplejson as json
>
> from kafka.client import KafkaClient
> from kafka.protocol import KafkaProtocol
>
> #logging.basicConfig(level=logging.DEBUG)
>
> def main():
>     parser = OptionParser()
>     parser.add_option('-t', '--topic', dest='topic',
>                       help='topic to which we should subscribe', default='mytopic')
>     parser.add_option('-b', '--broker', dest='kafkaHost',
>                       help='Kafka broker to which we should connect',
>                       default='host309-ilg1.rtc.vrsn.com')
>
>     (options, args) = parser.parse_args()
>
>     kafka = KafkaClient('%s:9092' % options.kafkaHost)
>
>     # WARNING: terrible abuse of private methods follows.
>
>     id = kafka._next_id()
>
>     request = KafkaProtocol.encode_metadata_request(kafka.client_id, id)
>     response = kafka._send_broker_unaware_request(id, request)
>
>     (brokers, topics) = KafkaProtocol.decode_metadata_response(response)
>
>     if options.topic != '*':
>         topics_we_want = [options.topic]
>     else:
>         topics_we_want = sorted(topics.keys())
>
>     for topic in topics_we_want:
>         for partition in sorted(topics[topic].keys()):
>             meta = topics[topic][partition]
>             # Replicas that are assigned but missing from the reported ISR.
>             delta = set(meta.replicas) - set(meta.isr)
>             if len(delta) == 0:
>                 print 'topic', topic, 'partition', partition, 'leader', \
>                     meta.leader, 'replicas', meta.replicas, 'isr', meta.isr
>             else:
>                 print 'topic', topic, 'partition', partition, 'leader', \
>                     meta.leader, 'replicas', meta.replicas, 'isr', meta.isr, \
>                     'OUT-OF-SYNC', delta
>
>     sys.exit(0)
>
> if __name__ == "__main__":
>     #logging.basicConfig(level=logging.DEBUG)
>     main()
> ======
>
> And if I run that against "mytopic", I get:
>
> topic mytopic partition 0 leader 311 replicas (311, 323) isr (311, 323)
> topic mytopic partition 1 leader 323 replicas (323, 312) isr (312, 323)
> topic mytopic partition 2 leader 324 replicas (324, 313) isr (324, 313)
> topic mytopic partition 3 leader 309 replicas (309, 314) isr (314, 309)
> topic mytopic partition 4 leader 315 replicas (310, 315) isr (315,) OUT-OF-SYNC set([310])
> topic mytopic partition 5 leader 311 replicas (311, 316) isr (311, 316)
> topic mytopic partition 6 leader 312 replicas (312, 317) isr (317, 312)
> topic mytopic partition 7 leader 318 replicas (313, 318) isr (318, 313)
> topic mytopic partition 8 leader 314 replicas (314, 319) isr (314, 319)
> topic mytopic partition 9 leader 315 replicas (315, 320) isr (320, 315)
> topic mytopic partition 10 leader 316 replicas (316, 321) isr (316, 321)
> topic mytopic partition 11 leader 317 replicas (317, 322) isr (317, 322)
> topic mytopic partition 12 leader 318 replicas (318, 323) isr (318, 323)
> topic mytopic partition 13 leader 324 replicas (319, 324) isr (324,) OUT-OF-SYNC set([319])
> topic mytopic partition 14 leader 320 replicas (320, 309) isr (320, 309)
> topic mytopic partition 15 leader 321 replicas (321, 310) isr (321,) OUT-OF-SYNC set([310])
> topic mytopic partition 16 leader 312 replicas (312, 320) isr (312, 320)
> topic mytopic partition 17 leader 323 replicas (323, 313) isr (323, 313)
> topic mytopic partition 18 leader 324 replicas (324, 314) isr (314, 324)
> topic mytopic partition 19 leader 309 replicas (309, 315) isr (309, 315)
>
> but if I do:
>
> /opt/kafka/bin/kafka-topics.sh --describe --zookeeper host301:2181 --topic mytopic
>
> I get:
>
> Topic:mytopic  PartitionCount:20  ReplicationFactor:2  Configs:retention.bytes=100000000000
> Topic: mytopic  Partition: 0  Leader: 311  Replicas: 311,323  Isr: 311,323
> Topic: mytopic  Partition: 1  Leader: 323  Replicas: 323,312  Isr: 312,323
> Topic: mytopic  Partition: 2  Leader: 324  Replicas: 324,313  Isr: 324,313
> Topic: mytopic  Partition: 3  Leader: 309  Replicas: 309,314  Isr: 314,309
> Topic: mytopic  Partition: 4  Leader: 315  Replicas: 310,315  Isr: 315,310
> Topic: mytopic  Partition: 5  Leader: 311  Replicas: 311,316  Isr: 311,316
> Topic: mytopic  Partition: 6  Leader: 312  Replicas: 312,317  Isr: 317,312
> Topic: mytopic  Partition: 7  Leader: 318  Replicas: 313,318  Isr: 318,313
> Topic: mytopic  Partition: 8  Leader: 314  Replicas: 314,319  Isr: 314,319
> Topic: mytopic  Partition: 9  Leader: 315  Replicas: 315,320  Isr: 320,315
> Topic: mytopic  Partition: 10  Leader: 316  Replicas: 316,321  Isr: 316,321
> Topic: mytopic  Partition: 11  Leader: 317  Replicas: 317,322  Isr: 317,322
> Topic: mytopic  Partition: 12  Leader: 318  Replicas: 318,323  Isr: 318,323
> Topic: mytopic  Partition: 13  Leader: 324  Replicas: 319,324  Isr: 324,319
> Topic: mytopic  Partition: 14  Leader: 320  Replicas: 320,309  Isr: 320,309
> Topic: mytopic  Partition: 15  Leader: 321  Replicas: 321,310  Isr: 321,310
> Topic: mytopic  Partition: 16  Leader: 312  Replicas: 312,320  Isr: 312,320
> Topic: mytopic  Partition: 17  Leader: 323  Replicas: 323,313  Isr: 323,313
> Topic: mytopic  Partition: 18  Leader: 324  Replicas: 324,314  Isr: 314,324
> Topic: mytopic  Partition: 19  Leader: 309  Replicas: 309,315  Isr: 309,315
>
> and if I do:
>
> /opt/kafka/bin/kafka-topics.sh --describe --zookeeper host301-ilg1:2181 --under-replicated-partitions
>
> it prints nothing.
>
> Looking at a system-call trace of kafka-topics.sh, I never see it do a
> MetadataRequest at all: I see it connect to ZK and I see it fishing around
> in there, though.
>
> If I poke around in ZK manually, I see, for example (looking at partition
> 4, since that's one it says is out of sync):
>
> [zk: localhost:2181(CONNECTED) 14] get /brokers/topics/mytopic/partitions/4/state
>
> {"controller_epoch":9,"leader":315,"version":1,"leader_epoch":15,"isr":[315,310]}
> cZxid = 0x100000032
> ctime = Fri Oct 31 21:20:31 UTC 2014
> mZxid = 0x44d07e3b0
> mtime = Fri Apr 17 11:44:32 UTC 2015
> pZxid = 0x100000032
> cversion = 0
> dataVersion = 27
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 81
> numChildren = 0
>
> I can do the metadata request against every broker in our cluster, and I
> get the same results, the same three partitions that show as out of sync.
> I can also get that key from ZK on all our ZK instances and I get the same
> basic thing as above.
>
> Looking at the filesystem on 310, the one that has mytopic-4 in it, I see
> log segments being updated there with the current time on them, so
> something's writing there at least -- which doesn't preclude it from being
> a bit behind, I suppose, but it's not like the mod-times are last February.
> (-:
>
> Here's what I see in state-change.log for 'mytopic,4' on broker 315:
>
> state-change.log:[2015-04-16 13:26:42,424] TRACE Broker 315 received LeaderAndIsr request (LeaderAndIsrInfo:(Leader:310,ISR:310,315,LeaderEpoch:14,ControllerEpoch:8),ReplicationFactor:2),AllReplicas:310,315) correlation id 0 from controller 312 epoch 9 for partition [mytopic,4] (state.change.logger)
> state-change.log:[2015-04-16 13:26:42,430] WARN Broker 315 received invalid LeaderAndIsr request with correlation id 0 from controller 312 epoch 9 with an older leader epoch 14 for partition [mytopic,4], current leader epoch is 14 (state.change.logger)
> state-change.log:[2015-04-16 13:26:42,702] TRACE Broker 315 cached leader info (LeaderAndIsrInfo:(Leader:310,ISR:310,315,LeaderEpoch:14,ControllerEpoch:8),ReplicationFactor:2),AllReplicas:310,315) for partition [mytopic,4] in response to UpdateMetadata request sent by controller 312 epoch 9 with correlation id 0 (state.change.logger)
> state-change.log:[2015-04-16 13:26:55,541] TRACE Broker 315 cached leader info (LeaderAndIsrInfo:(Leader:310,ISR:310,315,LeaderEpoch:14,ControllerEpoch:8),ReplicationFactor:2),AllReplicas:310,315) for partition [mytopic,4] in response to UpdateMetadata request sent by controller 312 epoch 9 with correlation id 3 (state.change.logger)
> state-change.log:[2015-04-17 11:42:52,215] TRACE Broker 315 received LeaderAndIsr request (LeaderAndIsrInfo:(Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:310,315) correlation id 503 from controller 312 epoch 9 for partition [mytopic,4] (state.change.logger)
> state-change.log:[2015-04-17 11:42:52,215] TRACE Broker 315 handling LeaderAndIsr request correlationId 503 from controller 312 epoch 9 starting the become-leader transition for partition [mytopic,4] (state.change.logger)
> state-change.log:[2015-04-17 11:42:52,216] TRACE Broker 315 stopped fetchers as part of become-leader request from controller 312 epoch 9 with correlation id 503 for partition [mytopic,4] (state.change.logger)
> state-change.log:[2015-04-17 11:42:52,216] TRACE Broker 315 completed LeaderAndIsr request correlationId 503 from controller 312 epoch 9 for the become-leader transition for partition [mytopic,4] (state.change.logger)
> state-change.log:[2015-04-17 11:42:52,217] TRACE Broker 315 cached leader info (LeaderAndIsrInfo:(Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:310,315) for partition [mytopic,4] in response to UpdateMetadata request sent by controller 312 epoch 9 with correlation id 503 (state.change.logger)
> state-change.log:[2015-04-17 11:43:06,187] TRACE Broker 315 received LeaderAndIsr request (LeaderAndIsrInfo:(Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:310,315) correlation id 547 from controller 312 epoch 9 for partition [mytopic,4] (state.change.logger)
> state-change.log:[2015-04-17 11:43:06,187] WARN Broker 315 received invalid LeaderAndIsr request with correlation id 547 from controller 312 epoch 9 with an older leader epoch 15 for partition [mytopic,4], current leader epoch is 15 (state.change.logger)
> state-change.log:[2015-04-17 11:43:06,212] TRACE Broker 315 cached leader info (LeaderAndIsrInfo:(Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:310,315) for partition [mytopic,4] in response to UpdateMetadata request sent by controller 312 epoch 9 with correlation id 547 (state.change.logger)
> state-change.log:[2015-04-17 11:44:30,347] TRACE Broker 315 cached leader info (LeaderAndIsrInfo:(Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:310,315) for partition [mytopic,4] in response to UpdateMetadata request sent by controller 312 epoch 9 with correlation id 549 (state.change.logger)
>
> and for 312:
>
> [2015-04-17 11:42:52,207] TRACE Controller 312 epoch 9 started leader election for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,210] TRACE Controller 312 epoch 9 elected leader 315 for Offline partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 changed partition [mytopic,4] from OnlinePartition to OnlinePartition with leader 315 (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending become-follower LeaderAndIsr request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 503 to broker 310 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending become-leader LeaderAndIsr request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 503 to broker 315 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 503 to broker 322 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 503 to broker 313 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 503 to broker 316 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 503 to broker 319 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 503 to broker 310 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 503 to broker 309 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 503 to broker 318 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 503 to broker 312 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 503 to broker 321 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 503 to broker 315 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 503 to broker 324 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 503 to broker 323 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 503 to broker 317 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 503 to broker 311 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Broker 312 cached leader info (LeaderAndIsrInfo:(Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:310,315) for partition [mytopic,4] in response to UpdateMetadata request sent by controller 312 epoch 9 with correlation id 503 (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 503 to broker 320 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 503 to broker 314 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,043] TRACE Controller 312 epoch 9 changed state of replica 310 for partition [mytopic,4] from OnlineReplica to OfflineReplica (state.change.logger)
> [2015-04-17 11:43:06,186] TRACE Controller 312 epoch 9 sending become-leader LeaderAndIsr request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 547 to broker 315 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,188] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 547 to broker 322 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,190] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 547 to broker 313 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,193] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 547 to broker 316 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,195] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 547 to broker 319 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,197] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 547 to broker 309 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,199] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 547 to broker 318 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,201] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 547 to broker 312 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,203] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 547 to broker 321 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,205] TRACE Broker 312 cached leader info (LeaderAndIsrInfo:(Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:310,315) for partition [mytopic,4] in response to UpdateMetadata request sent by controller 312 epoch 9 with correlation id 547 (state.change.logger)
> [2015-04-17 11:43:06,206] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 547 to broker 315 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,208] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 547 to broker 324 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,211] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 547 to broker 323 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,213] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 547 to broker 317 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,215] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 547 to broker 311 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,217] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 547 to broker 320 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,219] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 547 to broker 314 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,292] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 548 to broker 310 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,310] TRACE Controller 312 epoch 9 changed state of replica 310 for partition [mytopic,4] from OfflineReplica to OnlineReplica (state.change.logger)
> [2015-04-17 11:44:30,317] TRACE Controller 312 epoch 9 sending become-follower LeaderAndIsr request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 549 to broker 310 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,320] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 549 to broker 322 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,322] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 549 to broker 313 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,324] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 549 to broker 316 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,327] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 549 to broker 319 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,329] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 549 to broker 310 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,331] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 549 to broker 309 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,334] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 549 to broker 318 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,336] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 549 to broker 312 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,338] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 549 to broker 321 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,340] TRACE Broker 312 cached leader info (LeaderAndIsrInfo:(Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:310,315) for partition [mytopic,4] in response to UpdateMetadata request sent by controller 312 epoch 9 with correlation id 549 (state.change.logger)
> [2015-04-17 11:44:30,341] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId 549 to broker 315 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,344] TRACE Controller 312 epoch 9 sending UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9)
>
> and for 310:
>
> [2015-04-16 13:26:42,406] TRACE Broker 310 received LeaderAndIsr request (LeaderAndIsrInfo:(Leader:310,ISR:310,315,LeaderEpoch:14,ControllerEpoch:8),ReplicationFactor:2),AllReplicas:310,315) correlation id 0 from controller 312 epoch 9 for partition [mytopic,4] (state.change.logger)
> [2015-04-16 13:26:42,410] WARN Broker 310 received invalid LeaderAndIsr request with correlation id 0 from controller 312 epoch 9 with an older leader epoch 14 for partition [mytopic,4], current leader epoch is 14 (state.change.logger)
> [2015-04-16 13:26:42,556] TRACE Broker 310 cached leader info (LeaderAndIsrInfo:(Leader:310,ISR:310,315,LeaderEpoch:14,ControllerEpoch:8),ReplicationFactor:2),AllReplicas:310,315) for partition [mytopic,4] in response to UpdateMetadata request sent by controller 312 epoch 9 with correlation id 0 (state.change.logger)
> [2015-04-16 13:26:55,776] TRACE Broker 310 cached leader info (LeaderAndIsrInfo:(Leader:310,ISR:310,315,LeaderEpoch:14,ControllerEpoch:8),ReplicationFactor:2),AllReplicas:310,315) for partition [mytopic,4] in response to UpdateMetadata request sent by control
>
> Which one is right? Should I not be using MetadataRequests to figure out
> who is and isn't in sync? If there's something else
>
> -Steve
>

--
-Regards,
Mayuresh R. Gharat
(862) 250-7125
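
For what it's worth, here is a rough sketch (not from the thread) of running the same replicas-minus-ISR check against the partition state JSON stored in ZooKeeper, which is what kafka-topics.sh appears to read, rather than against a MetadataRequest. The function name out_of_sync_replicas is made up for illustration, the replica list is passed in by hand, and actually fetching the znode from ZooKeeper is left out.

```python
import json


def out_of_sync_replicas(state_json, replicas):
    """Return assigned replicas missing from the ISR in a partition state JSON.

    Hypothetical helper: state_json is the text stored at
    /brokers/topics/<topic>/partitions/<n>/state; replicas is supplied by the caller.
    """
    state = json.loads(state_json)
    return sorted(set(replicas) - set(state["isr"]))


# State of mytopic partition 4, copied from the zkCli `get` quoted above.
zk_state = ('{"controller_epoch":9,"leader":315,"version":1,'
            '"leader_epoch":15,"isr":[315,310]}')

# ZooKeeper's view: both replicas are in the ISR.
print(out_of_sync_replicas(zk_state, [310, 315]))

# The ISR (315,) that the MetadataRequest reported for the same partition.
print(out_of_sync_replicas('{"isr": [315]}', [310, 315]))
```

With the values quoted in the thread, the first call finds nothing out of sync, while the second flags 310, which matches the discrepancy Steve describes.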