FYI,

Basically, the ISR returned in the topic metadata response is unreliable:
the brokers' cached metadata is not kept in sync with ZooKeeper.

This is discussed in KAFKA-1367.
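
If you need the in-sync check programmatically, one option is to compare the
assigned replicas against the "isr" field of the partition state znode, which
is what your ZK poking shows kafka-topics.sh is reading. A minimal sketch,
assuming you have already fetched the znode contents with whatever ZK client
you use; the helper name out_of_sync_replicas is made up for illustration and
is not part of kafka-python:

```python
import json


def out_of_sync_replicas(assigned_replicas, state_json):
    """Return the assigned replicas that are missing from the ISR in ZK.

    assigned_replicas: iterable of broker ids assigned to the partition.
    state_json: contents of /brokers/topics/<topic>/partitions/<n>/state.
    """
    state = json.loads(state_json)
    return set(assigned_replicas) - set(state["isr"])


# Partition 4 of "mytopic", using the znode contents quoted below.
zk_state = ('{"controller_epoch":9,"leader":315,"version":1,'
            '"leader_epoch":15,"isr":[315,310]}')

# ISR for the same partition as reported by the MetadataRequest.
metadata_isr = (315,)

in_zk = out_of_sync_replicas([310, 315], zk_state)   # empty set
in_metadata = set([310, 315]) - set(metadata_isr)    # contains 310
```

With the values from your mail, the ZK view says nothing is out of sync for
partition 4, while the metadata view flags 310.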

Thanks,


Mayuresh

On Mon, May 4, 2015 at 8:09 AM, Steve Miller <st...@idrathernotsay.com>
wrote:

> [ BTW, after some more research, I think what might be happening here is
> that we had some de-facto network partitioning happen as a side-effect of
> us renaming some network interfaces, though if that's the case, I'd like to
> know how to get everything back into sync. ]
>
>    Hi.  I'm seeing something weird, where if I do a MetadataRequest, what
> I get back says I have out-of-sync replicas... but if I use kafka-topics.sh,
> it says I don't.  I'm running Kafka 0.8.1.1, still, for the moment, on Java
> 1.7.0_55.
>
>    The code I have to do this uses kafka-python:
>
> ======
> #!/usr/bin/python
>
> import logging
> import signal
> import sys
>
> # Should use argparse, but we shouldn't use python 2.6, either...
> from optparse import OptionParser
>
> import simplejson as json
>
> from kafka.client import KafkaClient
> from kafka.protocol import KafkaProtocol
>
> #logging.basicConfig(level=logging.DEBUG)
>
> def main():
>     parser = OptionParser()
>     parser.add_option('-t', '--topic', dest='topic',
>       help='topic to which we should subscribe', default='mytopic')
>     parser.add_option('-b', '--broker', dest='kafkaHost',
>       help='Kafka broker to which we should connect',
>       default='host309-ilg1.rtc.vrsn.com')
>
>     (options, args) = parser.parse_args()
>
>     kafka = KafkaClient('%s:9092' % options.kafkaHost)
>
>     # WARNING: terrible abuse of private methods follows.
>
>     id = kafka._next_id()
>
>     request = KafkaProtocol.encode_metadata_request(kafka.client_id, id)
>     response = kafka._send_broker_unaware_request(id, request)
>
>     (brokers, topics) = KafkaProtocol.decode_metadata_response(response)
>
>     if options.topic != '*':
>         topics_we_want = [options.topic]
>     else:
>         topics_we_want = sorted(topics.keys())
>
>     for topic in topics_we_want:
>         for partition in sorted(topics[topic].keys()):
>             meta = topics[topic][partition]
>             delta = set(meta.replicas) - set(meta.isr)
>             if len(delta) == 0:
>                 print 'topic', topic, 'partition', partition, \
>                     'leader', meta.leader, 'replicas', meta.replicas, \
>                     'isr', meta.isr
>             else:
>                 print 'topic', topic, 'partition', partition, \
>                     'leader', meta.leader, 'replicas', meta.replicas, \
>                     'isr', meta.isr, 'OUT-OF-SYNC', delta
>
>     sys.exit(0)
>
> if __name__ == "__main__":
>     #logging.basicConfig(level=logging.DEBUG)
>     main()
> ======
>
> And if I run that against "mytopic", I get:
>
> topic mytopic partition 0 leader 311 replicas (311, 323) isr (311, 323)
> topic mytopic partition 1 leader 323 replicas (323, 312) isr (312, 323)
> topic mytopic partition 2 leader 324 replicas (324, 313) isr (324, 313)
> topic mytopic partition 3 leader 309 replicas (309, 314) isr (314, 309)
> topic mytopic partition 4 leader 315 replicas (310, 315) isr (315,) OUT-OF-SYNC set([310])
> topic mytopic partition 5 leader 311 replicas (311, 316) isr (311, 316)
> topic mytopic partition 6 leader 312 replicas (312, 317) isr (317, 312)
> topic mytopic partition 7 leader 318 replicas (313, 318) isr (318, 313)
> topic mytopic partition 8 leader 314 replicas (314, 319) isr (314, 319)
> topic mytopic partition 9 leader 315 replicas (315, 320) isr (320, 315)
> topic mytopic partition 10 leader 316 replicas (316, 321) isr (316, 321)
> topic mytopic partition 11 leader 317 replicas (317, 322) isr (317, 322)
> topic mytopic partition 12 leader 318 replicas (318, 323) isr (318, 323)
> topic mytopic partition 13 leader 324 replicas (319, 324) isr (324,) OUT-OF-SYNC set([319])
> topic mytopic partition 14 leader 320 replicas (320, 309) isr (320, 309)
> topic mytopic partition 15 leader 321 replicas (321, 310) isr (321,) OUT-OF-SYNC set([310])
> topic mytopic partition 16 leader 312 replicas (312, 320) isr (312, 320)
> topic mytopic partition 17 leader 323 replicas (323, 313) isr (323, 313)
> topic mytopic partition 18 leader 324 replicas (324, 314) isr (314, 324)
> topic mytopic partition 19 leader 309 replicas (309, 315) isr (309, 315)
>
> but if I do:
>
> /opt/kafka/bin/kafka-topics.sh --describe --zookeeper host301:2181 --topic mytopic
>
> I get:
>
> Topic:mytopic   PartitionCount:20       ReplicationFactor:2     Configs:retention.bytes=100000000000
>         Topic: mytopic  Partition: 0    Leader: 311     Replicas: 311,323       Isr: 311,323
>         Topic: mytopic  Partition: 1    Leader: 323     Replicas: 323,312       Isr: 312,323
>         Topic: mytopic  Partition: 2    Leader: 324     Replicas: 324,313       Isr: 324,313
>         Topic: mytopic  Partition: 3    Leader: 309     Replicas: 309,314       Isr: 314,309
>         Topic: mytopic  Partition: 4    Leader: 315     Replicas: 310,315       Isr: 315,310
>         Topic: mytopic  Partition: 5    Leader: 311     Replicas: 311,316       Isr: 311,316
>         Topic: mytopic  Partition: 6    Leader: 312     Replicas: 312,317       Isr: 317,312
>         Topic: mytopic  Partition: 7    Leader: 318     Replicas: 313,318       Isr: 318,313
>         Topic: mytopic  Partition: 8    Leader: 314     Replicas: 314,319       Isr: 314,319
>         Topic: mytopic  Partition: 9    Leader: 315     Replicas: 315,320       Isr: 320,315
>         Topic: mytopic  Partition: 10   Leader: 316     Replicas: 316,321       Isr: 316,321
>         Topic: mytopic  Partition: 11   Leader: 317     Replicas: 317,322       Isr: 317,322
>         Topic: mytopic  Partition: 12   Leader: 318     Replicas: 318,323       Isr: 318,323
>         Topic: mytopic  Partition: 13   Leader: 324     Replicas: 319,324       Isr: 324,319
>         Topic: mytopic  Partition: 14   Leader: 320     Replicas: 320,309       Isr: 320,309
>         Topic: mytopic  Partition: 15   Leader: 321     Replicas: 321,310       Isr: 321,310
>         Topic: mytopic  Partition: 16   Leader: 312     Replicas: 312,320       Isr: 312,320
>         Topic: mytopic  Partition: 17   Leader: 323     Replicas: 323,313       Isr: 323,313
>         Topic: mytopic  Partition: 18   Leader: 324     Replicas: 324,314       Isr: 314,324
>         Topic: mytopic  Partition: 19   Leader: 309     Replicas: 309,315       Isr: 309,315
>
> and if I do:
>
> /opt/kafka/bin/kafka-topics.sh --describe --zookeeper host301-ilg1:2181 --under-replicated-partitions
>
> it prints nothing.
>
> Looking at a system-call trace of kafka-topics.sh, I never see it do a
> MetadataRequest at all: I see it connect to ZK and I see it fishing around
> in there, though.
>
> If I poke around in ZK manually, I see, for example (looking at partition
> 4, since that's one it says is out of sync):
>
> [zk: localhost:2181(CONNECTED) 14] get /brokers/topics/mytopic/partitions/4/state
>
> {"controller_epoch":9,"leader":315,"version":1,"leader_epoch":15,"isr":[315,310]}
> cZxid = 0x100000032
> ctime = Fri Oct 31 21:20:31 UTC 2014
> mZxid = 0x44d07e3b0
> mtime = Fri Apr 17 11:44:32 UTC 2015
> pZxid = 0x100000032
> cversion = 0
> dataVersion = 27
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 81
> numChildren = 0
>
> I can do the metadata request against every broker in our cluster, and I
> get the same results, the same three partitions that show as out of sync.
> I can also get that key from ZK on all our ZK instances and I get the same
> basic thing as above.
>
> Looking at the filesystem on 310, the one that has mytopic-4 in it, I see
> log segments being updated there with the current time on them, so
> something's writing there at least -- which doesn't preclude it from being
> a bit behind, I suppose, but it's not like the mod-times are last February.
> (-:
>
> Here's what I see in state-change.log for 'mytopic,4' on broker 315:
>
> state-change.log:[2015-04-16 13:26:42,424] TRACE Broker 315 received
> LeaderAndIsr request
> (LeaderAndIsrInfo:(Leader:310,ISR:310,315,LeaderEpoch:14,ControllerEpoch:8),ReplicationFactor:2),AllReplicas:310,315)
> correlation id 0 from controller 312 epoch 9 for partition [mytopic,4]
> (state.change.logger)
> state-change.log:[2015-04-16 13:26:42,430] WARN Broker 315 received
> invalid LeaderAndIsr request with correlation id 0 from controller 312
> epoch 9 with an older leader epoch 14 for partition [mytopic,4], current
> leader epoch is 14 (state.change.logger)
> state-change.log:[2015-04-16 13:26:42,702] TRACE Broker 315 cached leader
> info
> (LeaderAndIsrInfo:(Leader:310,ISR:310,315,LeaderEpoch:14,ControllerEpoch:8),ReplicationFactor:2),AllReplicas:310,315)
> for partition [mytopic,4] in response to UpdateMetadata request sent by
> controller 312 epoch 9 with correlation id 0 (state.change.logger)
> state-change.log:[2015-04-16 13:26:55,541] TRACE Broker 315 cached leader
> info
> (LeaderAndIsrInfo:(Leader:310,ISR:310,315,LeaderEpoch:14,ControllerEpoch:8),ReplicationFactor:2),AllReplicas:310,315)
> for partition [mytopic,4] in response to UpdateMetadata request sent by
> controller 312 epoch 9 with correlation id 3 (state.change.logger)
> state-change.log:[2015-04-17 11:42:52,215] TRACE Broker 315 received
> LeaderAndIsr request
> (LeaderAndIsrInfo:(Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:310,315)
> correlation id 503 from controller 312 epoch 9 for partition [mytopic,4]
> (state.change.logger)
> state-change.log:[2015-04-17 11:42:52,215] TRACE Broker 315 handling
> LeaderAndIsr request correlationId 503 from controller 312 epoch 9 starting
> the become-leader transition for partition [mytopic,4] (state.change.logger)
> state-change.log:[2015-04-17 11:42:52,216] TRACE Broker 315 stopped
> fetchers as part of become-leader request from controller 312 epoch 9 with
> correlation id 503 for partition [mytopic,4] (state.change.logger)
> state-change.log:[2015-04-17 11:42:52,216] TRACE Broker 315 completed
> LeaderAndIsr request correlationId 503 from controller 312 epoch 9 for the
> become-leader transition for partition [mytopic,4] (state.change.logger)
> state-change.log:[2015-04-17 11:42:52,217] TRACE Broker 315 cached leader
> info
> (LeaderAndIsrInfo:(Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:310,315)
> for partition [mytopic,4] in response to UpdateMetadata request sent by
> controller 312 epoch 9 with correlation id 503 (state.change.logger)
> state-change.log:[2015-04-17 11:43:06,187] TRACE Broker 315 received
> LeaderAndIsr request
> (LeaderAndIsrInfo:(Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:310,315)
> correlation id 547 from controller 312 epoch 9 for partition [mytopic,4]
> (state.change.logger)
> state-change.log:[2015-04-17 11:43:06,187] WARN Broker 315 received
> invalid LeaderAndIsr request with correlation id 547 from controller 312
> epoch 9 with an older leader epoch 15 for partition [mytopic,4], current
> leader epoch is 15 (state.change.logger)
> state-change.log:[2015-04-17 11:43:06,212] TRACE Broker 315 cached leader
> info
> (LeaderAndIsrInfo:(Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:310,315)
> for partition [mytopic,4] in response to UpdateMetadata request sent by
> controller 312 epoch 9 with correlation id 547 (state.change.logger)
> state-change.log:[2015-04-17 11:44:30,347] TRACE Broker 315 cached leader
> info
> (LeaderAndIsrInfo:(Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:310,315)
> for partition [mytopic,4] in response to UpdateMetadata request sent by
> controller 312 epoch 9 with correlation id 549 (state.change.logger)
>
> and for 312:
>
> [2015-04-17 11:42:52,207] TRACE Controller 312 epoch 9 started leader
> election for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,210] TRACE Controller 312 epoch 9 elected leader 315
> for Offline partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 changed partition
> [mytopic,4] from OnlinePartition to OnlinePartition with leader 315
> (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending
> become-follower LeaderAndIsr request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 503 to broker 310 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending
> become-leader LeaderAndIsr request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 503 to broker 315 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 503 to broker 322 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 503 to broker 313 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 503 to broker 316 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 503 to broker 319 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 503 to broker 310 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 503 to broker 309 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 503 to broker 318 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 503 to broker 312 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 503 to broker 321 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 503 to broker 315 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 503 to broker 324 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 503 to broker 323 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 503 to broker 317 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 503 to broker 311 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Broker 312 cached leader info
> (LeaderAndIsrInfo:(Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:310,315)
> for partition [mytopic,4] in response to UpdateMetadata request sent by
> controller 312 epoch 9 with correlation id 503 (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 503 to broker 320 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:42:52,215] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 503 to broker 314 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,043] TRACE Controller 312 epoch 9 changed state of
> replica 310 for partition [mytopic,4] from OnlineReplica to OfflineReplica
> (state.change.logger)
> [2015-04-17 11:43:06,186] TRACE Controller 312 epoch 9 sending
> become-leader LeaderAndIsr request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 547 to broker 315 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,188] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 547 to broker 322 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,190] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 547 to broker 313 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,193] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 547 to broker 316 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,195] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 547 to broker 319 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,197] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 547 to broker 309 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,199] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 547 to broker 318 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,201] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 547 to broker 312 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,203] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 547 to broker 321 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,205] TRACE Broker 312 cached leader info
> (LeaderAndIsrInfo:(Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:310,315)
> for partition [mytopic,4] in response to UpdateMetadata request sent by
> controller 312 epoch 9 with correlation id 547 (state.change.logger)
> [2015-04-17 11:43:06,206] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 547 to broker 315 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,208] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 547 to broker 324 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,211] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 547 to broker 323 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,213] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 547 to broker 317 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,215] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 547 to broker 311 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,217] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 547 to broker 320 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:43:06,219] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 547 to broker 314 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,292] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 548 to broker 310 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,310] TRACE Controller 312 epoch 9 changed state of
> replica 310 for partition [mytopic,4] from OfflineReplica to OnlineReplica
> (state.change.logger)
> [2015-04-17 11:44:30,317] TRACE Controller 312 epoch 9 sending
> become-follower LeaderAndIsr request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 549 to broker 310 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,320] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 549 to broker 322 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,322] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 549 to broker 313 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,324] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 549 to broker 316 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,327] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 549 to broker 319 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,329] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 549 to broker 310 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,331] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 549 to broker 309 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,334] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 549 to broker 318 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,336] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 549 to broker 312 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,338] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 549 to broker 321 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,340] TRACE Broker 312 cached leader info
> (LeaderAndIsrInfo:(Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9),ReplicationFactor:2),AllReplicas:310,315)
> for partition [mytopic,4] in response to UpdateMetadata request sent by
> controller 312 epoch 9 with correlation id 549 (state.change.logger)
> [2015-04-17 11:44:30,341] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request
> (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9) with correlationId
> 549 to broker 315 for partition [mytopic,4] (state.change.logger)
> [2015-04-17 11:44:30,344] TRACE Controller 312 epoch 9 sending
> UpdateMetadata request (Leader:315,ISR:315,LeaderEpoch:15,ControllerEpoch:9)
>
> and for 310:
>
> [2015-04-16 13:26:42,406] TRACE Broker 310 received LeaderAndIsr request
> (LeaderAndIsrInfo:(Leader:310,ISR:310,315,LeaderEpoch:14,ControllerEpoch:8),ReplicationFactor:2),AllReplicas:310,315)
> correlation id 0 from controller 312 epoch 9 for partition [mytopic,4]
> (state.change.logger)
> [2015-04-16 13:26:42,410] WARN Broker 310 received invalid LeaderAndIsr
> request with correlation id 0 from controller 312 epoch 9 with an older
> leader epoch 14 for partition [mytopic,4], current leader epoch is 14
> (state.change.logger)
> [2015-04-16 13:26:42,556] TRACE Broker 310 cached leader info
> (LeaderAndIsrInfo:(Leader:310,ISR:310,315,LeaderEpoch:14,ControllerEpoch:8),ReplicationFactor:2),AllReplicas:310,315)
> for partition [mytopic,4] in response to UpdateMetadata request sent by
> controller 312 epoch 9 with correlation id 0 (state.change.logger)
> [2015-04-16 13:26:55,776] TRACE Broker 310 cached leader info
> (LeaderAndIsrInfo:(Leader:310,ISR:310,315,LeaderEpoch:14,ControllerEpoch:8),ReplicationFactor:2),AllReplicas:310,315)
> for partition [mytopic,4] in response to UpdateMetadata request sent by
> control
>
> Which one is right?  Should I not be using MetadataRequests to figure out
> who is and isn't in sync?  If there's something else
>
>         -Steve
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125
