[ https://issues.apache.org/jira/browse/KAFKA-9839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17177295#comment-17177295 ]
Andras Katona commented on KAFKA-9839:
--------------------------------------

This fix is also in 2.6.0, but it is not listed in the [release notes of 2.6.0|https://dist.apache.org/repos/dist/release/kafka/2.6.0/RELEASE_NOTES.html].
Commit: https://github.com/apache/kafka/commit/bd17085ec10c767bc82e6b19a3016cf5d50dad92
Shouldn't it be there?

> IllegalStateException on metadata update when broker learns about its new epoch after the controller
> ----------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-9839
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9839
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller, core
>    Affects Versions: 2.2.1, 2.3.1, 2.5.0, 2.4.1
>            Reporter: Anna Povzner
>            Assignee: Anna Povzner
>            Priority: Critical
>             Fix For: 2.5.1
>
>
> Broker throws "java.lang.IllegalStateException: Epoch XXX larger than current
> broker epoch YYY" on UPDATE_METADATA when the controller learns about the
> broker epoch and sends UPDATE_METADATA before KafkaZkClient.registerBroker
> completes (that is, before the broker learns about its new epoch).
> Here is the scenario we observed in more detail:
> 1. ZK session expires on broker 1
> 2. Broker 1 establishes new session to ZK and creates znode
> 3. Controller learns about broker 1 and assigns epoch
> 4. Broker 1 receives UPDATE_METADATA from controller, but it does not know
> about its new epoch yet, so we get an exception:
> ERROR [KafkaApi-3] Error when handling request: clientId=1, correlationId=0,
> api=UPDATE_METADATA, body={
> .........
> java.lang.IllegalStateException: Epoch XXX larger than current broker epoch YYY
>     at kafka.server.KafkaApis.isBrokerEpochStale(KafkaApis.scala:2725)
>     at kafka.server.KafkaApis.handleUpdateMetadataRequest(KafkaApis.scala:320)
>     at kafka.server.KafkaApis.handle(KafkaApis.scala:139)
>     at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69)
>     at java.lang.Thread.run(Thread.java:748)
> 5. KafkaZkClient.registerBroker completes on broker 1: "INFO Stat of the
> created znode at /brokers/ids/1"
> The result is that the broker has stale metadata for some time.
> Possible solutions:
> 1. Broker returns a more specific error and controller retries UPDATE_METADATA
> 2. Broker accepts UPDATE_METADATA with larger broker epoch.
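
To make the difference between the reported behavior and possible solution 2 concrete, here is a minimal sketch in Java. It is an illustration only, not the code from the Kafka code base or from the linked commit: the class BrokerEpochCheck and the methods isStaleStrict and isStaleLenient are hypothetical names, and the epoch values in main are made up.

```java
// Hypothetical sketch; class and method names are NOT taken from Kafka.
final class BrokerEpochCheck {
    private BrokerEpochCheck() {}

    // Behavior described in the report: a request carrying an epoch newer
    // than the one the broker knows about is treated as an illegal state.
    static boolean isStaleStrict(long requestEpoch, long currentEpoch) {
        if (requestEpoch > currentEpoch) {
            throw new IllegalStateException(
                "Epoch " + requestEpoch + " larger than current broker epoch " + currentEpoch);
        }
        return requestEpoch < currentEpoch;
    }

    // Possible solution 2: only an older epoch counts as stale, so a request
    // with a larger epoch is accepted while registration is still completing.
    static boolean isStaleLenient(long requestEpoch, long currentEpoch) {
        return requestEpoch < currentEpoch;
    }

    public static void main(String[] args) {
        long currentEpoch = 10L; // epoch the broker still believes it has (made-up value)
        long requestEpoch = 25L; // newer epoch already known to the controller (made-up value)

        System.out.println("lenient check, stale: " + isStaleLenient(requestEpoch, currentEpoch));
        try {
            isStaleStrict(requestEpoch, currentEpoch);
        } catch (IllegalStateException e) {
            System.out.println("strict check failed: " + e.getMessage());
        }
    }
}
```

With the lenient check, the broker in step 4 of the scenario would process the UPDATE_METADATA request instead of failing, so it would not be left with stale metadata until the next update. Possible solution 1 would instead keep the strict comparison but report a retriable error to the controller.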