[
https://issues.apache.org/jira/browse/KAFKA-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Gustafson resolved KAFKA-6916.
------------------------------------
Resolution: Fixed
> AdminClient does not refresh metadata on broker failure
> -------------------------------------------------------
>
> Key: KAFKA-6916
> URL: https://issues.apache.org/jira/browse/KAFKA-6916
> Project: Kafka
> Issue Type: Task
> Components: admin
> Affects Versions: 1.1.0, 1.0.1
> Reporter: Rajini Sivaram
> Assignee: Rajini Sivaram
> Priority: Major
> Fix For: 2.0.0
>
>
> There are intermittent test failures in DynamicBrokerReconfigurationTest when
> brokers are restarted. The test uses ephemeral ports, so the ports after a
> server restart differ from the ports before the restart. The tests rely on
> metadata refresh in producers, consumers and admin clients to obtain the new
> server ports when connections fail. This works with producers and consumers,
> but results in intermittent failures with the admin client because a refresh
> is not triggered.
> There are two issues in AdminClient:
> # Unlike producers and consumers, AdminClient does not request a metadata
> update when a connection to a broker fails. This is particularly bad if the
> controller goes down. The controller is used for various requests like
> createTopics and describeTopics. If the controller goes down and
> adminClient.describeTopics() is invoked, AdminClient sends the request to the
> old controller. If the connection fails, it keeps retrying with the same
> address, and a metadata refresh is never triggered. By default the request
> times out after 2 minutes, while metadata is not refreshed for 5 minutes, so
> the request fails before a refresh can happen. We should refresh metadata
> whenever a connection to a broker fails.
> # Admin client requests are always retried on the same node. In the example
> above, if the controller goes down and a new controller is elected, the
> retried request should be sent to the new controller. Otherwise we are
> just blocking the call for 2 minutes with a lot of retries that can never
> succeed.
>
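The two issues described above can be illustrated with a small, self-contained
simulation. This is only a sketch and not the actual AdminClient code or the
fix that was committed: the Metadata class, sendWithRetries(), the node ids and
the refreshOnFailure flag are all hypothetical names invented for the example.
It assumes that a connection attempt succeeds only when the request is sent to
the live controller.

```java
import java.util.ArrayList;
import java.util.List;

public class AdminRetrySketch {

    /** Hypothetical stand-in for cached cluster metadata: tracks only the controller id. */
    static final class Metadata {
        private int cachedController;
        private boolean updateRequested = false;

        Metadata(int cachedController) {
            this.cachedController = cachedController;
        }

        int controller() {
            return cachedController;
        }

        /** Marks the cache as stale so the next attempt refreshes it. */
        void requestUpdate() {
            updateRequested = true;
        }

        /** Applies a pending refresh, if any, using the cluster's actual controller. */
        void maybeRefresh(int actualController) {
            if (updateRequested) {
                cachedController = actualController;
                updateRequested = false;
            }
        }
    }

    /**
     * Simulates one admin request with retries and returns the node id tried on
     * each attempt. An attempt succeeds only if it targets liveController.
     */
    static List<Integer> sendWithRetries(Metadata metadata, int liveController,
                                         int maxAttempts, boolean refreshOnFailure) {
        List<Integer> attempts = new ArrayList<>();
        for (int i = 0; i < maxAttempts; i++) {
            metadata.maybeRefresh(liveController);
            // Re-resolve the target node on every attempt instead of pinning it.
            int target = metadata.controller();
            attempts.add(target);
            if (target == liveController) {
                return attempts; // request succeeded
            }
            if (refreshOnFailure) {
                // Connection failed: ask for a metadata update before retrying.
                metadata.requestUpdate();
            }
        }
        return attempts; // retries exhausted, request fails
    }

    public static void main(String[] args) {
        // Node 0 was the controller; node 1 has been elected after the failure.
        System.out.println(sendWithRetries(new Metadata(0), 1, 5, false)); // [0, 0, 0, 0, 0]
        System.out.println(sendWithRetries(new Metadata(0), 1, 5, true));  // [0, 1]
    }
}
```

Without a refresh on connection failure, every retry goes to the old controller
until the attempts run out. With the refresh, and with the target node resolved
again for each attempt, the second attempt reaches the new controller.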