[
https://issues.apache.org/jira/browse/KAFKA-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Brian Byrne reassigned KAFKA-8904:
----------------------------------
Assignee: Brian Byrne
> Reduce metadata lookups when producing to a large number of topics
> ------------------------------------------------------------------
>
> Key: KAFKA-8904
> URL: https://issues.apache.org/jira/browse/KAFKA-8904
> Project: Kafka
> Issue Type: Improvement
> Components: controller, producer
> Reporter: Brian Byrne
> Assignee: Brian Byrne
> Priority: Minor
>
> Per [~lbradstreet]:
>
> "The problem was that the producer starts with no knowledge of topic
> metadata. So they start the producer up, and then they start sending messages
> to any of the thousands of topics that exist. Each time a message is sent to
> a new topic, it'll trigger a metadata request if the producer doesn't know
> about it. These metadata requests are done in serial such that if you send
> 2000 messages to 2000 topics, it will trigger 2000 new metadata requests.
>
> Each successive metadata request will include every topic seen so far, so the
> first metadata request will include 1 topic, the second will include 2
> topics, etc.
>
> An additional problem is that this can take a while, and metadata expiry (for
> metadata that has not been recently used) is hard-coded to 5 minutes, so if
> the initial fetches take long enough you can end up evicting the metadata
> before you send another message to a topic.
>
> So the approaches above are:
> 1. We can linger for a bit before making a metadata request, allow more sends
> to go through, and then batch the topics we need into a single metadata
> request.
> 2. We can allow pre-seeding the producer with metadata for a list of topics
> you care about.
> I prefer 1 if we can make it work."
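
To put rough numbers on the cost described above, here is a minimal sketch that counts metadata requests and the topic entries they carry. It is a simplified model, not Kafka producer code. The function names and the `sends_per_linger` knob are hypothetical, and the linger window is approximated by a count of queued sends rather than by time. It also assumes that a batched request still lists every topic seen so far, as the serial requests do.

```python
def serial_cost(num_topics):
    """Model the reported behaviour: one metadata request per new topic,
    each request listing every topic seen so far."""
    requests = 0
    topic_entries = 0
    known = set()
    for i in range(num_topics):
        topic = f"topic-{i}"
        if topic not in known:
            known.add(topic)
            requests += 1
            topic_entries += len(known)
    return requests, topic_entries


def lingering_cost(num_topics, sends_per_linger):
    """Model approach 1: hold sends to unknown topics until
    `sends_per_linger` of them are pending (hypothetical knob standing in
    for a linger time), then issue a single metadata request."""
    requests = 0
    topic_entries = 0
    known = set()
    pending = set()
    for i in range(num_topics):
        topic = f"topic-{i}"
        if topic not in known:
            pending.add(topic)
        if len(pending) >= sends_per_linger:
            known |= pending
            pending.clear()
            requests += 1
            topic_entries += len(known)
    if pending:
        # Flush whatever is left when the sends stop.
        known |= pending
        requests += 1
        topic_entries += len(known)
    return requests, topic_entries


if __name__ == "__main__":
    print(serial_cost(2000))          # (2000, 2001000)
    print(lingering_cost(2000, 100))  # (20, 21000)
```

Under this model, 2000 sends to 2000 topics cost 2000 requests carrying about two million topic entries in total, while batching 100 pending topics per request brings that down to 20 requests and 21000 entries.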
--
This message was sent by Atlassian Jira
(v8.3.4#803005)