Re: Metadata requests for subset of partitions

Stanislav Kozlovski Fri, 28 Feb 2025 11:48:42 -0800

> > It's certainly been a topic that's come up before. In certain situations
> > the current approach is a bit heavy-handed. The current approach for
> > fetching metadata has a number of benefits: it keeps the protocol from
> > being too chatty, which reduces load on the brokers and makes maintaining a
> > consistent view of the metadata on the client much easier. There's a fairly
> > substantial overhead with fetching metadata and batching it in a single
> > request eliminates a lot of edge cases.

My understanding is that the substantial overhead of the metadata request comes 
precisely from the total number of partitions the broker needs to iterate over 
and build objects for. (Please correct me if I'm wrong and it's something 
non-obvious.)

If that's true, then the fewer partitions it has to do that for, the less 
overhead there would be?
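
To make that assumption concrete, here is a toy model of it. This is not Kafka 
broker code: `PartitionMetadata`, `build_topic_metadata` and the `wanted` 
filter are hypothetical names, and the leader/replica values are placeholders. 
It only shows that the number of objects built follows the number of 
partitions walked.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PartitionMetadata:
    # Hypothetical stand-in for one per-partition entry in a metadata response.
    partition: int
    leader: int
    replicas: list[int]
    isr: list[int]


def build_topic_metadata(num_partitions: int, wanted: set[int] | None = None):
    """Build one entry per partition; `wanted` is a hypothetical opt-in filter."""
    if wanted is None:
        # Behaviour as of today: walk every partition of the topic.
        ids = range(num_partitions)
    else:
        # Assumed opt-in behaviour: walk only the requested, valid partitions.
        ids = sorted(p for p in wanted if 0 <= p < num_partitions)
    # Placeholder leader/replica assignment, for illustration only.
    return [PartitionMetadata(p, p % 3, [0, 1, 2], [0, 1, 2]) for p in ids]


full = build_topic_metadata(1000)
subset = build_topic_metadata(1000, wanted={7})
print(len(full), len(subset))
```

If the real cost is dominated by something other than this per-partition 
loop, the model above does not apply.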

As for the edge cases, I am not aware of them, but I can certainly imagine 
something like the old consumer protocol, where the client chooses the 
assignment, being prone to edge cases from incomplete metadata. Perhaps subset 
partition metadata fetching can be employed strategically in cases where that 
risk is lower.

--

Michal, out of curiosity, what led you to this question? Do you see any 
substantial overhead in the metadata path on the clients/brokers because of 
this unnecessary fetching?

--

re: chattiness - do we all define chattiness by the number of requests per 
second?
Michal, you mention fetching the subset could reduce chattiness but I don't see 
how that could happen. By definition, if you send less data per response, then 
chances are you'll need to send more requests once you want more data. 
Am I missing anything?
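
A rough sketch of that reasoning, counting requests only. `count_requests` is 
a hypothetical helper, and the model assumes a client that caches whatever it 
fetched and ignores periodic refreshes and invalidation:

```python
def count_requests(steps, num_partitions, subset):
    """Count metadata requests for a client whose needs change over time.

    steps: list of sets, the partitions the client needs at each step.
    subset: if True, each request returns only the missing partitions
            (the assumed opt-in mode); if False, it returns all of them.
    """
    known = set()
    requests = 0
    for wanted in steps:
        missing = set(wanted) - known
        if not missing:
            continue  # everything needed is already cached
        requests += 1
        known |= missing if subset else set(range(num_partitions))
    return requests


# A client whose interest grows: partition 0, then 1, then 5.
growing = [{0}, {0, 1}, {5}]
print(count_requests(growing, 100, subset=False))  # 1
print(count_requests(growing, 100, subset=True))   # 3

# A client that only ever needs partition 0.
fixed = [{0}, {0}]
print(count_requests(fixed, 100, subset=False))    # 1
print(count_requests(fixed, 100, subset=True))     # 1
```

Under these assumptions, subset fetching never lowers the request count. It 
only shrinks each response, and it matches the full fetch when the set of 
partitions a client needs is fixed.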

On 2025/02/28 07:56:29 Michał Łowicki wrote:
> On Thu, Feb 27, 2025 at 5:39 PM Kirk True <k...@kirktrue.pro> wrote:
> 
> > Hi Michał,
> >
> > On Thu, Feb 27, 2025, at 3:44 AM, Michał Łowicki wrote:
> > > Hi there!
> > >
> > > Is there any reason why Metadata requests
> > > <https://kafka.apache.org/protocol.html#The_Messages_Metadata> do not
> > > support fetching metadata for subsets of the partitions? A certain
> > > client may be interested in only e.g. 1 partition while the topic has
> > > many, so most of the fetched data isn't really used.
> > >
> >
> > It's certainly been a topic that's come up before. In certain situations
> > the current approach is a bit heavy-handed. The current approach for
> > fetching metadata has a number of benefits: it keeps the protocol from
> > being too chatty, which reduces load on the brokers and makes maintaining a
> > consistent view of the metadata on the client much easier. There's a fairly
> > substantial overhead with fetching metadata and batching it in a single
> > request eliminates a lot of edge cases.
> >
> 
> Sure, I'm rather thinking about an opt-in option in the protocol where, if
> specified, the metadata response would contain metadata only for a specified
> set of partitions (otherwise, as today, metadata for all of them). This would
> cover the cases where consumers need metadata for only a small portion of
> the partitions. The broker would then have less work to do handling such
> requests and crafting responses, and the protocol would actually be less
> chatty in those cases.
> 
> 
> >
> > As always, further discussion and suggestions for improvements in this
> > area are welcomed :)
> >
> > Thanks,
> > Kirk
>
