Hello Brian,

For the newly added `metadata.evict.ms` config: since its default value is set the same as `metadata.max.age.ms`, and most users would not override default config values, is it possible that in scenarios where the producer sends to a specific topic once and never sends to it again, we end up refreshing metadata that is about to be removed soon anyway? In other words, I'm wondering whether for common cases the eviction period should be larger than the max age, or vice versa. This is my mental model:

1) If we evict a topic eagerly compared with refreshing it, e.g. after time T1 has elapsed, and then after T2 (> T1) has elapsed the topic is requested for sending again, we force a metadata update that includes this topic (remember we now always batch a set of topics in each metadata request).

2) If we evict a topic more lazily compared with refreshing it, say after T2 has elapsed, then before that, after T1 (< T2) has elapsed, we had to refresh it already, and it is included in that metadata request.

So in terms of the number of metadata requests it may incur, evicting more eagerly seems okay --- i.e. even if we do not evict this topic, it would still cause a metadata request after metadata.max.age.ms anyway, so evicting it early does not make things worse.

More generally, if we know a topic's sending period is P1 (i.e. we send to that topic once every P1 seconds), the eviction period is P2, and the metadata.max.age is P3, then the optimal scenario in terms of metadata update frequency seems to be P1 = P2 < P3.
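To sanity-check the send-once-and-never-again case, here is a toy back-of-the-envelope model (illustrative Java only, with made-up names; not the actual producer logic, and it assumes the eviction and refresh timers both start at the single send):

    // Toy model: a topic is sent to once at t=0 and never again. While it
    // stays cached (i.e. until evictMs of idleness has elapsed), a refresh
    // fires roughly every maxAgeMs, and each such refresh is wasted work.
    final class EvictVsRefreshToy {
        static long wastedRefreshes(long evictMs, long maxAgeMs) {
            return evictMs / maxAgeMs;
        }

        public static void main(String[] args) {
            // Both defaults at 5 min: we may refresh the metadata once,
            // right before throwing it away.
            System.out.println(wastedRefreshes(300_000, 300_000)); // 1
            // A 30s eviction period: the topic is evicted before the first
            // refresh would ever fire.
            System.out.println(wastedRefreshes(30_000, 300_000)); // 0
        }
    }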
So I feel making the default value for `metadata.evict.ms` smaller is fine, e.g. say 1 minute or 30 seconds. WDYT?

Guozhang

On Thu, Dec 26, 2019 at 7:08 AM Brian Byrne <bby...@confluent.io> wrote:

> Hi Stanislav,
>
> Appreciate the feedback!
>
> 1. You're correct. I've added notes to the KIP to clarify.
>
> 2. Yes, it should. Fixed.
>
> 3. So I made a mistake when generalizing the target refresh size, which should have been using `metadata.max.age.ms` instead of `metadata.evict.ms`. Therefore, `metadata.max.age.ms` still controls the target refresh time of active topic metadata, and doesn't change in that regard. You're right that the U set will make metadata refreshing more proactive in some cases, which is done to smooth out the metadata RPCs over time and lessen the potential bulk of some of the RPC responses by working in smaller units. The T set is what logically exists today and doesn't change.
>
> Thanks,
> Brian
>
> On Thu, Dec 26, 2019 at 2:16 AM Stanislav Kozlovski <stanis...@confluent.io> wrote:
>
> > Hey Brian,
> >
> > 1. Could we more explicitly clarify the behavior of the algorithm when `|T| > TARGET_METADATA_FETCH_SIZE`? I assume we ignore the config in that scenario.
> > 2. Should `targetMetadataFetchSize = Math.max(topicsPerSec / 10, 20)` be `topicsPerSec * 10`?
> > 3. When is this new algorithm applied? To confirm my understanding - what is the behavior of `metadata.max.age.ms` after this KIP? Are we adding a new, more proactive metadata fetch for topics in |U|?
> >
> > Thanks,
> > Stanislav
> >
> > On Thu, Dec 19, 2019 at 11:37 PM Brian Byrne <bby...@confluent.io> wrote:
> >
> > > Hello everyone,
> > >
> > > For all interested, please take a look at the proposed algorithm as I'd like to get more feedback. I'll call for a vote once the break is over.
> > >
> > > Thanks,
> > > Brian
> > >
> > > On Mon, Dec 9, 2019 at 10:18 PM Guozhang Wang <wangg...@gmail.com> wrote:
> > >
> > > > Sounds good, I agree that should not make a big difference in practice.
> > > >
> > > > On Mon, Dec 9, 2019 at 2:07 PM Brian Byrne <bby...@confluent.io> wrote:
> > > >
> > > > > Hi Guozhang,
> > > > >
> > > > > I see, we agree on the topic threshold not applying to urgent topics, but differ slightly on what should be considered urgent. I would argue that we should consider topics nearing the metadata.max.age.ms to be urgent, since they may still be well within the metadata.expiry.ms.
> > > > > That is, the client still considers these topics to be relevant (not expired), but doesn't want to incur the latency bubble of having to wait for the metadata to be re-fetched if it's stale. This could be a frequent case if metadata.max.age.ms << metadata.expiry.ms.
> > > > >
> > > > > In practice, I wouldn't expect this to make a noticeable difference, so I don't have a strong leaning, but the current behavior today is to aggressively refresh the metadata of stale topics by ensuring a refresh is triggered before the metadata.max.age.ms duration elapses.
> > > > >
> > > > > Thanks,
> > > > > Brian
> > > > >
> > > > > On Mon, Dec 9, 2019 at 11:57 AM Guozhang Wang <wangg...@gmail.com> wrote:
> > > > >
> > > > > > Hello Brian,
> > > > > >
> > > > > > Thanks for your explanation. Could you then update the wiki page for the algorithm part? When I read it, I thought it was different from the above: e.g. urgent topics should not be added just because of max.age expiration, but only if there is send data pending.
> > > > > >
> > > > > > Guozhang
> > > > > >
> > > > > > On Mon, Dec 9, 2019 at 10:57 AM Brian Byrne <bby...@confluent.io> wrote:
> > > > > >
> > > > > > > Hi Guozhang,
> > > > > > >
> > > > > > > Thanks for the feedback!
> > > > > > >
> > > > > > > On Sun, Dec 8, 2019 at 6:25 PM Guozhang Wang <wangg...@gmail.com> wrote:
> > > > > > >
> > > > > > > > 1. The addition of `metadata.expiry.ms` should be included in the public interface. Also, its semantics need more clarification (previously it was hard-coded internally, so we did not need to explain it publicly, but now with a configurable value we do).
> > > > > > >
> > > > > > > This was an oversight. Done.
> > > > > > >
> > > > > > > > 2. There are a couple of hard-coded parameters like 25 and 0.5 in the proposal; maybe we need to explain why these magic values make sense in common scenarios.
> > > > > > >
> > > > > > > So these are pretty fuzzy numbers that seemed to be a decent balance between trade-offs. I've updated the target size to account for setups with a large number of topics or a shorter refresh time, as well as added some light rationale.
> > > > > > >
> > > > > > > > 3. In the Urgent set condition, do you actually mean "with no cached metadata AND there are existing data buffered for the topic"?
> > > > > > >
> > > > > > > Yes, fixed.
> > > > > > >
> > > > > > > > One concern I have is whether or not we may introduce a regression, especially during producer startup: since we only request up to 25 topics in each request, it may cause the send data to be buffered longer than now due to metadata not being available.
> > > > > > > > I understand this is an acknowledged trade-off in our design, but any regression that may surface to users needs to be very carefully considered. I'm wondering, e.g., if we can tweak our algorithm for the Urgent set to consider those with no cached metadata as higher priority than those that have elapsed max.age but have not yet been called for sending. More specifically:
> > > > > > > >
> > > > > > > > Urgent: topics that have been requested for sending but have no cached metadata, and topics whose send request failed with e.g. NOT_LEADER.
> > > > > > > > Non-urgent: topics that are not in Urgent but have expired max.age.
> > > > > > > >
> > > > > > > > Then when sending metadata, we always send ALL in Urgent (i.e. ignore the size limit), and only when they do not exceed the size limit, consider filling in more topics from Non-urgent up to the size limit.
> > > > > > >
> > > > > > > I think we're on the same page here. Urgent topics ignore the target metadata RPC size, and are not bound by it in any way, i.e. if there are 100 urgent topics, we'll fetch all 100 in a single RPC. Like you say, however, if a topic transitions to urgent and there are several non-urgent ones, we'll piggyback the non-urgent updates up to the target size.
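> > > > > > > To make that rule concrete, here is a rough sketch (hypothetical names, not the actual producer code) of how a request's topic set could be assembled:
> > > > > > >
> > > > > > >     import java.util.ArrayList;
> > > > > > >     import java.util.List;
> > > > > > >     import java.util.Set;
> > > > > > >
> > > > > > >     final class MetadataRequestTopics {
> > > > > > >         // Urgent topics are never capped; non-urgent (max.age-expired)
> > > > > > >         // topics only piggyback up to the target metadata RPC size.
> > > > > > >         static List<String> topicsToFetch(Set<String> urgent, Set<String> nonUrgent, int targetSize) {
> > > > > > >             List<String> topics = new ArrayList<>(urgent);
> > > > > > >             for (String topic : nonUrgent) {
> > > > > > >                 if (topics.size() >= targetSize) break;
> > > > > > >                 topics.add(topic);
> > > > > > >             }
> > > > > > >             return topics;
> > > > > > >         }
> > > > > > >     }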
> > > > > > > Thanks,
> > > > > > > Brian
> > > > > > >
> > > > > > > > On Wed, Nov 20, 2019 at 7:00 PM deng ziming <dengziming1...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > I think it's ok, and you can add another issue about `asynchronous metadata` if `topic expiry` is not enough.
> > > > > > > > >
> > > > > > > > > On Thu, Nov 21, 2019 at 6:20 AM Brian Byrne <bby...@confluent.io> wrote:
> > > > > > > > >
> > > > > > > > > > Hello all,
> > > > > > > > > >
> > > > > > > > > > I've refactored the KIP to remove implementing asynchronous metadata fetching in the producer during send(). It's now exclusively focused on reducing the topic metadata fetch payload, and proposes adding a new configuration flag to control topic expiry behavior. Please take a look when possible.
> > > > > > > > > >
> > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-526%3A+Reduce+Producer+Metadata+Lookups+for+Large+Number+of+Topics
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Brian
> > > > > > > > > >
> > > > > > > > > > On Fri, Oct 4, 2019 at 10:04 AM Brian Byrne <bby...@confluent.io> wrote:
> > > > > > > > > >
> > > > > > > > > > > Lucas, Guozhang,
> > > > > > > > > > >
> > > > > > > > > > > Thank you for the comments. Good point on METADATA_MAX_AGE_CONFIG - it looks like the ProducerMetadata was differentiating between expiry and refresh, but it should be unnecessary to do so once the cost of fetching a single topic's metadata is reduced.
> > > > > > > > > > >
> > > > > > > > > > > I've updated the rejected alternatives and removed the config variables.
> > > > > > > > > > >
> > > > > > > > > > > Brian
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Oct 4, 2019 at 9:20 AM Guozhang Wang <wangg...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > >> Hello Brian,
> > > > > > > > > > >>
> > > > > > > > > > >> Thanks for the KIP.
> > > > > > > > > > >>
> > > > > > > > > > >> I think asynchronous metadata updates address 1), metadata updates blocking send; but for the other issues, the producer currently does have a configurable `METADATA_MAX_AGE_CONFIG` similar to the consumer's, 5 min by default. So maybe we do not need to introduce new configs here, but only change the semantics of that config from global expiry (today we just enforce a full metadata update for the whole cluster) to single-topic expiry, and we can also extend its expiry deadline whenever that metadata is successfully used to send a produce request.
> > > > > > > > > > >>
> > > > > > > > > > >> Guozhang
> > > > > > > > > > >>
> > > > > > > > > > >> On Thu, Oct 3, 2019 at 6:51 PM Lucas Bradstreet <lu...@confluent.io> wrote:
> > > > > > > > > > >>
> > > > > > > > > > >> > Hi Brian,
> > > > > > > > > > >> >
> > > > > > > > > > >> > This looks great, and should help reduce blocking and high metadata request volumes when the producer is sending to large numbers of topics, especially at low volumes. I think the approach of making metadata fetching asynchronous and batching metadata requests together will help significantly.
> > > > > > > > > > >> >
> > > > > > > > > > >> > The only other approach I can think of is to allow users to supply the producer with the expected topics upfront, allowing the producer to perform a single initial metadata request before any sends occur. I see no real advantages to this approach compared to the async method you've proposed, but maybe we could add it to the rejected alternatives section?
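> > > > > > > > > > >> > For illustration, the upfront hint could look something like this (a purely hypothetical API sketch - preloadMetadata() does not exist, and the topic names are made up):
> > > > > > > > > > >> >
> > > > > > > > > > >> >     import java.util.Arrays;
> > > > > > > > > > >> >     import java.util.Properties;
> > > > > > > > > > >> >     import org.apache.kafka.clients.producer.KafkaProducer;
> > > > > > > > > > >> >     import org.apache.kafka.clients.producer.ProducerConfig;
> > > > > > > > > > >> >     import org.apache.kafka.common.serialization.StringSerializer;
> > > > > > > > > > >> >
> > > > > > > > > > >> >     Properties props = new Properties();
> > > > > > > > > > >> >     props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
> > > > > > > > > > >> >     props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
> > > > > > > > > > >> >     props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
> > > > > > > > > > >> >     KafkaProducer<String, String> producer = new KafkaProducer<>(props);
> > > > > > > > > > >> >
> > > > > > > > > > >> >     // Hypothetical: warm the metadata cache with one batched request
> > > > > > > > > > >> >     // before the first send() ever blocks on a lookup.
> > > > > > > > > > >> >     producer.preloadMetadata(Arrays.asList("orders", "payments", "shipments"));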
> > > > > > > > > > >> > Thanks,
> > > > > > > > > > >> > Lucas
> > > > > > > > > > >> >
> > > > > > > > > > >> > On Fri, 20 Sep 2019 at 11:46, Brian Byrne <bby...@confluent.io> wrote:
> > > > > > > > > > >> >
> > > > > > > > > > >> > > I've updated the 'Proposed Changes' to include two new producer configuration variables: topic.expiry.ms and topic.refresh.ms. Please take a look.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Thanks,
> > > > > > > > > > >> > > Brian
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > On Tue, Sep 17, 2019 at 12:59 PM Brian Byrne <bby...@confluent.io> wrote:
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > > Dev team,
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > Requesting discussion for improvement to the producer when dealing with a large number of topics.
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-526%3A+Reduce+Producer+Metadata+Lookups+for+Large+Number+of+Topics
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > JIRA: https://issues.apache.org/jira/browse/KAFKA-8904
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > Thoughts and feedback would be appreciated.
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > Thanks,
> > > > > > > > > > >> > > > Brian
> > > > > > > > > > >>
> > > > > > > > > > >> --
> > > > > > > > > > >> -- Guozhang
> > > > > > > >
> > > > > > > > --
> > > > > > > > -- Guozhang
> > > > > >
> > > > > > --
> > > > > > -- Guozhang
> > > >
> > > > --
> > > > -- Guozhang
> >
> > --
> > Best,
> > Stanislav

--
-- Guozhang