Thanks everyone. With 3 binding votes from Guozhang, Jason, and Rajini, the
vote has passed.
Brian
On Tue, Jan 28, 2020 at 11:45 AM Ted Yu wrote:
> +1
>
> On Tue, Jan 28, 2020 at 10:52 AM Rajini Sivaram
> wrote:
>
> > +1 (binding)
> >
> > Thanks for the KIP, Brian!
> >
> > Regards,
> >
> > Rajini
+1
On Tue, Jan 28, 2020 at 10:52 AM Rajini Sivaram
wrote:
> +1 (binding)
>
> Thanks for the KIP, Brian!
>
> Regards,
>
> Rajini
>
> On Thu, Jan 23, 2020 at 7:34 PM Jason Gustafson
> wrote:
>
> > Sounds good. +1 from me.
> >
> > On Thu, Jan 23, 2020 at 9:00 AM Brian Byrne wrote:
> >
> > > Thanks Jason,
+1 (binding)
Thanks for the KIP, Brian!
Regards,
Rajini
On Thu, Jan 23, 2020 at 7:34 PM Jason Gustafson wrote:
> Sounds good. +1 from me.
>
> On Thu, Jan 23, 2020 at 9:00 AM Brian Byrne wrote:
>
> > Thanks Jason,
> >
> > I'm in favor of the latter: metadata.max.idle.ms. I agree that
> > describing it as a "period" is inaccurate.
Sounds good. +1 from me.
On Thu, Jan 23, 2020 at 9:00 AM Brian Byrne wrote:
> Thanks Jason,
>
> I'm in favor of the latter: metadata.max.idle.ms. I agree that describing
> it as a "period" is inaccurate. With metadata.max.idle.ms, it also aligns
> with metadata.max.age.ms for determining refresh period (which is an actual period).
Thanks Jason,
I'm in favor of the latter: metadata.max.idle.ms. I agree that describing
it as a "period" is inaccurate. With metadata.max.idle.ms, it also aligns
with metadata.max.age.ms for determining refresh period (which is an actual
period).
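As a small illustration (not taken from the KIP itself; the values below are
made up, not the defaults), the two settings would sit side by side in a
producer config like this:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;

    public class IdleVsAgeExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
            // refresh period: cached metadata is re-fetched at least this often
            props.put("metadata.max.age.ms", "300000");
            // idle expiry: topics not written to for this long drop out of the working set
            props.put("metadata.max.idle.ms", "180000");
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // produce as usual; idle topics expire independently of the refresh period
            }
        }
    }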
I've updated the docs.
Thanks,
Brian
Thanks for the proposal. Looks good overall. I wanted to suggest a possible
name change. I was considering something like `idle.metadata.expiration.ms`
or maybe `metadata.max.idle.ms`. Thoughts?
-Jason
On Tue, Jan 21, 2020 at 11:38 AM Guozhang Wang wrote:
> Got it.
Got it.
I was proposing that we do the "delayed async batch", but I think your
argument about complexity and pushing it out of scope is convincing, so
instead I propose we stick with the synchronous mini batching, which is
obviously already there :) I'm +1 on the current proposal scope.
Guozhang
Hi Guozhang,
Ah, sorry, I misunderstood. Actually, this is solved for us today. How the
producer works is that it maintains at most one inflight metadata fetch
request at any time, where each request is tagged with the current
(monotonically increasing) request version. This version is bumped when
Hi Brian,
I think I buy the complexity and extra end-to-end-latency argument :) I'm
fine with delaying the asynchronous fetching to future work and keeping
the current KIP's scope as-is for now. Under that case can we consider just
a minor implementation detail (since it is not affecting public
Hi Guozhang,
Your understanding of the rationale is accurate, and what you suggest is
completely plausible, however I have a slightly different take on the
situation.
When the KIP was originally drafted, making KafkaProducer#send asynchronous
was one element of the changes (this is a little more
Hello Brian,
I looked at the new proposal again, and I'd like to reason about its
rationale from the listed motivations in your wiki:
1) more RPCs: we may send metadata requests more frequently than
appropriate. This is especially the case during producer start-up, where
the more topics it needs
Hello all,
After further offline discussion, I've removed any efforts to control
metadata RPC sizes. There are now only two items proposed in this KIP:
(1) When encountering a new topic, only issue a metadata request for that
particular topic. For all other cases, continue as it does today with a
On Mon, Jan 6, 2020, at 14:40, Brian Byrne wrote:
> So the performance of a metadata RPC that occurs once every 10
> seconds should not be measured strictly in CPU cost, but rather the effect
> on the 95-99%. The larger the request is, the more opportunity there is to
> put a burst stress on
Hello all,
Does anyone else have opinions on the issue of RPC frequency? Would it be
better to remove the fetching of non-urgent topics altogether, so that the
refreshes are contained in a larger batch?
Thanks,
Brian
On Mon, Jan 6, 2020 at 2:40 PM Brian Byrne wrote:
So the performance of a metadata RPC that occurs once every 10
seconds should not be measured strictly in CPU cost, but rather the effect
on the 95-99%. The larger the request is, the more opportunity there is to
put a burst stress on the producer and broker, and the larger the response
payload
On Mon, Jan 6, 2020, at 13:07, Brian Byrne wrote:
> Hi Colin,
>
> Thanks again for the feedback!
>
> On Mon, Jan 6, 2020 at 12:07 PM Colin McCabe wrote:
>
> > Metadata requests don't (always) go to the controller, right? We should
> > fix the wording in this section.
> >
>
> You're correct, s/controller/broker(s)/.
Hi Colin,
Thanks again for the feedback!
On Mon, Jan 6, 2020 at 12:07 PM Colin McCabe wrote:
> Metadata requests don't (always) go to the controller, right? We should
> fix the wording in this section.
>
You're correct, s/controller/broker(s)/.
I feel like "Proposed Changes" should come before
Hi Brian,
Thanks for continuing to work on this. It looks good overall.
It's probably better to keep the discussion on this thread rather than going
back and forth between this and the DISCUSS thread.
The KIP states:
> For (1), an RPC is generated every time an uncached topic's metadata must
+1 (non-binding)
On Thu, Jan 2, 2020 at 11:15 AM Brian Byrne wrote:
> Hello all,
>
> After further discussion and improvements, I'd like to reinstate the voting
> process.
>
> The updated KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-526
> %3A+Reduce+Producer+Metadata+Lookups+for+Large+Number+of+Topics
+1 (non-binding)
Thanks for the KIP, Brian!
On Thu, Jan 2, 2020 at 7:15 PM Brian Byrne wrote:
> Hello all,
>
> After further discussion and improvements, I'd like to reinstate the voting
> process.
>
> The updated KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-526
> %3A+Reduce+Producer+Metadata+Lookups+for+Large+Number+of+Topics
Hello all,
After further discussion and improvements, I'd like to reinstate the voting
process.
The updated KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-526
%3A+Reduce+Producer+Metadata+Lookups+for+Large+Number+of+Topics
The continued discussion:
https://lists.apache.org/thread.htm
With the concluded summary on the other discussion thread, I'm +1 on the
proposal.
Thanks Brian!
On Tue, Nov 19, 2019 at 8:00 PM deng ziming
wrote:
> >
> > For new (uncached) topics, one problem here is that we don't know which
> > partition to map a record to in the event that it has a key or
>
> For new (uncached) topics, one problem here is that we don't know which
> partition to map a record to in the event that it has a key or custom
> partitioner, so the RecordAccumulator wouldn't know which batch/broker it
> belongs. We'd need an intermediate record queue that subsequently moved t
Hi Deng,
Thanks for the feedback.
On Mon, Nov 18, 2019 at 6:56 PM deng ziming
wrote:
> hi, I reviewed the current code, the ProduceMetadata maintains an expiry
> threshold for every topic, every time when we write to a topic we will set
> the expiry time to -1 to indicate it should be updated,
hi, I reviewed the current code, the ProduceMetadata maintains an expiry
threshold for every topic, every time when we write to a topic we will set
the expiry time to -1 to indicate it should be updated, this does work to
reduce the size of the topic working set, but the producer will continue
fetching
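For illustration, here is a rough sketch of the per-topic expiry bookkeeping
described above (the real class is ProducerMetadata in the client, but the
names and the threshold value here are illustrative, not the actual code):

    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    class TopicExpiryBook {
        private static final long TOPIC_EXPIRY_MS = 5 * 60 * 1000L; // assumed threshold
        private final Map<String, Long> expiryTimes = new HashMap<>();

        // writing to a topic marks its metadata as needing an update
        void recordWrite(String topic) {
            expiryTimes.put(topic, -1L);
        }

        // a successful metadata update pushes the expiry out again
        void onMetadataUpdate(String topic, long nowMs) {
            expiryTimes.put(topic, nowMs + TOPIC_EXPIRY_MS);
        }

        // topics untouched past the threshold leave the working set
        Set<String> expireStaleTopics(long nowMs) {
            Set<String> expired = new HashSet<>();
            expiryTimes.entrySet().removeIf(e -> {
                boolean stale = e.getValue() != -1L && e.getValue() <= nowMs;
                if (stale) expired.add(e.getKey());
                return stale;
            });
            return expired;
        }
    }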
On Mon, Nov 18, 2019, at 10:05, Brian Byrne wrote:
> On Fri, Nov 15, 2019 at 5:08 PM Colin McCabe wrote:
>
> > Two seconds doesn't seem like a reasonable amount of time to leave for the
> > metadata fetch. Fetching halfway through the expiration period seems more
> > reasonable. It also doesn't
On Fri, Nov 15, 2019 at 5:08 PM Colin McCabe wrote:
> Two seconds doesn't seem like a reasonable amount of time to leave for the
> metadata fetch. Fetching halfway through the expiration period seems more
> reasonable. It also doesn't require us to create a new configuration key,
> which is nice.
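A tiny sketch of the "fetch halfway through the expiration period" idea quoted
above, which avoids adding a new configuration key (the method name and values
are illustrative, not the producer's actual code):

    class RefreshPolicy {
        // start a refresh once half of metadata.max.age.ms has elapsed,
        // leaving the remaining half for the fetch to complete
        static boolean shouldStartRefresh(long lastRefreshMs, long nowMs, long metadataMaxAgeMs) {
            return nowMs - lastRefreshMs >= metadataMaxAgeMs / 2;
        }
    }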
On Tue, Nov 12, 2019, at 10:27, Brian Byrne wrote:
> Hi Colin,
>
> Thanks for the feedback. I'm going to leave out some specifics in my
> response, since I'll go back to the KIP, revise it, and then post an update
> on the original discussion thread. I'll make two primary changes, (1)
> remove reference to expiry not being calculated appropriately
On Tue, Nov 12, 2019, at 17:42, Guozhang Wang wrote:
> Sounds good to me, let's continue our voting process here.
>
> Guozhang
Hi Brian & Guozhang,
I'm looking forward to getting this in. But is the KIP complete yet? Brian's
earlier response said "I'm going to leave out some specifics in my
Sounds good to me, let's continue our voting process here.
Guozhang
On Tue, Nov 12, 2019 at 12:10 PM Ismael Juma wrote:
> This is not a bug fix, in my opinion. The existing behavior may be
> confusing, but it was documented, so I assume it was intended.
>
> Ismael
>
> On Mon, Nov 11, 2019, 2:47 PM Guozhang Wang wrote:
Hi Colin,
Thanks for the feedback. I'm going to leave out some specifics in my
response, since I'll go back to the KIP, revise it, and then post an update
on the original discussion thread. I'll make two primary changes, (1)
remove reference to expiry not being calculated appropriately, since this
This is not a bug fix, in my opinion. The existing behavior may be
confusing, but it was documented, so I assume it was intended.
Ismael
On Mon, Nov 11, 2019, 2:47 PM Guozhang Wang wrote:
> Thanks for the update Brian, I think I agree with Colin that we should
> clarify on whether / how the blocking behavior due to metadata fetch would be affected or not.
Thanks for the update Brian, I think I agree with Colin that we should
clarify on whether / how the blocking behavior due to metadata fetch would
be affected or not.
About whether this needs to be voted as a KIP: to me the behavior change is
really a bug fix rather than a public contract change, s
Hi Brian,
Thanks for the KIP.
Starting the metadata fetch before we need the result is definitely a great
idea. This way, the metadata fetch can be done in parallel with all the other
stuff the producer is doing, rather than forcing the producer to come to a
halt periodically while
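As a hypothetical sketch of that "start the fetch before we need the result"
shape (none of these names are the real producer internals; fetchMetadata()
stands in for the actual metadata RPC):

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    class MetadataPrefetcher {
        private final ConcurrentMap<String, CompletableFuture<Void>> pending =
            new ConcurrentHashMap<>();

        // kick off a fetch as soon as a topic is referenced (or nears expiry)
        void prefetch(String topic) {
            pending.computeIfAbsent(topic, t -> CompletableFuture.runAsync(() -> fetchMetadata(t)));
        }

        // send() only has to wait here if the metadata still hasn't arrived
        void awaitIfNeeded(String topic) {
            CompletableFuture<Void> f = pending.get(topic);
            if (f != null) {
                f.join();
            }
        }

        private void fetchMetadata(String topic) {
            // placeholder for the actual metadata RPC
        }
    }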
I think this KIP affects when we block which is actually user visible
behavior. Right?
Ismael
On Fri, Nov 8, 2019, 8:50 AM Brian Byrne wrote:
> Hi Guozhang,
>
> Regarding metadata expiry, no access times other than the initial lookup[1]
> are used for determining when to expire producer metadata.
Hi Guozhang,
Regarding metadata expiry, no access times other than the initial lookup[1]
are used for determining when to expire producer metadata. Therefore,
frequently used topics' metadata will be aged out and subsequently
refreshed (in a blocking manner) every five minutes, and infrequently used
Hello Brian,
Could you elaborate a bit more on this sentence: "This logic can be made
more intelligent by managing the expiry from when the topic was last used,
enabling the expiry duration to be reduced to improve cases where a large
number of topics are touched intermittently." Not sure I fully
Hello all,
I'd like to propose a vote for a producer change to improve producer
behavior when dealing with a large number of topics, in part by reducing
the amount of metadata fetching performed.
The full KIP is provided here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-526%3A+Reduce+Producer+Metadata+Lookups+for+Large+Number+of+Topics