Hi Mickael, Thank you for your feedback. 1. I can see that, I will get that changed. 2. Done
Thanks, Brenden On Wed, Jul 24, 2024 at 5:06 AM Mickael Maison <mickael.mai...@gmail.com> wrote: > Hi, > > 1. The title is a bit misleading. It's proposing to add new metrics, > JMX is just one of the mechanisms to export them. > 2. +1 to not register the new metrics when using the classic consumer, > instead of setting them to 0. Similarly I assume existing metrics that > don't apply to the new consumers are not registered? > 3. At the moment this KIP is not changing any public APIs. What's the > plan to make AsyncKafkaConsumer public? > > Thanks, > Mickael > > > > On Tue, Jul 23, 2024 at 6:03 PM Bruno Cadonna <cado...@apache.org> wrote: > > > > Hi Brenden, > > > > BC1. In his first e-mail Andrew wrote "I would expect that the metrics > > do not exist at all". I agree with him. I think it would be better to > > not add those metrics at all if the CLASSIC protocol is used rather than > > the metrics exist and are all constant 0. This should be possible by not > > adding the metrics to the metrics registry if the CONSUMER protocol is > > not used. > > > > BC2. Is there a specific reason you do not propose > > background-event-queue-time-max and background-event-queue-time-avg? If > > folk think those are not useful we do not need to add them. However, if > > those are not useful, why is background-event-queue-size useful. I was > > just wondering about the asymmetry between background-event-queue and > > application-event-queue. > > > > Best, > > Bruno > > > > > > > > On 7/19/24 9:14 PM, Brenden Deluna wrote: > > > Hi Apoorv, > > > Thank you for your comments, I will address each. > > > > > > AM1. I can see the usefulness in also having an > > > 'application-event-queue-age-max' to get an idea of outliers and how > they > > > may be affecting the average metric. I will add that. > > > > > > AM2. I agree with you there, I think 'time' is a better descriptor here > > > than 'age'. I will update those metric names as well. > > > > > > AM3. Similar to above comments, I will change the name of that metric > to be > > > more consistent. And I think a max metric would also be useful here, > adding > > > that. > > > > > > AM4. Yes, good catch there. Will update that as well. > > > > > > Thank you, > > > Brenden > > > > > > On Fri, Jul 19, 2024 at 8:14 AM Apoorv Mittal < > apoorvmitta...@gmail.com> > > > wrote: > > > > > >> Hi Brendan, > > >> Thanks for the KIP. The metrics are always helpful. > > >> > > >> AM1: Is `application-event-queue-age-avg` enough or do we require ` > > >> application-event-queue-age-max` as well to differentiate with > outliers? > > >> > > >> AM2: The kafka producer defines metric `record-queue-time-avg` which > > >> captures the time spent in the buffer. Do you think we should have a > > >> similar name for `application-event-queue-age-avg` i.e. change to ` > > >> application-event-queue-time-avg`? Moreover other than similar naming, > > >> `time` anyways seems more suitable than `age`, though minor. The > `time` > > >> usage is also aligned with the description of this metric. > > >> > > >> AM3: Metric `application-event-processing-time` says "the average > time, > > >> that the consumer network.....". Shall we have the `-avg` suffix in > the > > >> metric as we have defined for other metrics? Also do we require the > max > > >> metric as well for the same? > > >> > > >> AM4: Is the telemetry name for `unsent-requests-queue-size` intended > > >> as `org.apache.kafka.consumer.unsent.requests.size`, > > >> or it should be corrected to ` > > >> org.apache.kafka.consumer.unsent.requests.queue.size`? > > >> > > >> AM2: > > >> Regards, > > >> Apoorv Mittal > > >> +44 7721681581 > > >> > > >> > > >> On Mon, Jul 15, 2024 at 2:45 PM Andrew Schofield < > > >> andrew_schofi...@live.com> > > >> wrote: > > >> > > >>> Hi Brenden, > > >>> Thanks for the updates. > > >>> > > >>> AS4. I see that you’ve added `.ms` to a bunch of the metrics > reflecting > > >> the > > >>> fact that they’re measured in milliseconds. However, I observe that > most > > >>> metrics > > >>> in Kafka that are measured in milliseconds, with some exceptions in > Kafka > > >>> Connect > > >>> and MirrorMaker do not follow this convention. I would tend to err > on the > > >>> side of > > >>> consistency with the existing metrics and not use `.ms`. However, > that’s > > >>> just my > > >>> opinion, so I’d be interested to know what other reviewers of the KIP > > >>> think. > > >>> > > >>> Thanks, > > >>> Andrew > > >>> > > >>>> On 12 Jul 2024, at 20:11, Brenden Deluna > <bdel...@confluent.io.INVALID > > >>> > > >>> wrote: > > >>>> > > >>>> Hey Lianet, > > >>>> > > >>>> Thank you for your suggestions and feedback! > > >>>> > > >>>> > > >>>> LM1. This has now been addressed. > > >>>> > > >>>> > > >>>> LM2. I think that would be a valuable addition to the current set of > > >>>> metrics, I will get that added. > > >>>> > > >>>> > > >>>> LM3. Again great idea, that would certainly be helpful. Will add > that > > >> as > > >>>> well. > > >>>> > > >>>> > > >>>> Let me know if you have any more suggestions! > > >>>> > > >>>> > > >>>> Thanks, > > >>>> > > >>>> Brenden > > >>>> > > >>>> On Fri, Jul 12, 2024 at 2:11 PM Brenden Deluna < > bdel...@confluent.io> > > >>> wrote: > > >>>> > > >>>>> Hi Lucas, > > >>>>> > > >>>>> Thank you for the feedback! I have addressed your comments: > > >>>>> > > >>>>> > > >>>>> LB1. Good catch there, I will update the names as needed. > > >>>>> > > >>>>> > > >>>>> LB2. Good catch again! I will update the name to be more > consistent. > > >>>>> > > >>>>> > > >>>>> LB3. Thank you for pointing this out, I realized that all metric > > >> values > > >>>>> will actually be set to 0. I will specifiy this and explain why > they > > >>> will > > >>>>> be 0. > > >>>>> > > >>>>> > > >>>>> Nit: This metric is referring to the queue of unsent requests in > the > > >>>>> NetworkClientDelegate. For the metric descriptions I am trying to > not > > >>>>> include too much of the implementation details, hence the reason > that > > >>>>> description is quite short. I cannot think of other ways to > describe > > >> the > > >>>>> metric without going deeper into the implementation, but please do > let > > >>> me > > >>>>> know if you have any ideas. > > >>>>> > > >>>>> > > >>>>> Thank you, > > >>>>> > > >>>>> Brenden > > >>>>> > > >>>>> On Fri, Jul 12, 2024 at 1:27 PM Lianet M. <liane...@gmail.com> > wrote: > > >>>>> > > >>>>>> Hey Brenden, thanks for the KIP! Great to get more visibility into > > >> the > > >>> new > > >>>>>> consumer. > > >>>>>> > > >>>>>> LM1. +1 on Lucas's suggestion for including the unit in the name, > > >> seems > > >>>>>> clearer and consistent (I do see several time metrics including > ms) > > >>>>>> > > >>>>>> LM2. What about a new metric for application-event-queue-time-ms. > It > > >>> would > > >>>>>> be a complement to the application-event-queue-size you're > proposing, > > >>> and > > >>>>>> it will tell us how long the events sit in the queue, waiting to > be > > >>>>>> processed (from the time the API call adds the event to the > queue, to > > >>> the > > >>>>>> time it's processed in the background thread). I find it would be > > >> very > > >>>>>> interesting. > > >>>>>> > > >>>>>> LM3. Thinking about the actual usage of > > >>>>>> "time-between-network-thread-poll-xxx" metric, I imagine it would > be > > >>>>>> helpful to know more about what could be impacting it. As I see > it, > > >> the > > >>>>>> network thread cadence could be mainly impacted by: 1- app event > > >>>>>> processing > > >>>>>> (generate requests), 2- network client poll (actual send/receive). > > >> For > > >>> 2, > > >>>>>> the new consumer reuses the same component as the legacy one, but > 1 > > >> is > > >>>>>> specific to the new consumer, so what about a metric > > >>>>>> for application-event-processing-time-ms (we could consider avg I > > >> would > > >>>>>> say). It would be the time that the network thread takes to > process > > >> all > > >>>>>> available events on each run. > > >>>>>> > > >>>>>> Cheers! > > >>>>>> Lianet > > >>>>>> > > >>>>>> On Fri, Jul 12, 2024 at 1:57 PM Lucas Brutschy > > >>>>>> <lbruts...@confluent.io.invalid> wrote: > > >>>>>> > > >>>>>>> Hey Brenden, > > >>>>>>> > > >>>>>>> thanks for the KIP! These will be great to observe and debug the > > >>>>>>> background thread of the new consumer. > > >>>>>>> > > >>>>>>> LB1. `time-between-network-thread-poll-max` → I see several > similar > > >>>>>>> metrics including the unit in the metric name (ms or us). We > could > > >>>>>>> consider this, although it's probably not strictly required. > > >> However, > > >>>>>>> at least the description should state the unit. Same for the > `avg` > > >>>>>>> version. > > >>>>>>> LB2. `unsent-requests-size` → Naming sounds a bit like it's > > >> referring > > >>>>>>> to the size of the request. How about > `unsent-request-queue-size` or > > >>>>>>> `unsent-request-count` or simply `unsent-requests`? > > >>>>>>> LB3. "the proposed metrics below will be set to null or 0." → > which > > >>>>>>> one will be set to null and which ones will be set to 0, and why? > > >>>>>>> > > >>>>>>> nit: "The current number of unsent requests in the consumer > > >> network" → > > >>>>>>> Seems to be missing something? > > >>>>>>> > > >>>>>>> Cheers, > > >>>>>>> Lucas > > >>>>>>> > > >>>>>>> On Fri, Jul 12, 2024 at 7:28 PM Brenden Deluna > > >>>>>>> <bdel...@confluent.io.invalid> wrote: > > >>>>>>>> > > >>>>>>>> Hi Andrew, > > >>>>>>>> Thank you for the feedback and your question. > > >>>>>>>> > > >>>>>>>> AS1. Great idea, I will get that added. > > >>>>>>>> > > >>>>>>>> AS2. For unsent-events-age-max, age will be calculated once the > > >> event > > >>>>>> is > > >>>>>>>> sent, so you are correct. > > >>>>>>>> > > >>>>>>>> AS3. I agree, I think that would be a helpful metric to add, > thank > > >>>>>> you! I > > >>>>>>>> will get that added. > > >>>>>>>> > > >>>>>>>> Please let me know if you have any additional feedback, > > >> suggestions, > > >>>>>> or > > >>>>>>>> questions. > > >>>>>>>> > > >>>>>>>> Thanks, > > >>>>>>>> Brenden > > >>>>>>>> > > >>>>>>>> On Fri, Jul 12, 2024 at 11:45 AM Andrew Schofield < > > >>>>>>> andrew_schofi...@live.com> > > >>>>>>>> wrote: > > >>>>>>>> > > >>>>>>>>> Hi Brenden, > > >>>>>>>>> Thanks for the KIP. It fills a gap in the metrics for the new > > >>>>>> consumer > > >>>>>>>>> nicely. > > >>>>>>>>> > > >>>>>>>>> AS1. If using the CLASSIC protocol, and thus the > > >>>>>> LegacyKafkaConsumer, > > >>>>>>>>> I would expect that the metrics do not exist at all. Maybe say > > >>>>>>> something > > >>>>>>>>> like > > >>>>>>>>> “These metrics are for the new consumer implementation using > the > > >>>>>>>>> CONSUMER protocol”. > > >>>>>>>>> > > >>>>>>>>> AS2. For unsent-events-age-max, when is the age calculated? For > > >>>>>>> example, > > >>>>>>>>> is it calculated at the time that the unsent event is removed > from > > >>>>>> the > > >>>>>>>>> list and sent, or does the metric reflect unsent events which > are > > >>>>>> still > > >>>>>>>>> enqueued? I suspect the former, but thought I’d check. > > >>>>>>>>> > > >>>>>>>>> AS3. I think that unsent-events-age-avg would also be > interesting > > >> to > > >>>>>>>>> get an idea of how long unsent events tend to sit around before > > >>>>>>> sending. > > >>>>>>>>> Of course, the question from AS2 would also apply to the > average. > > >>>>>>>>> > > >>>>>>>>> Thanks, > > >>>>>>>>> Andrew > > >>>>>>>>> > > >>>>>>>>>> On 10 Jul 2024, at 17:44, Philip Nee > <p...@confluent.io.INVALID> > > >>>>>>> wrote: > > >>>>>>>>>> > > >>>>>>>>>> Hi all, > > >>>>>>>>>> > > >>>>>>>>>> This is the link to the KIP document. > > >>>>>>>>>> > > >>>>>>>>> > > >>>>>>> > > >>>>>> > > >>> > > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1068%3A+New+JMX+metrics+for+the+new+KafkaConsumer > > >>>>>>>>>> > > >>>>>>>>>> Any comment is appreciated, > > >>>>>>>>>> > > >>>>>>>>>> > > >>>>>>>>>> On Tue, Jul 9, 2024 at 10:14 AM Brenden Deluna > > >>>>>>>>> <bdel...@confluent.io.invalid> > > >>>>>>>>>> wrote: > > >>>>>>>>>> > > >>>>>>>>>>> Hello everyone, > > >>>>>>>>>>> > > >>>>>>>>>>> I would like to start the discussion thread for KIP-1068. > This > > >>>>>> is a > > >>>>>>>>>>> relatively small KIP, only proposing to add a couple of new > > >>>>>> metrics. > > >>>>>>>>>>> > > >>>>>>>>>>> If you have any suggestions or feedback, let me know, it > will be > > >>>>>>> much > > >>>>>>>>>>> appreciated. > > >>> > > >>> > > >>> > > >> > > > >