Hi Jun and Chia,
Thanks for the feedback.

10. Done.

11. I have moved FetchLockTimeMs to Histogram. I didn't understand
which ratio shall FetchLockRatio Metric should hold i.e. against what value
we should compare the partitions locked. Though I have added Metric
RequestTopicPartitionsFetchRatio which provides a ratio of partition locked
vs requested. Shall we have something else as well?

12. Yeah, true. Done.

AS0: Make sense, I have moved to group, topic, partition tags as like
Kafka.cluster Partition metrics.

AS1: Successful AcknowledgementRequests are also important and they are
already defined in KIP-932 broker metrics. We shall change the metric name
for successful AcknowledgementRequests in KIP-932 itself while implementing
these metrics.

Regards,
Apoorv Mittal


On Thu, Nov 7, 2024 at 7:36 PM Chia-Ping Tsai <chia7...@apache.org> wrote:

> hi Apoorv
>
> sorry for late response:
>
> AS0: How can users distinguish between the group and topic in the format
> share-partition={group-topic-partition} since `-` is a valid character in
> both group and topic?
>
> AS1: Why do we only have metrics for failed AcknowledgementRequests?
> Aren't successful AcknowledgementRequests also important for metrics?
>
> Thanks,
> Chia-Ping
>
> On 2024/11/06 13:58:08 Apoorv Mittal wrote:
> > Hi All,
> > Please let me know if there is any other feedback. Else will start a
> voting
> > thread tomorrow.
> >
> > Regards,
> > Apoorv Mittal
> >
> >
> > On Tue, Nov 5, 2024 at 12:21 PM Apoorv Mittal <apoorvmitta...@gmail.com>
> > wrote:
> >
> > > Thanks for the feedback Andrew.
> > >
> > > AS9: Yes, you are correct it's topic name and not topic id.
> > >
> > > AS10: Thanks, I have updated the description.
> > >
> > > Regards,
> > > Apoorv Mittal
> > >
> > >
> > > On Tue, Nov 5, 2024 at 10:56 AM Andrew Schofield <
> > > andrew_schofield_j...@outlook.com> wrote:
> > >
> > >> Hi Apoorv,
> > >> A couple of additional comments.
> > >>
> > >> AS9: The tag share-partition : {group-topic-partition} confirm the
> > >> format. I suppose it's
> > >> one tag `group-id:topic-name:partition`, and not topic ID. Topic ID is
> > >> clearly more
> > >> authoritative, but I don't think it's usable in metrics.
> > >>
> > >> AS10: The description of the AcquisitionLockTimeoutMs would be better
> as
> > >> "Tracks the number of acquisition locks for records which are not
> > >> acknowledged within the timeout."
> > >>
> > >> Thanks,
> > >> Andrew
> > >>
> > >> ________________________________________
> > >> From: Apoorv Mittal <apoorvmitta...@gmail.com>
> > >> Sent: 05 November 2024 08:59
> > >> To: dev@kafka.apache.org <dev@kafka.apache.org>
> > >> Subject: Re: [DISCUSS] KIP-1103: Additional metrics for cooperative
> > >> consumption
> > >>
> > >> Hi All,
> > >> Please let me know if there is any other feedback or suggestions.
> > >>
> > >> Regards,
> > >> Apoorv Mittal
> > >>
> > >>
> > >> On Fri, Nov 1, 2024 at 5:30 PM Apoorv Mittal <
> apoorvmitta...@gmail.com>
> > >> wrote:
> > >>
> > >> > Thanks Jun and Andrew for reviewing and feedback.
> > >> >
> > >> > J1: I was thinking of having metrics as per the APIs in
> > >> > SharePartitionManager but I think you are right. This is something
> not
> > >> > beneficial to track as the release should only happen when a client
> > >> sends a
> > >> > leave group request. I have removed this metric.
> > >> >
> > >> > J2: Moved to Histogram.
> > >> >
> > >> > J3: Moved to Meter. I have removed
> RequestTopicPartitionsFetchEmptyCount
> > >> > from KIP (as mentioned by Andrew - AS8) as
> > >> RequestTopicPartitionsFetchRatio
> > >> > moved to Histogram. Also I have marked InFlightMessageCount as
> Meter, so
> > >> > shall capture the rate better.
> > >> >
> > >> > AS0: Done, specified specifically in the Motivation section.
> > >> >
> > >> > AS1: Hmmm, I thought about that earlier and was thinking about
> keeping
> > >> the
> > >> > abstracted share fetch APIs metrics in ShareGroupMetrics itself.
> > >> However I
> > >> > think the Share Fetch and Share Acknowledge can be part of
> > >> BrokerTopicMetrics
> > >> > as these operations happen by Share Consumer. Do you think we should
> > >> also
> > >> > move the RequestTopicPartitionsFetchRatio metric? I think we
> shouldn't
> > >> and
> > >> > it should be in ShareGroupMetrics itself, please let me know what
> you
> > >> think.
> > >> >
> > >> > AS2: Yes you are right, it can support all topic rate as well. I
> have
> > >> > added the suggestion.
> > >> >
> > >> > AS3: I have removed this metric as part of J1 comment.
> > >> >
> > >> > AS4,AS6: We already have a share-acknowledgement Meter metric
> defined in
> > >> > KIP-932. I am of the opinion to move the KIP-932
> share-acknowledgement
> > >> > metric to TotalShareAcknowledgementRequestsPerSec. Hence I have
> added
> > >> > corresponding FailedShareAcknowledgementRequestsPerSec in this KIP.
> > >> >
> > >> > AS5: It counts the failure of a share acknowledge request. Similar
> to
> > >> > multi topic partition fetch, acknowledge can also be multi topic
> > >> partition
> > >> > but a single acknowledge request might either fail completely or at
> > >> topic
> > >> > partition level. Hence this metric tracks the failure as similar to
> > >> > FailedFetchRequestsPerSec or FailedShareFetchRequestsPerSec.
> > >> >
> > >> > AS7: I have added some examples and another metric FetchLockTimeMs,
> > >> which
> > >> > should help in optimizing behaviour.
> > >> >
> > >> > AS8: Yes, we might be able to work with low percentiles to
> investigate
> > >> > further hence I have removed RequestTopicPartitionsFetchEmptyCount.
> > >> > However Histogram will give us percentiles but not the raw bucketed
> > >> > value, in case we think we need such data explicitly then might
> consider
> > >> > adding it but for now it's removed.
> > >> >
> > >> >
> > >> > Regards,
> > >> > Apoorv Mittal
> > >> >
> > >> >
> > >> > On Thu, Oct 31, 2024 at 1:03 PM Andrew Schofield <
> > >> > andrew_schofield_j...@outlook.com> wrote:
> > >> >
> > >> >> Hi Apoorv,
> > >> >> Thanks for the KIP. I have some comments.
> > >> >>
> > >> >> AS0: These are broker metrics and not client metrics. I don't
> think the
> > >> >> KIP states that and it probably should.
> > >> >>
> > >> >> AS1: TotalShareFetchRequestsPerSec is inspired by the existing
> metric
> > >> >> TotalFetchRequestsPerSec. That one is in the BrokerTopicMetrics
> group
> > >> >> so I would expect the new metric to be in the same group. This
> comment
> > >> >> applies to all of the ShareGroupMetrics broker metrics scoped to a
> > >> topic.
> > >> >>
> > >> >> AS2: Similarly to TotalFetchRequestPerSec, does the new metric
> support
> > >> >> omitting the topic tag in order to obtain the all-topic rate? I
> > >> believe so
> > >> >> because it follows the existing metrics. I guess the tag is
> actually
> > >> >> `topic=([-.\w]+)` to copy the existing notation.
> > >> >>
> > >> >> AS3: What's a share release? Does it count releases which are made
> > >> >> by ShareFetch/Acknowledge requests as well as implicit ones as a
> > >> >> result of share session expiration or closure?
> > >> >>
> > >> >> AS4: I see there is FailedShareAcknowledgementRequestsPerSec.
> > >> >> If this corresponds specifically to the ShareAcknowledge API, then
> > >> >> the name should be FailedShareAcknowledgeRequestsPerSec.
> > >> >>
> > >> >> AS5: What does FailedShareAcknowledgementRequestsPerSec
> > >> >> actually count? This is a per-topic metric and a ShareAcknowledge
> > >> >> can mention several topics, but actually it can also operate on
> many
> > >> >> acquired records specifying any of the different acknowledgement
> > >> >> types that KIP-932 defines.
> > >> >>
> > >> >> AS6: Why no TotalShareAcknowledgementRequestsPerSec?
> > >> >>
> > >> >> AS7: I think I like the concept behind
> > >> RequestTopicPartitionsFetchRatio,
> > >> >> but it would really help to include some examples of how it works.
> You
> > >> >> mention 4 potential causes for low ratios, so it would be nice to
> see
> > >> how
> > >> >> each of them would be evident from the metrics and configurations.
> > >> >> This is going to be valuable information for operating this in
> > >> production.
> > >> >>
> > >> >> AS8: As Jun mentioned, maybe a histogram would be appropriate
> > >> >> for RequestTopicPartitionsFetchRatio. Then I wonder whether it's
> > >> >> worth having a separate RequestTopicPartitionsFetchEmptyCount since
> > >> >> those are requests for which the fetch ratio was 0.
> > >> >>
> > >> >> Thanks,
> > >> >> Andrew
> > >> >> ________________________________________
> > >> >> From: Jun Rao <j...@confluent.io.INVALID>
> > >> >> Sent: 30 October 2024 23:07
> > >> >> To: dev@kafka.apache.org <dev@kafka.apache.org>
> > >> >> Subject: Re: [DISCUSS] KIP-1103: Additional metrics for cooperative
> > >> >> consumption
> > >> >>
> > >> >> Hi, Apoorv,
> > >> >>
> > >> >> Thanks for the KIP. A few comments.
> > >> >>
> > >> >> 1. TotalShareReleaseRequestsPerSec: This is a bit weird since
> there is
> > >> >> no ShareReleaseRequest.
> > >> >>
> > >> >> 2. RequestTopicPartitionsFetchRatio and InFlightBatchMessageCount:
> It
> > >> >> seems
> > >> >> Histogram is more appropriate for them.
> > >> >>
> > >> >> 3. RequestTopicPartitionsFetchEmptyCount and
> > >> AcquisitionLockTimeoutCount:
> > >> >> The problem with using gauge is that if the value changes quickly,
> we
> > >> may
> > >> >> not be able to capture the occurrence of empty fetch or lock
> timeout.
> > >> >> Meter
> > >> >> captures those rare events better.
> > >> >>
> > >> >> Jun
> > >> >>
> > >> >>
> > >> >> On Tue, Oct 29, 2024 at 8:48 AM Apoorv Mittal <
> > >> apoorvmitta...@gmail.com>
> > >> >> wrote:
> > >> >>
> > >> >> > Hi Everyone,
> > >> >> > I would like to start a discussion on KIP-1103:
> > >> >> >
> > >> >> >
> > >> >>
> > >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1103%3A+Additional+metrics+for+cooperative+consumption
> > >> >> >
> > >> >> > This KIP extends KIP-932 to provide additional metrics for
> > >> >> > Queues/Cooperative consumption.
> > >> >> >
> > >> >> > Regards,
> > >> >> > Apoorv Mittal
> > >> >> >
> > >> >>
> > >> >
> > >>
> > >
> >
>

Reply via email to