Hi All, Please let me know if there is any other feedback. Else will start a voting thread tomorrow.
Regards, Apoorv Mittal On Tue, Nov 5, 2024 at 12:21 PM Apoorv Mittal <apoorvmitta...@gmail.com> wrote: > Thanks for the feedback Andrew. > > AS9: Yes, you are correct it's topic name and not topic id. > > AS10: Thanks, I have updated the description. > > Regards, > Apoorv Mittal > > > On Tue, Nov 5, 2024 at 10:56 AM Andrew Schofield < > andrew_schofield_j...@outlook.com> wrote: > >> Hi Apoorv, >> A couple of additional comments. >> >> AS9: The tag share-partition : {group-topic-partition} confirm the >> format. I suppose it's >> one tag `group-id:topic-name:partition`, and not topic ID. Topic ID is >> clearly more >> authoritative, but I don't think it's usable in metrics. >> >> AS10: The description of the AcquisitionLockTimeoutMs would be better as >> "Tracks the number of acquisition locks for records which are not >> acknowledged within the timeout." >> >> Thanks, >> Andrew >> >> ________________________________________ >> From: Apoorv Mittal <apoorvmitta...@gmail.com> >> Sent: 05 November 2024 08:59 >> To: dev@kafka.apache.org <dev@kafka.apache.org> >> Subject: Re: [DISCUSS] KIP-1103: Additional metrics for cooperative >> consumption >> >> Hi All, >> Please let me know if there is any other feedback or suggestions. >> >> Regards, >> Apoorv Mittal >> >> >> On Fri, Nov 1, 2024 at 5:30 PM Apoorv Mittal <apoorvmitta...@gmail.com> >> wrote: >> >> > Thanks Jun and Andrew for reviewing and feedback. >> > >> > J1: I was thinking of having metrics as per the APIs in >> > SharePartitionManager but I think you are right. This is something not >> > beneficial to track as the release should only happen when a client >> sends a >> > leave group request. I have removed this metric. >> > >> > J2: Moved to Histogram. >> > >> > J3: Moved to Meter. I have removed RequestTopicPartitionsFetchEmptyCount >> > from KIP (as mentioned by Andrew - AS8) as >> RequestTopicPartitionsFetchRatio >> > moved to Histogram. Also I have marked InFlightMessageCount as Meter, so >> > shall capture the rate better. >> > >> > AS0: Done, specified specifically in the Motivation section. >> > >> > AS1: Hmmm, I thought about that earlier and was thinking about keeping >> the >> > abstracted share fetch APIs metrics in ShareGroupMetrics itself. >> However I >> > think the Share Fetch and Share Acknowledge can be part of >> BrokerTopicMetrics >> > as these operations happen by Share Consumer. Do you think we should >> also >> > move the RequestTopicPartitionsFetchRatio metric? I think we shouldn't >> and >> > it should be in ShareGroupMetrics itself, please let me know what you >> think. >> > >> > AS2: Yes you are right, it can support all topic rate as well. I have >> > added the suggestion. >> > >> > AS3: I have removed this metric as part of J1 comment. >> > >> > AS4,AS6: We already have a share-acknowledgement Meter metric defined in >> > KIP-932. I am of the opinion to move the KIP-932 share-acknowledgement >> > metric to TotalShareAcknowledgementRequestsPerSec. Hence I have added >> > corresponding FailedShareAcknowledgementRequestsPerSec in this KIP. >> > >> > AS5: It counts the failure of a share acknowledge request. Similar to >> > multi topic partition fetch, acknowledge can also be multi topic >> partition >> > but a single acknowledge request might either fail completely or at >> topic >> > partition level. Hence this metric tracks the failure as similar to >> > FailedFetchRequestsPerSec or FailedShareFetchRequestsPerSec. >> > >> > AS7: I have added some examples and another metric FetchLockTimeMs, >> which >> > should help in optimizing behaviour. >> > >> > AS8: Yes, we might be able to work with low percentiles to investigate >> > further hence I have removed RequestTopicPartitionsFetchEmptyCount. >> > However Histogram will give us percentiles but not the raw bucketed >> > value, in case we think we need such data explicitly then might consider >> > adding it but for now it's removed. >> > >> > >> > Regards, >> > Apoorv Mittal >> > >> > >> > On Thu, Oct 31, 2024 at 1:03 PM Andrew Schofield < >> > andrew_schofield_j...@outlook.com> wrote: >> > >> >> Hi Apoorv, >> >> Thanks for the KIP. I have some comments. >> >> >> >> AS0: These are broker metrics and not client metrics. I don't think the >> >> KIP states that and it probably should. >> >> >> >> AS1: TotalShareFetchRequestsPerSec is inspired by the existing metric >> >> TotalFetchRequestsPerSec. That one is in the BrokerTopicMetrics group >> >> so I would expect the new metric to be in the same group. This comment >> >> applies to all of the ShareGroupMetrics broker metrics scoped to a >> topic. >> >> >> >> AS2: Similarly to TotalFetchRequestPerSec, does the new metric support >> >> omitting the topic tag in order to obtain the all-topic rate? I >> believe so >> >> because it follows the existing metrics. I guess the tag is actually >> >> `topic=([-.\w]+)` to copy the existing notation. >> >> >> >> AS3: What's a share release? Does it count releases which are made >> >> by ShareFetch/Acknowledge requests as well as implicit ones as a >> >> result of share session expiration or closure? >> >> >> >> AS4: I see there is FailedShareAcknowledgementRequestsPerSec. >> >> If this corresponds specifically to the ShareAcknowledge API, then >> >> the name should be FailedShareAcknowledgeRequestsPerSec. >> >> >> >> AS5: What does FailedShareAcknowledgementRequestsPerSec >> >> actually count? This is a per-topic metric and a ShareAcknowledge >> >> can mention several topics, but actually it can also operate on many >> >> acquired records specifying any of the different acknowledgement >> >> types that KIP-932 defines. >> >> >> >> AS6: Why no TotalShareAcknowledgementRequestsPerSec? >> >> >> >> AS7: I think I like the concept behind >> RequestTopicPartitionsFetchRatio, >> >> but it would really help to include some examples of how it works. You >> >> mention 4 potential causes for low ratios, so it would be nice to see >> how >> >> each of them would be evident from the metrics and configurations. >> >> This is going to be valuable information for operating this in >> production. >> >> >> >> AS8: As Jun mentioned, maybe a histogram would be appropriate >> >> for RequestTopicPartitionsFetchRatio. Then I wonder whether it's >> >> worth having a separate RequestTopicPartitionsFetchEmptyCount since >> >> those are requests for which the fetch ratio was 0. >> >> >> >> Thanks, >> >> Andrew >> >> ________________________________________ >> >> From: Jun Rao <j...@confluent.io.INVALID> >> >> Sent: 30 October 2024 23:07 >> >> To: dev@kafka.apache.org <dev@kafka.apache.org> >> >> Subject: Re: [DISCUSS] KIP-1103: Additional metrics for cooperative >> >> consumption >> >> >> >> Hi, Apoorv, >> >> >> >> Thanks for the KIP. A few comments. >> >> >> >> 1. TotalShareReleaseRequestsPerSec: This is a bit weird since there is >> >> no ShareReleaseRequest. >> >> >> >> 2. RequestTopicPartitionsFetchRatio and InFlightBatchMessageCount: It >> >> seems >> >> Histogram is more appropriate for them. >> >> >> >> 3. RequestTopicPartitionsFetchEmptyCount and >> AcquisitionLockTimeoutCount: >> >> The problem with using gauge is that if the value changes quickly, we >> may >> >> not be able to capture the occurrence of empty fetch or lock timeout. >> >> Meter >> >> captures those rare events better. >> >> >> >> Jun >> >> >> >> >> >> On Tue, Oct 29, 2024 at 8:48 AM Apoorv Mittal < >> apoorvmitta...@gmail.com> >> >> wrote: >> >> >> >> > Hi Everyone, >> >> > I would like to start a discussion on KIP-1103: >> >> > >> >> > >> >> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1103%3A+Additional+metrics+for+cooperative+consumption >> >> > >> >> > This KIP extends KIP-932 to provide additional metrics for >> >> > Queues/Cooperative consumption. >> >> > >> >> > Regards, >> >> > Apoorv Mittal >> >> > >> >> >> > >> >