Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-06-12 Thread Federico Valeri
Hi Luke, thanks for the KIP. I think we are missing the "dir" key in the "remainingLogsToRecover" ObjectName.
kafka.log:type=LogManager,name=remainingLogsToRecover,dir=([-._\/\w\d\s]+)
kafka.log:type=LogManager,name=remainingSegmentsToRecover,dir=([-._\/\w\d\s]+),threadNum=([0-9]+)
Example: Broker configu
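
As a rough illustration of the two ObjectName patterns quoted in this message, the sketch below matches metric names against them and pulls out the "dir" and "threadNum" keys. The regular expressions are copied from the message; the helper name parse_recovery_metric and the directory /tmp/kafka-logs-1 are hypothetical and appear only for illustration, not in the KIP.

```python
import re

# Patterns as quoted in the message above (dots escaped for literal matching).
LOGS_PATTERN = re.compile(
    r"kafka\.log:type=LogManager,name=remainingLogsToRecover,"
    r"dir=([-._\/\w\d\s]+)"
)
SEGMENTS_PATTERN = re.compile(
    r"kafka\.log:type=LogManager,name=remainingSegmentsToRecover,"
    r"dir=([-._\/\w\d\s]+),threadNum=([0-9]+)"
)


def parse_recovery_metric(object_name):
    """Hypothetical helper: return (metric, dir, thread_num) or None.

    thread_num is None for the per-directory remainingLogsToRecover metric.
    """
    match = SEGMENTS_PATTERN.fullmatch(object_name)
    if match:
        return ("remainingSegmentsToRecover", match.group(1), int(match.group(2)))
    match = LOGS_PATTERN.fullmatch(object_name)
    if match:
        return ("remainingLogsToRecover", match.group(1), None)
    return None


# Example names; /tmp/kafka-logs-1 is a made-up log directory.
logs_name = (
    "kafka.log:type=LogManager,name=remainingLogsToRecover,"
    "dir=/tmp/kafka-logs-1"
)
segments_name = (
    "kafka.log:type=LogManager,name=remainingSegmentsToRecover,"
    "dir=/tmp/kafka-logs-1,threadNum=0"
)
print(parse_recovery_metric(logs_name))
print(parse_recovery_metric(segments_name))
```

Because the "dir" character class excludes commas, the directory value stops at the next key, which is what lets the per-thread name carry "threadNum" after it.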

Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-06-03 Thread Luke Chen
Hi Jun, Thanks for the comment. I've updated the KIP as: 1. remainingLogsToRecover*y* -> remainingLogsToRecover 2. remainingSegmentsToRecover*y* -> remainingSegmentsToRecover 3. The description of remainingSegmentsToRecover: The remaining segments for the current log assigned to the recovery thre

Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-06-03 Thread Jun Rao
Hi, Luke, Thanks for the explanation. 10. It makes sense to me now. Instead of using a longer name, perhaps we could keep the current name, but make the description clear that it's the remaining segments for the current log assigned to a thread. Also, would it be better to use ToRecover instead o

Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-06-03 Thread Luke Chen
Hi Jun, > how do we implement kafka.log:type=LogManager,name=remainingSegmentsToRecovery,dir=([-._\/\w\d\s]+),threadNum=([0-9]+), which tracks at the segment level? It looks like the name is misleading. Suppose we have 2 log recovery threads (num.recovery.threads.per.data.dir=2), and 10 logs to
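
The preview above is cut off, so the following is only a toy model of the scenario it starts to describe, not Kafka's implementation. It assumes that each recovery thread works on one log at a time, that the per-thread value is the number of segments left in that thread's current log (the description agreed later in the thread), and that the per-directory log count drops when a log finishes. The function name simulate_recovery and the round-based stepping are hypothetical.

```python
from collections import deque


def simulate_recovery(segments_per_log, num_threads):
    """Toy model: return a list of (remaining_logs, per_thread_segments) snapshots.

    segments_per_log: segment count of each log (each assumed >= 1).
    per_thread_segments: {thread_num: segments left in that thread's current log}.
    """
    pending = deque(segments_per_log)
    remaining_logs = len(segments_per_log)
    current = {}

    # Hand each thread its first log, if any are available.
    for thread_num in range(num_threads):
        if pending:
            current[thread_num] = pending.popleft()

    snapshots = []
    while current:
        snapshots.append((remaining_logs, dict(current)))
        for thread_num in list(current):
            current[thread_num] -= 1  # this thread recovers one segment
            if current[thread_num] == 0:
                remaining_logs -= 1  # assumption: count drops when a log is done
                if pending:
                    current[thread_num] = pending.popleft()
                else:
                    del current[thread_num]
    snapshots.append((remaining_logs, dict(current)))
    return snapshots


# 2 recovery threads and 10 single-segment logs, as in the scenario above.
for remaining_logs, per_thread in simulate_recovery([1] * 10, num_threads=2):
    print(remaining_logs, per_thread)
```

In this model the per-thread numbers never add up to a directory-wide segment total, which is one way to read the point that the metric describes only the current log of each thread.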

Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-06-02 Thread Jun Rao
Hi, Luke, Thanks for the reply. 10. You are saying it's difficult to track the number of segments to recover. But how do we implement kafka.log:type=LogManager,name=remainingSegmentsToRecovery,dir=([-._\/\w\d\s]+),threadNum=([0-9]+), which tracks at the segment level? Jun On Thu, Jun 2, 2022 a

Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-06-02 Thread Luke Chen
Hi Jun, Thanks for the comment. Yes, I tried tracking the number of remaining segments this way, but it would change the design of UnifiedLog, so I only track the number of logs. Currently, we load all segments and recover those segments if needed "during creating UnifiedLog instan

Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-05-25 Thread Jun Rao
Hi, Luke, Thanks for the KIP. Just one comment. 10. For kafka.log:type=LogManager,name=remainingLogsToRecovery, could we instead track the number of remaining segments? This monitors the progress at a finer granularity and is also consistent with the thread level metric. Thanks, Jun On Wed, Ma

Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-05-25 Thread Tom Bentley
Thanks Luke! LGTM. On Sun, 22 May 2022 at 05:18, Luke Chen wrote: > Hi Tom and Raman, > > Thanks for your comments. > > > 1. There's not a JIRA for this KIP (or the JIRA link needs updating). > 2. Similarly the link to this discussion thread needs updating. > > Please update the links to JIRA an

Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-05-21 Thread Luke Chen
Hi Tom and Raman, Thanks for your comments. > 1. There's not a JIRA for this KIP (or the JIRA link needs updating). 2. Similarly the link to this discussion thread needs updating. > Please update the links to JIRA and the discussion thread. Yes, thanks for the reminder. I've updated the KIP. >

Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-05-19 Thread Raman Verma
Hi Luke, The change is useful and simple. Thanks. Please update the links to JIRA and the discussion thread. Best Regards, Raman Verma On Thu, May 19, 2022 at 8:57 AM Tom Bentley wrote: > > Hi Luke, > > Thanks for the KIP. I think the idea makes sense and would provide useful > observability of

Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-05-19 Thread Tom Bentley
Hi Luke, Thanks for the KIP. I think the idea makes sense and would provide useful observability of log recovery. I have a few comments. 1. There's not a JIRA for this KIP (or the JIRA link needs updating). 2. Similarly the link to this discussion thread needs updating. 3. I wonder whether we nee

Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-05-11 Thread Luke Chen
> And if people start using RemainingLogs and RemainingSegments and then REALLY FEEL like they need RemainingBytes, then we can always add it in the future. +1 Thanks James! Luke On Wed, May 11, 2022 at 3:57 PM James Cheng wrote: > Hi Luke, > > Thanks for the detailed explanation. I agree that

Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-05-11 Thread James Cheng
Hi Luke, Thanks for the detailed explanation. I agree that the current proposal of RemainingLogs and RemainingSegments will greatly improve the situation, and that we can go ahead with the KIP as is. If RemainingBytes were straight-forward to implement, then I’d like to have it. But we can liv

Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-05-10 Thread Luke Chen
Hi James and all, I checked again and I can see that when creating UnifiedLog, we expect the logs/indexes/snapshots to be in a good state. So, I don't think we should break the current design to expose the `RemainingBytesToRecovery` metric. If there are no other comments, I'll start a vote within this we

Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-05-06 Thread Luke Chen
Hi James, Thanks for your input. For the `RemainingBytesToRecovery` metric proposal, I think there's one thing I didn't make clear. Currently, when the log manager starts up, we try to load all logs (segments), and during log loading, we try to recover logs if necessary. And the logs load

Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-05-04 Thread James Cheng
Hi Luke, Thanks for adding RemainingSegmentsToRecovery. Another thought: different topics can have different segment sizes. I don't know how common it is, but it is possible. Some topics might want small segment sizes for more granular expiration of data. The downside of RemainingLogsToRecovery
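
To make the segment-size concern concrete, here is a small calculation with made-up numbers. Two directories report the same log and segment counts while holding very different amounts of data. The remaining_totals helper and all sizes are hypothetical, and the byte total corresponds only to the RemainingBytes idea discussed in this thread, which the later messages decide not to add.

```python
MIB = 1024 * 1024


def remaining_totals(logs):
    """Hypothetical helper: logs is a list of logs, each a list of segment sizes in bytes.

    Returns (remaining_logs, remaining_segments, remaining_bytes).
    """
    remaining_logs = len(logs)
    remaining_segments = sum(len(segments) for segments in logs)
    remaining_bytes = sum(sum(segments) for segments in logs)
    return remaining_logs, remaining_segments, remaining_bytes


# One log with four small segments vs. one log with four large segments.
small_segment_dir = [[16 * MIB] * 4]
large_segment_dir = [[1024 * MIB] * 4]

print(remaining_totals(small_segment_dir))
print(remaining_totals(large_segment_dir))
```

Both directories show 1 remaining log and 4 remaining segments, yet the second holds 64 times as many bytes, so count-based metrics alone cannot distinguish them.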

Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-05-04 Thread Luke Chen
Hi devs, If there are no other comments, I'll start a vote tomorrow. Thank you. Luke On Sun, May 1, 2022 at 5:08 PM Luke Chen wrote: > Hi James, > > Sorry for the late reply. > > Yes, this is a good point, to know how many segments to be recovered if > there are some large partitions. > I've u

Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-05-01 Thread Luke Chen
Hi James, Sorry for the late reply. Yes, this is a good point: it helps to know how many segments remain to be recovered if there are some large partitions. I've updated the KIP to add a `*RemainingSegmentsToRecover*` metric for each log recovery thread, to show the value. The example in the Proposed section he

Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-04-22 Thread James Cheng
The KIP describes RemainingLogsToRecovery, which seems to be the number of partitions in each log.dir. We have some partitions which are much much larger than others. Those large partitions have many many more segments than others. Is there a way the metric can reflect partition size? Could i

[DISCUSS] KIP-831: Add metric for log recovery progress

2022-04-20 Thread Luke Chen
Hi all, I'd like to propose a KIP to expose a metric for log recovery progress. This metric would give admins a way to monitor the log recovery progress. Details can be found here: https://cwiki.apache.org/confluence/display/KAFKA/KIP-831%3A+Add+metric+for+log+recovery+progress Any feedba