Re: [DISCUSS] KIP-1190: Add a metric for controller thread idleness

2025-07-31 Thread Jonah Hooper
Thanks for the changes to the KIP; this looks good to me! Best, Jonah On Tue, Jul 22, 2025 at 3:31 PM Mahsa Seifikar wrote: > Hi Kevin, > > You're right. I wanted to change the formula format and forgot to switch it > back to the original formula, which should be `idle_time/total_time`. I've > upda

Re: [DISCUSS] KIP-1190: Add a metric for controller thread idleness

2025-07-22 Thread Mahsa Seifikar
Hi Kevin, You're right. I wanted to change the formula format and forgot to switch it back to the original formula, which should be `idle_time/total_time`. I've updated the KIP. Thanks, Mahsa Seifikar On Tue, Jul 22, 2025 at 2:18 PM Kevin Wu wrote: > Hi Mahsa, > > I see you have the definition of

Re: [DISCUSS] KIP-1190: Add a metric for controller thread idleness

2025-07-22 Thread Kevin Wu
Hi Mahsa, I see you have the definition of the metric value as: controller idle ratio = idle_time/active_time Shouldn't the value for a ratio be: controller idle ratio = idle_time/total_time where total_time = idle_time + active_time? This lines up with the definition you outlined earlier in the
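
To make the difference between the two definitions in the message above concrete, here is a minimal Java sketch. The class and method names (`IdleRatioSketch`, `idleRatio`, `idleToActive`) and the nanosecond units are hypothetical and chosen only for illustration; they are not taken from the KIP or from Kafka.

```java
// Hypothetical helper for illustration only; not part of Kafka or the KIP.
final class IdleRatioSketch {

    /** idle_time / total_time, where total_time = idle_time + active_time. Always in [0, 1]. */
    static double idleRatio(long idleNanos, long activeNanos) {
        if (idleNanos < 0 || activeNanos < 0) {
            throw new IllegalArgumentException("durations must be non-negative");
        }
        long totalNanos = idleNanos + activeNanos;
        // Assumption: report 0.0 when no time has been observed yet.
        return totalNanos == 0 ? 0.0 : (double) idleNanos / totalNanos;
    }

    /** idle_time / active_time: unbounded above, so it is not a fraction of the whole. */
    static double idleToActive(long idleNanos, long activeNanos) {
        return (double) idleNanos / activeNanos;
    }

    public static void main(String[] args) {
        // Same measurements, two definitions.
        System.out.println(idleRatio(75, 25));    // fraction of total time spent idle
        System.out.println(idleToActive(75, 25)); // idle time per unit of active time
    }
}
```

With 75 units idle and 25 units active, the first definition stays within [0, 1] while the second exceeds 1, which is why `idle_time/total_time` reads naturally as a ratio.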

Re: [DISCUSS] KIP-1190: Add a metric for controller thread idleness

2025-07-22 Thread Mahsa Seifikar
Thanks Jonah and Kevin for the feedback. I have updated the KIP accordingly. We ideally want to use something like the TimeRatio type for this metric, similar to how "poll-idle-ratio" is measured in KafkaRaftMetrics. Please let me know if you have any further feedback. Best, Mahsa Seifikar On F
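
As a rough illustration of what a time-ratio style measurement could look like, the sketch below accumulates idle time and reports it as a fraction of the interval since the last reading. This is an assumption about the general idea, not a copy of Kafka's `TimeRatio` class or of the implementation proposed in the KIP; the class `WindowedIdleRatio` and its methods are hypothetical.

```java
// Hypothetical sketch for illustration; this is NOT Kafka's TimeRatio implementation.
final class WindowedIdleRatio {
    private long windowStartMs; // start of the current measurement interval
    private long idleMs;        // idle time accumulated in the current interval

    WindowedIdleRatio(long nowMs) {
        this.windowStartMs = nowMs;
    }

    /** Called by the (hypothetical) event loop each time it finishes waiting for work. */
    void recordIdle(long idleDurationMs) {
        idleMs += idleDurationMs;
    }

    /** Returns idle time / elapsed time for the interval, then starts a new interval. */
    double measureAndReset(long nowMs) {
        long totalMs = nowMs - windowStartMs;
        // Assumption: report 0.0 for an empty interval and clamp the result to 1.0.
        double ratio = totalMs <= 0 ? 0.0 : Math.min(1.0, (double) idleMs / totalMs);
        windowStartMs = nowMs;
        idleMs = 0;
        return ratio;
    }

    public static void main(String[] args) {
        WindowedIdleRatio ratio = new WindowedIdleRatio(0);
        ratio.recordIdle(50);
        System.out.println(ratio.measureAndReset(100)); // idle for half of the first interval
    }
}
```

Resetting on each reading means every reported value describes only the most recent interval rather than the controller's whole lifetime, which is one possible answer to the question of which time window the ratio covers.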

Re: [DISCUSS] KIP-1190: Add a metric for controller thread idleness

2025-07-11 Thread Kevin Wu
Hi Mahsa and Jonah, Since we're adding this new metric to a metrics group that is still using Yammer, ideally I think we want to use RatioGauge to give us the sampling functionality we need. It's possible that we can get similar functionality from Histogram, which I know other Yammer metrics in Ka

Re: [DISCUSS] KIP-1190: Add a metric for controller thread idleness

2025-07-10 Thread Jonah Hooper
Thanks for responding to feedback. Had a few more points. > The idle_time can be part of active_time in specific scenarios - > particularly when the controller has sent records to the Raft layer and is > awaiting commitment confirmation, but not when simply waiting for new > events to enter the q

Re: [DISCUSS] KIP-1190: Add a metric for controller thread idleness

2025-07-10 Thread Mahsa Seifikar
Hi Jonah and Kevin, Thanks for your comments. I have now updated the KIP to address your feedback. Please let me know if you have any further questions. Best, Mahsa Seifikar On Thu, Jul 3, 2025 at 4:40 PM Mahsa Seifikar wrote: > Hello all, > > I wrote a short KIP to add a new metric for contr

Re: Re: [DISCUSS] KIP-1190: Add a metric for controller thread idleness

2025-07-10 Thread Mahsa Seifikar
Hi Jonah and Kevin, Thanks for your comments. I have now updated the KIP to address your feedback. Please let me know if you have any further questions. Best, Mahsa Seifikar On Thu, Jul 10, 2025 at 10:45 AM Kevin Wu wrote: > Hi Mahsa, > > Thanks for the KIP. I think there was an issue with my

RE: Re: [DISCUSS] KIP-1190: Add a metric for controller thread idleness

2025-07-10 Thread Kevin Wu
Hi Mahsa, Thanks for the KIP. I think there was an issue with my original reply since it is not showing up on the thread. Trying again. In the Motivation section, can we state why the current metrics involving the controller's event queue thread -- time spent in the queue and process time -- are

Re: [DISCUSS] KIP-1190: Add a metric for controller thread idleness

2025-07-07 Thread Jonah Hooper
Thanks for the KIP, Mahsa. Have one initial question: > The ratio of time the controller thread is idle relative to the total time > (idle+active). How are the active and idle times calculated? Is it in total over the time period in which the controller is active? Or is there a specific window pe

RE: [DISCUSS] KIP-1190: Add a metric for controller thread idleness

2025-07-07 Thread Kevin Wu
Hi Mahsa, Thanks for the KIP. In the Motivation section, can we state why the current metrics involving the controller's event queue thread -- time spent in the queue and process time -- are not sufficient? Can we also match the naming style of those other event queue metrics for consistency (i.e