Thanks for the review, David.

Here are the answers to your questions. I will update the KIP to make the info 
clearer.

> 1) Does "publisher-error-count" represent the number of errors
> encountered only when loading the most recent image? Or will this value be
> the cumulative number of publisher errors since the broker started?
> 2) Same question for "listener-batch-load-error-count"
The intent is for both of these to be cumulative counts since the broker 
started. The rationale is that any fault in loading an image (even if a 
subsequent load was OK) is worthy of inspection. It would be good to have a way 
to bring the count back down to zero through an operator-initiated signal, but 
that could be a follow-up.
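
To make the cumulative semantics concrete, here is a rough sketch. It is only 
an illustration, not the KIP's implementation: the class and method names 
(CumulativeErrorCounter, recordLoad, reset) are made up for this example, and 
reset() stands in for the possible operator-initiated follow-up.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of a cumulative error count; not Kafka code.
final class CumulativeErrorCounter {
    private final AtomicLong errors = new AtomicLong();

    // Called once per image/batch load. A later successful load does NOT
    // clear the count, so earlier faults stay visible to the operator.
    void recordLoad(boolean succeeded) {
        if (!succeeded) {
            errors.incrementAndGet();
        }
    }

    // Value that would be reported by the metric.
    long count() {
        return errors.get();
    }

    // Assumed follow-up: an operator-initiated signal that zeroes the count.
    void reset() {
        errors.set(0);
    }
}
```

With this shape, a failed load followed by a successful one still reports 1 
until the process restarts or reset() is invoked.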

> 3) Will ForceRenounceCount be zero for non-leader controllers? Or will this
> value remain between elections and only get reset to zero upon a restart
I think it makes sense to keep these metrics for all controllers in the system. 
A forced resignation is usually looked at after it has happened, and at that 
point, the controller might not be the leader anymore.
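
As a sketch of what I mean (again with made-up names, ControllerFaultStats and 
its methods are not from the KIP or the Kafka code base), the count would 
survive the loss of leadership and only go back to zero on a restart:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration; not the actual controller implementation.
final class ControllerFaultStats {
    private final AtomicLong forceRenounceCount = new AtomicLong();
    private volatile boolean leader = false;

    void becomeLeader() {
        leader = true;
    }

    // A forced resignation bumps the count and drops leadership.
    // The count is deliberately not cleared when leadership is lost.
    void forceRenounce() {
        if (leader) {
            forceRenounceCount.incrementAndGet();
            leader = false;
        }
    }

    boolean isLeader() {
        return leader;
    }

    // Readable on any controller, leader or not.
    long forceRenounceCount() {
        return forceRenounceCount.get();
    }
}
```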

> On Jul 27, 2022, at 11:39 AM, David Arthur 
> <david.art...@confluent.io.invalid> wrote:
> 
> Thanks for the KIP, Niket! I definitely agree we need to surface metadata
> processing errors to the operator. I have some questions about the
> semantics of the new metrics:
> 
> 1) Does "publisher-error-count" represent the number of errors
> encountered only when loading the most recent image? Or will this value be
> the cumulative number of publisher errors since the broker started?
> 2) Same question for "listener-batch-load-error-count"
> 3) Will ForceRenounceCount be zero for non-leader controllers? Or will this
> value remain between elections and only get reset to zero upon a restart
> 
> Thanks!
> David
> 
> On Wed, Jul 27, 2022 at 2:20 PM Niket Goel <ng...@confluent.io.invalid>
> wrote:
> 
>> 
>> Hi all,
>> 
>> I would like to start a discussion on adding some new metrics to KRaft to
>> allow for better visibility into log processing errors.
>> 
>> KIP URL:
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-859%3A+Add+Metadata+Log+Processing+Error+Related+Metrics
>> 
>> Thanks!
>> Niket
>> 
>> 
> 
> -- 
> -David
