Hi Jun,

See my comments below.

On Thu, Mar 28, 2024 at 11:09 AM Jun Rao <j...@confluent.io.invalid> wrote:
> If I am adding a new voter and it takes a long time (because the new voter
> is catching up), I'd want to know if the request is indeed being processed.
> I thought that's the usage of uncommitted-voter-change.

They can get related information by using the "kafka-metadata-quorum
describe --replication" command (or the log-end-offset metric from
KIP-595). That command (and metric) displays the LEO of all of the
replicas (voters and observers), according to the leader. They can use
that output to discover if the observer they are trying to add is
lagging or is not replicating at all.

When the user runs the command above, they don't know the exact offset
that the new controller needs to reach, but they can roughly estimate
how far behind it is. What do you think? Is this good enough?
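
To make the rough estimate concrete, here is a minimal sketch in
Python. It assumes the LEOs have already been collected into a dict;
the function names and the input shape are hypothetical, and it does
not parse the real command output or metric.

```python
def estimate_lag(log_end_offsets, leader_id):
    """Return how many offsets each replica trails the leader by.

    log_end_offsets is a hypothetical mapping of replica id -> LEO as
    reported by the leader. Collecting it is out of scope here.
    """
    leader_leo = log_end_offsets[leader_id]
    return {
        replica_id: max(leader_leo - leo, 0)
        for replica_id, leo in log_end_offsets.items()
        if replica_id != leader_id
    }


def made_progress(earlier_leo, later_leo):
    """True if a replica's LEO advanced between two samples."""
    return later_leo > earlier_leo


# Example with made-up offsets: replica 4 is the observer being added.
lags = estimate_lag({1: 1000, 2: 1000, 3: 990, 4: 250}, leader_id=1)
```

A lag that shrinks between two runs suggests the new replica is
catching up; an LEO that never moves suggests it is not replicating.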

> Also, I am still not sure about having multiple brokers reporting the same
> metric. For example, if they don't report the same value (e.g. because one
> broker is catching up), how does a user know which value is correct?

They are all correct according to the local view. Here are two
examples of monitors that the user can write:

1. Is there a voter that I need to remove from the quorum? They can
create a monitor that fires if the number-of-offline-voters metric
has been greater than 0 for the past hour.
2. Is there a cluster that doesn't have 3 voters? They can create a
monitor that fires if any replica hasn't reported three for
number-of-voters for the past hour.
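
The two monitors above could be sketched roughly as follows. This is
only an illustration in Python: it assumes the metric samples are
available as (timestamp_seconds, value) pairs, and the function names
are hypothetical rather than part of any monitoring system. It also
treats "for the past hour" as "every sample in the last hour", which
ignores gaps in the samples.

```python
def _recent(samples, now, window_s):
    """Values of the samples that fall inside the trailing window."""
    return [v for t, v in samples if now - window_s <= t <= now]


def offline_voters_fires(samples, now, window_s=3600):
    """Monitor 1: number-of-offline-voters > 0 for the whole window."""
    recent = _recent(samples, now, window_s)
    return bool(recent) and all(v > 0 for v in recent)


def voter_count_fires(samples_by_replica, now, expected=3, window_s=3600):
    """Monitor 2: some replica never reported `expected` voters in the window.

    samples_by_replica maps a replica id to that replica's own
    number-of-voters samples, so each local view is checked separately.
    """
    for samples in samples_by_replica.values():
        recent = _recent(samples, now, window_s)
        if recent and all(v != expected for v in recent):
            return True
    return False
```

Because each replica's samples are evaluated on their own, a replica
that is briefly behind does not trip the monitor unless its local view
stays wrong for the whole hour.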

Is there a specific metric that you have in mind that should only be
reported by the KRaft leader?

Thanks,
-- 
-José
