Hi Jun,

See my comments below.

On Thu, Mar 28, 2024 at 11:09 AM Jun Rao <j...@confluent.io.invalid> wrote:
> If I am adding a new voter and it takes a long time (because the new voter
> is catching up), I'd want to know if the request is indeed being processed.
> I thought that's the usage of uncommitted-voter-change.

They can get related information by using the "kafka-metadata describe --replication" command (or the log-end-offset metric from KIP-595). That command (and metric) displays the LEO of all of the replicas (voters and observers), according to the leader. They can use that output to discover if the observer they are trying to add is lagging or is not replicating at all.

When the user runs the command above, they don't know the exact offset that the new controller needs to reach, but they can roughly estimate how far behind it is. What do you think? Is this good enough?

> Also, I am still not sure about having multiple brokers reporting the same
> metric. For example, if they don't report the same value (e.g. because one
> broker is catching up), how does a user know which value is correct?

They are all correct according to each replica's local view. Here are two examples of monitors that the user can write:

1. Is there a voter that I need to remove from the quorum? They can create a monitor that fires if the number-of-offline-voters metric has been greater than 0 for the past hour.
2. Is there a cluster that doesn't have 3 voters? They can create a monitor that fires if any replica doesn't report three for number-of-voters for the past hour.

Is there a specific metric that you have in mind that should only be reported by the KRaft leader?

Thanks,
--
-José
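
For illustration, the two monitors described above could be evaluated roughly as follows. This is only a sketch under stated assumptions: the Sample type, the helper names, and the idea that each replica's value is scraped at a regular interval are all hypothetical, and only the metric names (number-of-offline-voters, number-of-voters) come from the discussion. A real deployment would express the same rule in whatever alerting system is already in use.

```python
from dataclasses import dataclass

WINDOW_SECONDS = 3600  # "for the past hour"


@dataclass(frozen=True)
class Sample:
    """One scraped metric value (hypothetical shape, not a Kafka API)."""
    timestamp: float  # seconds since epoch
    replica_id: int   # replica that reported the value
    value: int        # metric value according to that replica's local view


def replicas_firing(samples, now, predicate, window=WINDOW_SECONDS):
    """Return ids of replicas whose every sample in the window satisfies predicate."""
    by_replica = {}
    for s in samples:
        if now - window <= s.timestamp <= now:
            by_replica.setdefault(s.replica_id, []).append(s.value)
    return sorted(
        replica
        for replica, values in by_replica.items()
        if all(predicate(v) for v in values)
    )


def offline_voter_monitor(samples, now):
    """Monitor 1: number-of-offline-voters has been > 0 for the whole window."""
    return replicas_firing(samples, now, lambda v: v > 0)


def voter_count_monitor(samples, now, expected=3):
    """Monitor 2: a replica has not reported the expected number-of-voters."""
    return replicas_firing(samples, now, lambda v: v != expected)
```

Each replica is evaluated on its own samples, so a replica that is briefly behind does not fire the monitor unless its local view stays wrong for the entire window.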