Hi all,
Thanks Juha for bringing this discussion here. To everyone else, I am
Juha's colleague at Aiven and am currently working on introducing the kind
of tooling discussed in this thread, to be used in worst-case scenarios. I
have a proof of concept working. The various input and concerns raised
Hi Luke and Colin,
On Mon, Apr 7, 2025 at 10:29 PM Luke Chen wrote:
> That's why we were discussing if there's any way to "force" recover the
> scenario, even if it's possible to have data loss.
Yes. There is a way. They need to configure a controller cluster that
matches the voter set in the cl
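(As an illustration only, not a verbatim procedure: a replacement controller
might be configured roughly along these lines, with node.id and the controller
endpoint corresponding to one of the recorded voters; host names and ports are
placeholders, and under kraft.version 1 the directory.id in meta.properties
would also have to match the recorded voter.)

  # hypothetical replacement for voter 2 in the c1/c2/c3 setup
  node.id=2
  process.roles=controller
  listeners=CONTROLLER://c2:9093
  controller.listener.names=CONTROLLER
  controller.quorum.bootstrap.servers=c1:9093,c2:9093,c3:9093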
On Mon, Apr 7, 2025, at 19:29, Luke Chen wrote:
> Hi Jose and Colin,
>
> Thanks for your explanation!
>
> Yes, we all agree that a 3-node quorum can only tolerate 1 node down.
> We just want to discuss: "what if" 2 out of 3 nodes are down at the same
> time, what can we do?
> Currently, the result is
Hi Jose and Colin,
Thanks for your explanation!
Yes, we all agree that a 3-node quorum can only tolerate 1 node down.
We just want to discuss: "what if" 2 out of 3 nodes are down at the same
time, what can we do?
Currently, the result is that the quorum will never form and the whole
Kafka cluster is
Hi José,
I think you make a valid point that our guarantees here are not actually
different from ZooKeeper. In both systems, if you lose quorum, you will
probably lose some data. Of course, how much data you lose depends on luck. If
the last node standing was the active controller / ZooKeeper,
Thanks Luke.
On Thu, Apr 3, 2025 at 7:14 AM Luke Chen wrote:
> In addition to the approaches you provided, maybe we can have a way to
> "force" KRaft to honor "controller.quorum.voters" config, instead of
> "controller.quorum.bootstrap.servers", even it's in kraft.version 1.
Small clarification.
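(To make the two configurations being contrasted concrete; host names and
ports below are placeholders:)

  # kraft.version 0: the voter set is taken from static configuration
  controller.quorum.voters=1@c1:9093,2@c2:9093,3@c3:9093

  # kraft.version 1: the voter set lives in the metadata log; this setting
  # is only used to discover the current quorum
  controller.quorum.bootstrap.servers=c1:9093,c2:9093,c3:9093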
Hi Juha,
Thanks for the discussion.
On Thu, Apr 3, 2025 at 4:08 AM Juha Mynttinen
wrote:
>Consider the following Kafka controller setup. There are three controllers
> c1, c2 and c3, each on its own hardware. All controllers are voters and
> let’s assume c1 is the leader. Assume new servers can b
Consider the following Kafka controller setup. There are three controllers
c1, c2 and c3, each on its own hardware. All controllers are voters and
let’s assume c1 is the leader. Assume new servers can be added as needed to
replace broken ones, but broken/lost servers cannot be brought back. If a
new
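(For reference, the current voter set and leader in a setup like this can be
inspected with the quorum tool; host name and port are placeholders:)

  bin/kafka-metadata-quorum.sh --bootstrap-controller c1:9093 describe --status
  bin/kafka-metadata-quorum.sh --bootstrap-controller c1:9093 describe --replication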
Hi Juha,
Thanks for bringing this up.
I agree having a way to recover from this "majority of controller nodes
down" issue is valuable, even though it is rare.
In addition to the approaches you provided, maybe we can have a way to
"force" KRaft to honor "controller.quorum.voters" config, instead of