Hi All:

      We are using Kafka 0.8.2.2 in our SIT environment, and we ran into a
data loss case when all brokers (2 brokers) went down and restarted. I
am trying to understand the log management and recovery mechanism in Kafka,
and I found a useful document: KIP-101 - Alter Replication
Protocol to use Leader Epoch rather than High Watermark for Truncation
(https://cwiki.apache.org/confluence/display/KAFKA/KIP-101+-+Alter+Replication+Protocol+to+use+Leader+Epoch+rather+than+High+Watermark+for+Truncation).
However, I have difficulty understanding the "Scenario 1: High
Watermark Truncation followed by Immediate Leader Election" description in
that article.  In this scenario, leader B has updated its HW to m2, while
follower A has received m2 but has not yet updated its local HW to m2: "the
follower (A) has message m2, but has not yet got confirmation from the
leader (B) that m2 has been committed (the second round of replication,
which lets (A) move forward its high watermark past m2, has yet to happen)"

    Is that possible?  Since there are only 2 brokers, I think leader B
updates its HW to m2 only once follower A has fetched m2. And when follower
A fetches m2 and leader B updates its HW to m2, follower A should get the
updated HW (m2) in the fetch response for m2 (in the HighwaterMarkOffset
field), so it would not need any confirmation in a second round as the
article mentions. I am confused about this part. I think that if there were
another follower, broker C, which fetched m2 later than follower A, then
follower A would have to wait for a second round of replication to confirm
HW (m2), but there is no broker C in this description.
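
    A toy Python model of the two rounds described in the KIP may make the
question concrete. It is only a sketch: Leader, Follower and their methods
are hypothetical names, not Kafka code, and it rests on one assumption that
may or may not match 0.8.2.2, namely that the leader learns what the follower
holds only from the fetch offset in the follower's next request. HW is
modeled here as an exclusive offset (hw = 2 means m1 and m2 are committed).

```python
# Toy model only; names are hypothetical and not part of Kafka.

class Leader:
    def __init__(self, log):
        self.log = list(log)
        self.hw = 0  # exclusive offset: messages below hw are committed

    def handle_fetch(self, fetch_offset):
        # Assumption: fetch_offset is the leader's only evidence of follower
        # progress, i.e. the follower already holds offsets [0, fetch_offset).
        self.hw = max(self.hw, min(fetch_offset, len(self.log)))
        # The response carries new messages plus the leader's current HW.
        return self.log[fetch_offset:], self.hw


class Follower:
    def __init__(self):
        self.log = []
        self.hw = 0

    def build_fetch(self):
        # Ask for everything from the follower's log end offset onward.
        return len(self.log)

    def apply_response(self, messages, leader_hw):
        self.log.extend(messages)
        self.hw = min(leader_hw, len(self.log))


leader = Leader(["m1", "m2"])
follower = Follower()

# Round 1: the follower receives m1 and m2, but the leader cannot yet know that.
resp1 = leader.handle_fetch(follower.build_fetch())
follower.apply_response(*resp1)
state_after_round1 = (leader.hw, follower.hw, list(follower.log))

# Round 2, first half: the leader has processed the request and advanced its
# HW, but the follower has not applied the response yet.
resp2 = leader.handle_fetch(follower.build_fetch())
state_in_window = (leader.hw, follower.hw)

# Round 2, second half: the follower applies the response.
follower.apply_response(*resp2)
state_after_round2 = (leader.hw, follower.hw)

print(state_after_round1, state_in_window, state_after_round2)
```

Under that assumption, state_in_window is where the leader's HW covers m2
while the follower holds m2 with an older HW, which is how I read the state
in Scenario 1. If the real leader instead advances its HW while serving the
very fetch that returns m2, this window would not exist with 2 brokers.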

   Am I missing some step, or is there a flaw in this description of the
case?  Any explanations are appreciated. Thanks to all Kafka developers for
bringing us such a great product.


Alex.Chen
