Thanks Onur. That’s correct, we no longer need that extra mapping. I’ll update the KIP.

B
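P.S. For the archive, the simplification amounts to something like the
following (an illustrative Scala sketch only, not the actual controller
code; the object and method names here are mine):

    // Sketch: controller reaction to a broker-registration zk event.
    // Because a re-created /brokers/ids/[brokerId] znode always carries a
    // new czxid, every registration event can simply bump the leader epoch
    // -- no cached [broker -> czxid] mapping is required.
    object ControllerSketch {
      // Hypothetical in-memory state: current leader epoch per partition.
      private val leaderEpochs = scala.collection.mutable.Map[String, Int]()

      // Invoked on every broker-registration event from Zookeeper.
      def onBrokerRegistration(brokerId: Int, ledPartitions: Seq[String]): Unit =
        for (partition <- ledPartitions) {
          // Always increment: a registration implies the znode was (re)created.
          val newEpoch = leaderEpochs.getOrElse(partition, 0) + 1
          leaderEpochs(partition) = newEpoch
          sendLeaderAndIsr(brokerId, partition, newEpoch)
        }

      // Stand-in for propagating the new epoch via the LeaderAndIsr request.
      private def sendLeaderAndIsr(broker: Int, partition: String, epoch: Int): Unit =
        println(s"LeaderAndIsrRequest -> broker=$broker $partition leaderEpoch=$epoch")
    }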
Ben Stopford
Confluent, http://www.confluent.io <http://www.confluent.io/>

> On 11 Dec 2016, at 23:54, Onur Karaman <onurkaraman.apa...@gmail.com> wrote:
>
> Pretty happy to see a KIP tackling this problem! One comment below.
>
> The "Extending LeaderEpoch to include Returning Leaders" section states:
> "To protect against this eventuality the controller will maintain a cached
> mapping of [broker -> Zookeeper CZXID] (CZXID is a unique and monotonic
> 64-bit number) for the broker’s registration in Zookeeper
> (/brokers/ids/[brokerId]). If the controller receives a Broker Registration
> where the CZXID has changed it will increment the Leader Epoch and
> propagate that value to the broker via the Leader and ISR Request (in the
> normal way), then update the cached CZXID for that broker."
>
> In general I think Kafka underutilizes Zookeeper's various flavors of
> zxids, but this time it's not clear to me what the motivation is for
> maintaining the broker-to-czxid mapping. The following check seems
> redundant: "If the controller receives a Broker Registration where the
> CZXID has changed". By definition, the czxid of the /brokers/ids/[brokerId]
> znode will always change upon successful broker registration (
> https://zookeeper.apache.org/doc/r3.4.8/zookeeperProgrammers.html#sc_zkStatStructure).
> Why maintain the mapping at all? Why not just always update leader epochs
> and propagate them every time the controller receives the broker
> registration zk event?
>
> On Sun, Dec 11, 2016 at 2:30 PM, Neha Narkhede <n...@confluent.io> wrote:
>
>> Good to see this KIP being proposed. Back when I added the epoch to the
>> replication protocol, we discussed adding it to the log due to the
>> failure scenarios listed in the KIP, but I failed to convince people
>> that it was worth the effort needed to upgrade the cluster (especially
>> after we asked people to go through a painful backwards-incompatible
>> upgrade for 0.8 :-)). The lack of a leader epoch/generation in the log
>> has also been one of the biggest critiques of Kafka's replication
>> protocol by the distributed-systems community.
>>
>> I'm in favor of this work, though I think we shouldn't end up with two
>> notions of representing a leader's generation. When we added the epoch,
>> we wanted to add it to the log but we didn't. Now that we are adding the
>> generation id to the log, I think we should revisit calling it the epoch
>> at all. Have you thought about a way to evolve the epoch into the
>> generation id throughout, and what it would take?
>>
>> On Sun, Dec 11, 2016 at 4:31 AM Ben Stopford <b...@confluent.io> wrote:
>>
>>> Hi All
>>>
>>> Please find below a KIP that describes a proposed solution to a couple
>>> of issues that have been observed with the replication protocol.
>>>
>>> In short, the proposal replaces the use of the High Watermark, for
>>> follower log truncation, with an alternate Generation Marker. This
>>> uniquely defines which leader each message was acknowledged by.
>>>
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-101+-+Alter+Replication+Protocol+to+use+Leader+Generation+rather+than+High+Watermark+for+Truncation
>>>
>>> All comments and suggestions greatly appreciated.
>>>
>>> Ben Stopford
>>> Confluent, http://www.confluent.io <http://www.confluent.io/>
>>
>> --
>> Thanks,
>> Neha
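For anyone skimming the thread, the mechanics of the proposed truncation
change are roughly: on becoming a follower, a replica asks the leader for
the end offset of its last generation and truncates to that, rather than
to its High Watermark. A minimal Scala sketch (illustrative only, with
invented names and toy data, not code from the KIP):

    // Sketch: generation-based truncation on a follower.
    object TruncationSketch {
      // Toy leader-side table: generation -> end offset of that generation,
      // as if recovered from generation markers written into the log.
      val leaderGenerationEndOffsets = Map(3 -> 120L, 4 -> 200L)

      // On becoming follower: truncate to the leader's end offset for the
      // follower's last known generation, not to the High Watermark.
      def truncationOffset(lastGeneration: Int, followerLogEndOffset: Long): Long = {
        val leaderEnd = leaderGenerationEndOffsets(lastGeneration)
        math.min(followerLogEndOffset, leaderEnd)
      }
    }

    // e.g. a follower whose last generation was 3 and whose log ends at 150
    // truncates to offset 120, dropping messages the generation-3 leader
    // never acknowledged, before fetching from the new leader.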