kevin-wu24 commented on PR #18987: URL: https://github.com/apache/kafka/pull/18987#issuecomment-2682522812
@ahuang98 thanks for the quick review. I took another look into that invariant and I'll address the last two comments here since the situation is a bit complicated. @jsancio too since this may be missing in the current KIP-853 implementation. Let me know what y'all think: I agree, we should only be checking this invariant using the leader's `lastVoterSet()`. When I remove the extra case checking the committed voter set (voter set at HWM - 1), we fail this invariant check in the following addVoter scenario: - We're adding a voter that will increase the number of nodes that are considered a majority (i.e. old voter set = {0}, new voter set = {0, 1}) - The events involved are: `SequentialAppendAction`, an event which repeatedly gets executed, and my implementation of `AddVoterAction`. `SequentialAppendAction` just increments a counter and writes these records to the log using `client.prepareAppend` and `client.schedulePreparedAppend`. `AddVoterAction` creates two RaftMessages and adds them to the leader's message queue, the `AddVoterRequest` and `API_VERSIONS_RESPONSE`. The problem is that I use `CompleteableFuture#whenComplete` to set the `correlationId` of the `API_VERSIONS_RESPONSE`, which is asynchronous and exit `run()` after these two messages are added to the leader's queue (i.e. the `AddVoterRequest#run()` can finish execution without completing the future, and perhaps more importantly, before the new VoterSet is added to the log, allowing subsequent `SequentialAddActions` to execute even though they shouldn't be allowed to yet). - This section of the KIP says we should not commit records while AddVoterRequest is being handled: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=217391519#KIP853:KRaftControllerMembershipChanges-Handling.3 - I believe the easiest way to solve this is to not allow `AddVoterAction#run()` to return until we get back an `AddVoterResponse` from the leader. - I want to make sure that this isn't possible in the current implementation. From looking at the raft code and the fact that this is possible in the simulation tests it looks like there is no guard against this in the raft layer (i.e. my proposed fix for the simulation test I think is considered a higher layer fix, since I'm just not allowing the handling of `AddVoterRequest` and `prepareAppend + schedulePrepareAppend` to happen concurrently, but I'm not seeing anything that prevents this in the source right now). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org