Hi Alyssa,

1. In the schema for VoteRequest and VoteResponse, you are using
"boolean" as the type keyword. The correct keyword should be "bool"
instead.

2. In the states and state transition table you have the following entry:
>  * Candidate transitions to:
> *    ...
> *    Prospective: After expiration of the election timeout

Can you explain the reason a candidate would transition back to
prospective? If a voter transitions to the candidate state, it is
because the voters don't support KIP-996 or the replica was able to
win the majority of the votes at some point in the past. Are we
concerned that a network partition might have occurred after the
replica became a candidate? If so, I think we should state this
explicitly in the KIP.

3. In the Proposed changes section and the state transition section, I
think it would be helpful to explicitly state that we have an invariant that
only the prospective state can transition to the candidate state. This
transition to the candidate state from the prospective state can only
happen because the replica won the majority of the votes or there is
at least one remote voter that doesn't support pre-vote.
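
To make the invariant concrete, here is a rough Python sketch of how I
read it. The names (State, can_become_candidate) and the choice to count
the replica's own vote in granted_votes are my assumptions for
illustration; they are not taken from the KIP or from the KRaft code.

```python
from enum import Enum, auto


class State(Enum):
    # Only the states mentioned in this email; the KIP's table lists more.
    FOLLOWER = auto()
    PROSPECTIVE = auto()
    CANDIDATE = auto()
    LEADER = auto()


def can_become_candidate(state, granted_votes, voter_count, any_voter_lacks_prevote):
    """Return True if the replica may transition to the candidate state.

    granted_votes is assumed to include the replica's own vote.
    """
    if state is not State.PROSPECTIVE:
        # Invariant: only the prospective state can transition to candidate.
        return False
    majority = voter_count // 2 + 1
    # Either the pre-vote was won, or some remote voter doesn't support pre-vote.
    return granted_votes >= majority or any_voter_lacks_prevote
```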

4. I am a bit confused by this paragraph
> A candidate will now send a VoteRequest with the PreVote field set to true 
> and CandidateEpoch set to its [epoch + 1] when its election timeout expires. 
> If [majority - 1] of VoteResponse grant the vote, the candidate will then 
> bump its epoch up and send a VoteRequest with PreVote set to false which is 
> our standard vote that will cause state changes for servers receiving the 
> request.

I am assuming that "candidate" refers to one of the states enumerated
in the table above this quote. If so, I think you mean "prospective"
for the first candidate.

CandidateEpoch should be ReplicaEpoch.

[epoch + 1] should just be epoch. I thought we agreed that replicas
will always send their current epoch to the remote replicas.

5. I am a bit confused by this bullet section
> true if the server receives less than [majority] VoteResponse with 
> VoteGranted set to false within [election.timeout.ms + a little randomness] 
> and the first bullet point does not apply
> Explanation for why we don't send a standard vote at this point is 
> explained in rejected alternatives.

Can we explain this case in plain English? I assume that this case is
trying to cover the scenario where the election timer expired but the
prospective candidate hasn't received enough votes (granted or
rejected) to decide whether it could win an election.

6.
> Yes. If a leader is unable to receive fetch responses from a majority of 
> servers, it can impede followers that are able to communicate with it from 
> voting in an eligible leader that can communicate with a majority of the 
> cluster.

In general, leaders don't receive fetch responses. They receive FETCH
requests. Did you mean "if a leader is able to send FETCH responses to
the majority - 1 of the voters, it can impede fetching voters
(followers) from granting their vote to prospective candidates. This
should stop prospective candidates from getting enough votes to
transition to the candidate state and increase their epoch"?

7.
> Check Quorum ensures a leader steps down if it is unable to receive fetch 
> responses from a majority of servers.

I think you mean "... if it is unable to receive FETCH requests from
the majority - 1 of the voters".
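
The "majority - 1" arithmetic can be sketched as follows. This is only
my illustration: the function name and the idea of passing in a count of
remote voters that fetched within the check quorum window are
assumptions, not the actual implementation.

```python
def leader_should_step_down(voter_count, fetching_remote_voters):
    """Decide whether a leader fails Check Quorum.

    fetching_remote_voters: number of remote voters that sent a FETCH
    request to the leader within the check quorum window (hypothetical input).
    """
    majority = voter_count // 2 + 1
    # The leader counts toward the majority itself, so it only needs
    # FETCH requests from majority - 1 remote voters.
    return fetching_remote_voters < majority - 1
```

With 5 voters the majority is 3, so the leader keeps leadership as long
as at least 2 remote voters are fetching from it.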

8. At the end of the Proposed changes section you have the following:
> The logic now looks like the following for servers receiving VoteRequests 
> with PreVote set to true:
>
> When servers receive VoteRequests with the PreVote field set to true, they 
> will respond with VoteGranted set to
>
> * true if they are not a Follower and the epoch and offsets in the Pre-Vote 
> request satisfy the same requirements as a standard vote
> * false if they are a Follower or the epoch and end offsets in the Pre-Vote 
> request do not satisfy the requirements

This seems to duplicate the same algorithm that was stated earlier in
the section.
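
For what it is worth, the two quoted bullets are complements of each
other, so they reduce to a single condition. A small sketch, assuming
hypothetical boolean inputs (neither name comes from the KIP):

```python
def grant_pre_vote(is_follower, meets_standard_vote_requirements):
    """Return the VoteGranted value for a VoteRequest with PreVote set to true.

    is_follower: whether the receiving server is in the Follower state.
    meets_standard_vote_requirements: whether the epoch and end offsets in
    the request satisfy the same checks as a standard vote.
    """
    # true:  not a Follower AND the requirements are satisfied
    # false: a Follower OR the requirements are not satisfied
    return (not is_follower) and meets_standard_vote_requirements
```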

9. I don't understand this rejected idea: Sending Standard Votes after
failure to win Pre-Vote

In your example in the "Disruptive server scenarios" section, voters 4
and 5 are partitioned from the majority of the voters. We don't want
voters 4 and 5 increasing their epoch and transitioning to the
candidate state; otherwise they would disrupt the quorum established
by voters 1, 2 and 3.


Thanks,
-- 
-José
