[ https://issues.apache.org/jira/browse/KUDU-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
shenxingwuying reassigned KUDU-3383:
------------------------------------

    Assignee: shenxingwuying

> About strong consistency read from leader
> -----------------------------------------
>
>                 Key: KUDU-3383
>                 URL: https://issues.apache.org/jira/browse/KUDU-3383
>             Project: Kudu
>          Issue Type: Improvement
>            Reporter: shenxingwuying
>            Assignee: shenxingwuying
>            Priority: Major
>         Attachments: image-2022-07-20-23-14-34-519.png,
> image-2022-07-20-23-17-40-718.png
>
>
> As described in https://issues.apache.org/jira/browse/KUDU-3382.
>
>
> h1. Background && Motivation
> Linearizable reads are very friendly to developers, and Kudu can support them.
> h1. Issue of linearizability read from leader
> This issue needs discussion.
> Kudu's Raft implementation uses a strong leader: the leader's state machine
> is never older than the followers'. A follower can start an election and
> switch the leader when its heartbeat timeout expires or when it receives a
> leader election request (leader transfer).
> If Kudu needs linearizable reads, reading from the leader is not enough,
> because two leaders may exist for a very short period of time.
> Here is a scenario.
>
> !image-2022-07-20-23-17-40-718.png!
>
> # A Raft group has 3 replicas: L1, F2 and F3. Their states are steady during
> term 1.
> # If a network partition occurs, F2 and F3 lose the leader's heartbeats. F3
> starts an election and F2 votes for it.
> # F3 becomes the leader; we can call it L3. At this moment there are 2
> leaders: L1(1) and L3(2).
> # This state continues until the network partition recovers. That time may
> be short or long.
> While two leaders exist, reads are not linearizable. So Kudu should avoid
> having two leaders at any time; the corresponding cost is having no leader
> for a short period of time. Kudu has to make a choice. Users usually need
> linearizability, so I think Kudu should support it. The brief unavailability
> while there is no leader can be handled by the client's fault tolerance.
> Whether reading from the leader is currently linearizable is something
> someone can confirm, or I can run an experiment.
> h1. Solution
> Kudu should avoid having two leaders, even for a very short period of time
> when a network fault happens. I reviewed the code and I think the problem
> exists today.
> To avoid the two-leader problem, the leader should check that it is still
> alive as leader. If a leader does not receive enough heartbeats within a
> period of time, it should step down and then start another election, just as
> a follower does. The leader's timeout should be less than the followers'
> election timeout.
> Another scheme: a read should first send heartbeats to the two followers to
> make sure the replica serving it is still the valid leader.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
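
The step-down rule proposed under "Solution" in the quoted issue can be sketched roughly as follows. This is only an illustration in Python, not Kudu code: the class `LeaderLiveness`, its methods, and the timeout values are all hypothetical names and numbers chosen for the example. It also assumes that replica clocks drift only by a small, bounded amount, and it measures freshness from the time a heartbeat was sent (not when its reply arrived), so the leader's view expires no later than the follower's.

```python
from dataclasses import dataclass, field


@dataclass
class LeaderLiveness:
    """Hypothetical helper: decides whether a leader may keep acting as leader."""

    num_replicas: int
    leader_timeout: float  # assumed to be shorter than the followers' election timeout
    # follower id -> send time of the latest heartbeat that follower acknowledged
    last_acked_send: dict = field(default_factory=dict)

    def record_ack(self, follower: str, sent_at: float) -> None:
        """Remember when the acknowledged heartbeat was originally sent."""
        previous = self.last_acked_send.get(follower, sent_at)
        self.last_acked_send[follower] = max(previous, sent_at)

    def has_majority(self, now: float) -> bool:
        """True if the leader plus recently heard followers form a majority."""
        fresh = sum(
            1
            for sent_at in self.last_acked_send.values()
            if now - sent_at < self.leader_timeout
        )
        return 1 + fresh > self.num_replicas // 2


# The scenario from the issue: L1 leads F2 and F3, then a partition isolates L1.
ELECTION_TIMEOUT = 1.5  # hypothetical follower election timeout, in seconds
l1 = LeaderLiveness(num_replicas=3, leader_timeout=1.0)
l1.record_ack("F2", sent_at=0.0)
l1.record_ack("F3", sent_at=0.0)

still_leader_early = l1.has_majority(now=0.5)  # heartbeats are still fresh
still_leader_late = l1.has_majority(now=1.0)   # no fresh acks: L1 should step down
```

With these example numbers, L1 stops considering itself leader at 1.0 s, while F3 cannot start an election before 1.5 s, so the two terms do not overlap. The same `has_majority` check could gate each read, which is close in spirit to the second scheme in the issue, though that scheme sends a fresh heartbeat per read rather than relying on a timeout.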