[ https://issues.apache.org/jira/browse/IGNITE-24104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Denis Chudov updated IGNITE-24104:
----------------------------------
    Description:
*Motivation*

Quorum is chosen because it is quite straightforward for the user: if they are able to keep QUORUM_SIZE nodes in the zone, there should be no data loss, unless they lose multiple nodes simultaneously so that the quorum would not be able to transfer to the nodes that remain online.

It specifies the size of the majority quorum. In fact, the size of the consensus group can be derived from it, depending on the replication algorithm. In the case of Raft it is calculated as QUORUM_SIZE * 2 - 1, so the quorum size is the size of the majority of nodes in the consensus group of a replication group. This implies that the size of the consensus group will be an odd number, if there are enough replicas.

The QUORUM_SIZE parameter may be set with any sufficient number of replicas, either ALL or not. This means that we can have, for example, 10 data nodes, 7 replicas and quorum size 3, meaning that 5 replicas will form the consensus group and 2 will be learners.

The default value should be:
* If there are 4 or fewer data nodes: min(2, data_nodes_count);
* If there are at least 5 data nodes: 3.

Lower and upper boundaries:
* Lower: 1 if there is only one replica and 2 if there is more than 1 node. Having a quorum of 1 node when there are more replicas makes no sense and decreases reliability;
* Upper: no less than the lower bound, but making the consensus group fit into the configured replicas count.

*Definition of done*

The QUORUM_SIZE parameter for the zone is added, with the aforementioned defaults and boundaries. Unit tests for defaults and boundaries are added.

It is also passed to PartitionDistributionUtils to calculate the assignments properly.

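The sizing rules in the description above can be sketched in a few lines of Java. This is only an illustration, not code from the Ignite repository: the class QuorumSizeSketch and all of its method names are hypothetical. It also makes two assumptions that the description does not spell out: the lower bound is keyed on the replica count (the text mixes "replica" and "node"), and the consensus group is capped at the replica count when there are not enough replicas for QUORUM_SIZE * 2 - 1.

```java
/** Hypothetical helper illustrating the QUORUM_SIZE rules; not part of Ignite. */
final class QuorumSizeSketch {
    private QuorumSizeSketch() {
    }

    /** Raft consensus group size: QUORUM_SIZE * 2 - 1, capped at the replica count (assumption). */
    static int consensusGroupSize(int replicas, int quorumSize) {
        return Math.min(replicas, quorumSize * 2 - 1);
    }

    /** Replicas that do not fit into the consensus group become learners. */
    static int learnersCount(int replicas, int quorumSize) {
        return replicas - consensusGroupSize(replicas, quorumSize);
    }

    /** Default: min(2, dataNodes) for 4 or fewer data nodes, otherwise 3. */
    static int defaultQuorumSize(int dataNodes) {
        return dataNodes <= 4 ? Math.min(2, dataNodes) : 3;
    }

    /** Lower bound: 1 for a single replica, otherwise 2 (keyed on replicas by assumption). */
    static int minQuorumSize(int replicas) {
        return replicas == 1 ? 1 : 2;
    }

    /** Upper bound: largest quorum whose consensus group fits into the replicas, never below the lower bound. */
    static int maxQuorumSize(int replicas) {
        int largestFitting = (replicas + 1) / 2; // largest q such that 2q - 1 <= replicas
        return Math.max(minQuorumSize(replicas), largestFitting);
    }

    public static void main(String[] args) {
        // The example from the description: 7 replicas, quorum size 3.
        System.out.println("consensus=" + consensusGroupSize(7, 3)
                + " learners=" + learnersCount(7, 3)); // prints "consensus=5 learners=2"
    }
}
```

With 7 replicas and a quorum size of 3 this reproduces the example from the description: 5 replicas in the consensus group and 2 learners.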
  was:
*Motivation*

Quorum is chosen because it is quite straightforward for the user: if they are able to keep QUORUM_SIZE nodes in the zone, there should be no data loss, unless they lose multiple nodes simultaneously so that the quorum would not be able to transfer to the nodes that remain online.

It specifies the size of the majority quorum. In fact, the size of the consensus group can be derived from it, depending on the replication algorithm. In the case of Raft it is calculated as QUORUM_SIZE * 2 - 1, so the quorum size is the size of the majority of nodes in the consensus group of a replication group. This implies that the size of the consensus group will be an odd number, if there are enough replicas.

The QUORUM_SIZE parameter may be set with any sufficient number of replicas, either ALL or not. This means that we can have, for example, 10 data nodes, 7 replicas and quorum size 3, meaning that 5 replicas will form the consensus group and 2 will be learners.

The default value should be:
* If there are 4 or fewer data nodes: min(2, data_nodes_count);
* If there are at least 5 data nodes: 3.

Lower and upper boundaries:
* Lower: 1 if there is only one replica and 2 if there is more than 1 node. Having a quorum of 1 node when there are more replicas makes no sense and decreases reliability;
* Upper: no less than the lower bound, but making the consensus group fit into the configured replicas count.

*Definition of done*

The QUORUM_SIZE parameter for the zone is added, with the aforementioned defaults and boundaries. Unit tests for defaults and boundaries are added.

> Add the new distribution zone parameter QUORUM_SIZE
> ---------------------------------------------------
>
>                 Key: IGNITE-24104
>                 URL: https://issues.apache.org/jira/browse/IGNITE-24104
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Denis Chudov
>            Assignee: Vadim Pakhnushev
>            Priority: Major
>              Labels: ignite-3
>
> *Motivation*
> Quorum is chosen because it is quite straightforward for the user: if they
> are able to keep QUORUM_SIZE nodes in the zone, there should be no data loss,
> unless they lose multiple nodes simultaneously so that the quorum would not
> be able to transfer to the nodes that remain online.
> It specifies the size of the majority quorum. In fact, the size of the
> consensus group can be derived from it, depending on the replication
> algorithm. In the case of Raft it is calculated as QUORUM_SIZE * 2 - 1, so
> the quorum size is the size of the majority of nodes in the consensus group
> of a replication group. This implies that the size of the consensus group
> will be an odd number, if there are enough replicas.
> The QUORUM_SIZE parameter may be set with any sufficient number of replicas,
> either ALL or not. This means that we can have, for example, 10 data nodes, 7
> replicas and quorum size 3, meaning that 5 replicas will form the consensus
> group and 2 will be learners.
> The default value should be:
> * If there are 4 or fewer data nodes: min(2, data_nodes_count);
> * If there are at least 5 data nodes: 3.
> Lower and upper boundaries:
> * Lower: 1 if there is only one replica and 2 if there is more than 1 node.
> Having a quorum of 1 node when there are more replicas makes no sense and
> decreases reliability;
> * Upper: no less than the lower bound, but making the consensus group fit
> into the configured replicas count.
> *Definition of done*
> The QUORUM_SIZE parameter for the zone is added, with the aforementioned
> defaults and boundaries. Unit tests for defaults and boundaries are added.
> It is also passed to PartitionDistributionUtils to calculate the assignments
> properly.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)