[ https://issues.apache.org/jira/browse/IGNITE-24772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Denis Chudov updated IGNITE-24772:
----------------------------------
    Summary: Data loss in in-memory group after several node restarts without losing majority of Ignite nodes in any moment of time  (was: Data loss in in-memory group after several node restarts without losing majority in any moment of time)

> Data loss in in-memory group after several node restarts without losing
> majority of Ignite nodes in any moment of time
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-24772
>                 URL: https://issues.apache.org/jira/browse/IGNITE-24772
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Denis Chudov
>            Priority: Major
>              Labels: ignite-3
>
> *Scenario:*
> Nodes: A, B, C.
> A is the leader.
> A client writes some data. The data is replicated to A and B and committed
> on the leader (A), so the write operation succeeds from the client's POV.
> A fails, then returns to the cluster. No data is saved on A because it is
> in-memory.
> The cluster tries to include A as a clean node: it tries to exclude A from
> the configuration and include it again, but the new configuration is not
> applied for some time because there is no leader and there may be temporary
> network issues preventing the write of new data.
> Then the user (who thinks that the majority would be preserved) restarts
> node B. B also loses its data.
> Let's say the data wasn't even replicated to C.
> As a result, the data is lost.
>
> *Ignite specifics:*
> During a restart, a node is removed from the configuration before it is
> started and is then included again. In fact, it is started only when it is
> included back. So the scenario will be slightly different:
> When A is started, it is removed from the configuration.
> Node B is stopped. Now the majority is lost and a full group restart is
> required.
> The user will need a group restart despite having kept {*}the majority of
> Ignite nodes online{*}. This leads to data loss.
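
The first scenario above can be replayed with a small toy model. This is only an illustrative sketch, not Ignite or Raft code: the `Node` class, `has_majority` and `run_scenario` are hypothetical names, and the model assumes that an in-memory node loses all of its data whenever it stops. It shows that a majority of nodes is online at every step, yet no node holds the committed write at the end.

```python
# Toy model only: hypothetical names, not Ignite or Raft code.
# Assumption: an in-memory node loses all of its data when it stops.

class Node:
    def __init__(self, name):
        self.name = name
        self.online = True
        self.data = set()

    def stop(self):
        # In-memory storage: stopping the node wipes everything it held.
        self.online = False
        self.data.clear()

    def start(self):
        # The node comes back as a clean (empty) node.
        self.online = True


def has_majority(nodes):
    """True if more than half of the nodes are online."""
    return sum(n.online for n in nodes) > len(nodes) // 2


def run_scenario():
    a, b, c = Node("A"), Node("B"), Node("C")
    nodes = [a, b, c]
    majority_always_online = True

    def check():
        nonlocal majority_always_online
        majority_always_online = majority_always_online and has_majority(nodes)

    # The write is replicated to A and B (a majority) and committed on A.
    # C has not received it yet.
    a.data.add("k1")
    b.data.add("k1")
    check()

    # A fails and returns empty. B and C keep the majority meanwhile.
    a.stop()
    check()
    a.start()
    check()

    # The user restarts B, believing the majority is preserved (A and C are up).
    b.stop()
    check()
    b.start()
    check()

    # Collect whatever data is still held by any node.
    survivors = set().union(*(n.data for n in nodes))
    return majority_always_online, survivors


if __name__ == "__main__":
    ok, survivors = run_scenario()
    print("majority of nodes online at every step:", ok)
    print("data still present on any node:", survivors)
```

The model tracks only node liveness, which is what the user in the scenario is watching. It does not model group configuration changes, so it does not reproduce the "Ignite specifics" variant in which the group itself loses its majority.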
-- This message was sent by Atlassian Jira (v8.20.10#820010)