Denis Chudov created IGNITE-24817:
-------------------------------------

             Summary: Transactional guarantees become unreliable in multi-zone transactions including in-memory zones
                 Key: IGNITE-24817
                 URL: https://issues.apache.org/jira/browse/IGNITE-24817
             Project: Ignite
          Issue Type: Bug
            Reporter: Denis Chudov

In in-memory groups, we have a persistent transaction state storage so that the transaction state is kept until we are sure that all write intents have been switched, even if the group restarts. However, the log is still in-memory, so if the log is not flushed and the commit partition group restarts or loses its majority after the transaction finishes, and we were not able to switch the write intents, we lose data consistency.

Consider the case:
* Transaction tx0 is started and two partitions are enlisted: 1_part_0 (in-memory zone) and 2_part_0 (another zone). 1_part_0 is the commit partition.
* tx0 is committed. Right after that, the commit partition group is restarted. The transaction state is lost.
* We were not able to finish the cleanup phase of tx0, but for the user it is committed (because the state was written into the commit partition group before the restart).
* The coordinator of tx0 also leaves the cluster.
* There is a write intent in 2_part_0, but we don't know the transaction state. According to the regular transaction recovery procedure, we will roll back this transaction.

As a result, tx0 is committed but the changes that it made in 2_part_0 are lost.

At the same time, there is a single raft log for each in-memory group, so the tx state storage and the partition storage share the same log and we can't make a persistent log for the tx state storage.

Also, the problem exists regardless of the zone type of the second partition: in-memory or persistent.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
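
The scenario above can be reproduced with a small toy model. This is only a sketch under stated assumptions: the classes `TxRecoverySketch`, `CommitPartition` and `DataPartition`, the `flush()`/`restart()` methods and the recovery rule are all hypothetical illustrations and not Ignite APIs. The model assumes that a commit partition restart discards any tx state that was not flushed, and that recovery rolls back a write intent when the tx state is unknown and the coordinator is gone.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model of the tx0 scenario. None of these types are Ignite APIs. */
class TxRecoverySketch {
    enum TxState { PENDING, COMMITTED, ABORTED }

    /** Commit partition group (1_part_0): tx state survives a restart only if flushed. */
    static class CommitPartition {
        private final Map<String, TxState> inMemory = new HashMap<>();
        private final Map<String, TxState> flushed = new HashMap<>();

        void finish(String txId, TxState state) { inMemory.put(txId, state); }

        void flush() { flushed.putAll(inMemory); }

        /** Assumption: a restart drops everything that was not flushed. */
        void restart() {
            inMemory.clear();
            inMemory.putAll(flushed);
        }

        Optional<TxState> stateOf(String txId) {
            return Optional.ofNullable(inMemory.get(txId));
        }
    }

    /** Data partition (2_part_0) holding one committed value and one write intent. */
    static class DataPartition {
        String committedValue;
        String intentValue;
        String intentTxId;

        void writeIntent(String txId, String value) {
            intentTxId = txId;
            intentValue = value;
        }

        /** Simplified recovery: unknown state plus a dead coordinator means rollback. */
        void recover(CommitPartition commitPartition, boolean coordinatorAlive) {
            if (intentTxId == null) {
                return;
            }
            TxState fallback = coordinatorAlive ? TxState.PENDING : TxState.ABORTED;
            TxState resolved = commitPartition.stateOf(intentTxId).orElse(fallback);
            if (resolved == TxState.PENDING) {
                return; // keep the write intent and wait
            }
            if (resolved == TxState.COMMITTED) {
                committedValue = intentValue; // switch the write intent
            }
            intentValue = null;
            intentTxId = null;
        }
    }

    /** Returns the value visible in 2_part_0 after recovery, or null if the write was lost. */
    static String simulate(boolean flushBeforeRestart) {
        CommitPartition commitPartition = new CommitPartition();
        DataPartition dataPartition = new DataPartition();

        dataPartition.writeIntent("tx0", "v1");
        commitPartition.finish("tx0", TxState.COMMITTED); // the user sees tx0 as committed
        if (flushBeforeRestart) {
            commitPartition.flush();
        }
        commitPartition.restart();              // commit partition group restarts
        dataPartition.recover(commitPartition, false); // coordinator has left the cluster
        return dataPartition.committedValue;
    }

    public static void main(String[] args) {
        System.out.println("not flushed: " + simulate(false));
        System.out.println("flushed:     " + simulate(true));
    }
}
```

In this model the unflushed case leaves 2_part_0 without the value written by tx0 even though tx0 was reported as committed, which is the inconsistency described in this issue. The flushed case is included only for contrast.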