Denis Chudov created IGNITE-24817:
-------------------------------------

             Summary: Transactional guarantees become unreliable in multi-zone 
transactions including in-memory zones
                 Key: IGNITE-24817
                 URL: https://issues.apache.org/jira/browse/IGNITE-24817
             Project: Ignite
          Issue Type: Bug
            Reporter: Denis Chudov


In in-memory groups, the transaction state storage is persistent so that the 
transaction state is kept until we are sure that all write intents are 
switched, even if the group restarts. However, the log is still in-memory. If 
the log is not flushed and the commit partition group restarts or loses 
majority after the transaction finishes, before we were able to switch the 
write intents, we lose data consistency.

Consider the case:
 * transaction tx0 is started and two partitions are enlisted: 1_part_0 
(in-memory zone) and 2_part_0 (another zone). 1_part_0 is a commit partition.
 * tx0 is committed. Right after that, the commit partition group is restarted 
and the transaction state is lost.
 * We were not able to finish the cleanup phase of tx0, but for the user, it is 
committed (because the state was written into commit partition group before the 
restart).
 * The coordinator of tx0 also leaves the cluster.
 * There is a write intent in 2_part_0, but we don't know the transaction 
state. According to the regular transaction recovery procedure, we will 
roll back this transaction.

As a result, tx0 is committed, but the changes it made in 2_part_0 are 
lost.
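
The scenario above can be sketched as a toy model. This is not Ignite code: 
the names CommitPartition, DataPartition, recover_tx_state and run_scenario 
are hypothetical, and the recovery rule (unknown state plus a missing 
coordinator means abort) is a simplification of the procedure described in 
this issue.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional, Tuple


class TxState(Enum):
    COMMITTED = "COMMITTED"
    ABORTED = "ABORTED"


@dataclass
class CommitPartition:
    """Hypothetical model of the commit partition group (1_part_0)."""
    # Tx states that went through the in-memory log but were not flushed.
    tx_states: Dict[str, TxState] = field(default_factory=dict)

    def restart(self) -> None:
        # Assumption: nothing was flushed, so the restart loses all tx states.
        self.tx_states.clear()


@dataclass
class DataPartition:
    """Hypothetical model of the second enlisted partition (2_part_0)."""
    rows: Dict[str, str] = field(default_factory=dict)
    # key -> (tx id, pending value)
    write_intents: Dict[str, Tuple[str, str]] = field(default_factory=dict)

    def resolve_intent(self, key: str, state: TxState) -> None:
        # Switch the write intent: apply it on commit, drop it on abort.
        _tx_id, value = self.write_intents.pop(key)
        if state is TxState.COMMITTED:
            self.rows[key] = value


def recover_tx_state(commit_part: CommitPartition, tx_id: str,
                     coordinator_alive: bool) -> Optional[TxState]:
    """Simplified recovery: an unknown state with no coordinator means abort."""
    state = commit_part.tx_states.get(tx_id)
    if state is not None:
        return state
    if coordinator_alive:
        return None  # tx may still be in progress, keep the intent
    return TxState.ABORTED


def run_scenario() -> Tuple[TxState, Optional[str]]:
    commit_part = CommitPartition()
    part2 = DataPartition()

    # tx0 writes to 2_part_0 and leaves a write intent there.
    part2.write_intents["k"] = ("tx0", "v")

    # tx0 is committed: the state reaches the commit partition group
    # and the user is told that the commit succeeded.
    commit_part.tx_states["tx0"] = TxState.COMMITTED
    user_visible = commit_part.tx_states["tx0"]

    # The commit partition group restarts before cleanup, and the
    # coordinator leaves the cluster.
    commit_part.restart()
    coordinator_alive = False

    # Recovery for the orphaned write intent in 2_part_0.
    recovered = recover_tx_state(commit_part, "tx0", coordinator_alive)
    part2.resolve_intent("k", recovered)
    return user_visible, part2.rows.get("k")
```

In this model, run_scenario() reports tx0 as committed to the user while the 
value written to 2_part_0 is absent after recovery, which is the inconsistency 
described above.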

At the same time, each in-memory group has a single Raft log, so the tx state 
storage and the partition storage share the same log and we can't make a 
separate persistent log for the tx state storage.

Also, the problem exists regardless of the zone type of the second 
partition: in-memory or persistent.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
