[ 
https://issues.apache.org/jira/browse/IGNITE-24817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denis Chudov updated IGNITE-24817:
----------------------------------
    Description: 
In in-memory groups, we have persistent transaction state storage in order to 
keep the transaction state until we are sure that all write intents are 
switched, even if the group restarts. However, the log is still in-memory, so 
if the log is not flushed and the commit partition group restarts or loses 
majority after the transaction finishes, before we were able to switch the 
write intents, we will lose data consistency.

Consider the case:
 * Transaction tx0 is started and two partitions are enlisted: 1_part_0 
(in-memory zone) and 2_part_0 (another zone). 1_part_0 is the commit partition.
 * tx0 is committed. Right after that, the commit partition group is restarted 
and the transaction state is lost.
 * We were not able to finish the cleanup phase of tx0, but for the user it is 
committed (because the state was written to the commit partition group before 
the restart).
 * The coordinator of tx0 also leaves the cluster.
 * There is a write intent in 2_part_0, but we don't know the transaction 
state. According to the regular transaction recovery procedure, we will roll 
back this transaction.

As a result, tx0 is committed, but the changes it made in 2_part_0 are lost.
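
The sequence above can be replayed with a small toy model. This is only a 
sketch to illustrate the failure, not Ignite code: the class and method names 
(TxRecoverySketch, CommitPartition, restartWithoutFlush, resolveWriteIntent) 
are hypothetical, and the recovery rule is an assumption based on the 
description, namely that an unknown state plus a missing coordinator leads to 
rollback.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model of the scenario; every name here is hypothetical, not Ignite API. */
class TxRecoverySketch {
    enum TxState { PENDING, COMMITTED, ABORTED }

    /** Commit partition whose tx state is only as durable as its in-memory log. */
    static class CommitPartition {
        private final Map<String, TxState> txStates = new HashMap<>();

        void finish(String txId, TxState state) {
            txStates.put(txId, state);
        }

        /** Models a group restart that happens before the in-memory log is flushed. */
        void restartWithoutFlush() {
            txStates.clear();
        }

        Optional<TxState> stateOf(String txId) {
            return Optional.ofNullable(txStates.get(txId));
        }
    }

    /**
     * Assumed recovery rule: use the stored state if it exists; otherwise keep
     * waiting while the coordinator is alive, and roll back once it is gone.
     */
    static TxState resolveWriteIntent(String txId, CommitPartition commitPartition,
            boolean coordinatorAlive) {
        Optional<TxState> known = commitPartition.stateOf(txId);
        if (known.isPresent()) {
            return known.get();
        }
        return coordinatorAlive ? TxState.PENDING : TxState.ABORTED;
    }

    /** Replays the steps; returns {state the user saw, state recovery decided}. */
    static TxState[] simulate() {
        CommitPartition part1 = new CommitPartition(); // stands in for 1_part_0

        part1.finish("tx0", TxState.COMMITTED);
        TxState seenByUser = part1.stateOf("tx0")
                .orElseThrow(IllegalStateException::new);

        part1.restartWithoutFlush();      // transaction state is lost
        boolean coordinatorAlive = false; // coordinator left the cluster

        // The write intent in 2_part_0 is resolved against the commit partition.
        TxState decided = resolveWriteIntent("tx0", part1, coordinatorAlive);
        return new TxState[] {seenByUser, decided};
    }

    public static void main(String[] args) {
        TxState[] result = simulate();
        System.out.println("user saw: " + result[0]
                + ", recovery decided: " + result[1]);
    }
}
```

Running main prints "user saw: COMMITTED, recovery decided: ABORTED", which is 
the inconsistency described above. If the restart step is skipped, the stored 
COMMITTED state is found and the write intent is resolved correctly.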

At the same time, there is a single Raft log per in-memory group, so the tx 
state storage and the partition storage share the same log, and we can't make 
a separate persistent log for the tx state storage.

The problem exists regardless of whether the second partition's zone is 
in-memory or persistent.

The problem is also relevant for disaster recovery scenarios. We will probably 
need to stop lease prolongation after performing resetPartition.

 


> Transactional guarantees become unreliable in multi-zone transactions 
> including in-memory zones
> -----------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-24817
>                 URL: https://issues.apache.org/jira/browse/IGNITE-24817
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Denis Chudov
>            Priority: Major
>              Labels: ignite-3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
