[ https://issues.apache.org/jira/browse/IGNITE-20685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Lapin updated IGNITE-20685:
-------------------------------------
    Description:
h3. Motivation

Let's assume that the data node somehow found out that the transaction coordinator is dead, but the products of its activity, such as locks and write intents, are still present. In that case it’s necessary to check whether the corresponding transaction was actually finished and, if not, finish it.

h3. Definition of Done
 * A transaction X that detects (detection logic will be covered in a separate ticket) that the coordinator is dead awaits the commitPartition primary replica and sends initiateRecoveryReplicaRequest to it in a fully asynchronous manner. That is, transaction X should behave as specified by the deadlock prevention engine and should not explicitly wait for the initiateRecovery result. In fact, no direct response from initiate recovery is expected. Initiate recovery failover will be implemented in a different way.
 * The commit partition handles the given request somewhere. No-op handling is expected for now; a proper one will be added in IGNITE-20735.

Let's consider either TransactionStateResolver or TxManagerImpl as the initiateRecovery handler. TransactionStateResolver seems to be the best choice here; however, it should be refactored a bit, because it will no longer be only a state resolver.

h3. Implementation Notes
 * This ticket is trivial and should be considered a bridge between durable tx coordinator liveness detection and the corresponding initiateRecoveryReplicaRequest handling. Both items will be covered in separate tickets.

  was:
h3. Motivation

Let's assume that the data node somehow found out that the transaction coordinator is dead, but the products of its activity, such as locks and write intents, are still present. In that case it’s necessary to check whether the corresponding transaction was actually finished and, if not, finish it.

h3. Definition of Done
 * A transaction X that detects (detection logic will be covered in a separate ticket) that the coordinator is dead awaits the commitPartition primary replica and sends initiateRecoveryReplicaRequest to it in a fully asynchronous manner. That is, transaction X should behave as specified by the deadlock prevention engine and should not explicitly wait for the initiateRecovery result. In fact, no direct response from initiate recovery is expected. Initiate recovery failover will be implemented in a different way.
 * The commit partition handles the given request somewhere. No-op handling is expected for now; a proper one will be added in IGNITE-20735.

h3. Implementation Notes

This ticket is trivial and should be considered a bridge between durable tx coordinator liveness detection and the corresponding initiateRecoveryReplicaRequest handling. Both items will be covered in separate tickets.


> Implement ability to trigger transaction recovery
> -------------------------------------------------
>
>                 Key: IGNITE-20685
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20685
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Alexander Lapin
>            Priority: Major
>
> h3. Motivation
> Let's assume that the data node somehow found out that the transaction
> coordinator is dead, but the products of its activity, such as locks and write
> intents, are still present. In that case it’s necessary to check whether the
> corresponding transaction was actually finished and, if not, finish it.
> h3. Definition of Done
>  * A transaction X that detects (detection logic will be covered in a separate
> ticket) that the coordinator is dead awaits the commitPartition primary replica
> and sends initiateRecoveryReplicaRequest to it in a fully asynchronous manner.
> That is, transaction X should behave as specified by the deadlock prevention
> engine and should not explicitly wait for the initiateRecovery result. In fact,
> no direct response from initiate recovery is expected. Initiate recovery
> failover will be implemented in a different way.
>  * The commit partition handles the given request somewhere. No-op handling is
> expected for now; a proper one will be added in IGNITE-20735.
> Let's consider either TransactionStateResolver or TxManagerImpl as the
> initiateRecovery handler. TransactionStateResolver seems to be the best choice
> here; however, it should be refactored a bit, because it will no longer be only
> a state resolver.
> h3. Implementation Notes
>  * This ticket is trivial and should be considered a bridge between durable tx
> coordinator liveness detection and the corresponding
> initiateRecoveryReplicaRequest handling. Both items will be covered in
> separate tickets.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
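
The flow from the Definition of Done could look roughly like the sketch below: await the commit partition primary replica, send the recovery request without waiting for any result, and handle it as a no-op on the receiving side. This is only an illustration under assumptions, not the actual Ignite code. Apart from the request name taken from the ticket, every type and method here (PrimaryReplica, PlacementLookup, NoOpRecoveryHandler, triggerRecovery) is hypothetical, and plain CompletableFuture stands in for whatever messaging layer is really used.

```java
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

public class RecoveryTriggerSketch {
    /** Hypothetical message shape; the ticket only names initiateRecoveryReplicaRequest. */
    static final class InitiateRecoveryReplicaRequest {
        final UUID txId;

        InitiateRecoveryReplicaRequest(UUID txId) {
            this.txId = txId;
        }
    }

    /** Hypothetical handle to the primary replica of the commit partition. */
    interface PrimaryReplica {
        void send(InitiateRecoveryReplicaRequest request);
    }

    /** Hypothetical lookup that completes once the commit partition primary replica is known. */
    interface PlacementLookup {
        CompletableFuture<PrimaryReplica> awaitCommitPartitionPrimary(UUID txId);
    }

    /** No-op handling, as expected for now; the counter exists only to make the sketch observable. */
    static final class NoOpRecoveryHandler implements PrimaryReplica {
        final AtomicInteger received = new AtomicInteger();

        @Override
        public void send(InitiateRecoveryReplicaRequest request) {
            received.incrementAndGet();
        }
    }

    /**
     * Sends the recovery request once the primary replica is available.
     * The returned future only signals that the request was handed off; it carries no
     * recovery result, and the detecting transaction is not supposed to block on it.
     */
    static CompletableFuture<Void> triggerRecovery(UUID abandonedTxId, PlacementLookup lookup) {
        return lookup.awaitCommitPartitionPrimary(abandonedTxId)
                .thenAccept(primary -> primary.send(new InitiateRecoveryReplicaRequest(abandonedTxId)))
                // Failover is out of scope for this ticket, so failures are swallowed here.
                .exceptionally(err -> null);
    }

    public static void main(String[] args) {
        NoOpRecoveryHandler handler = new NoOpRecoveryHandler();
        CompletableFuture<PrimaryReplica> primary = new CompletableFuture<>();

        // The call returns immediately, even though the primary replica is not known yet.
        triggerRecovery(UUID.randomUUID(), txId -> primary);
        System.out.println("received before primary is known: " + handler.received.get());

        // Once the primary replica is resolved, the request is delivered.
        primary.complete(handler);
        System.out.println("received after primary is known: " + handler.received.get());
    }
}
```

The caller ignores the returned future on purpose, which matches the statement that no direct response from initiate recovery is expected.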