[ 
https://issues.apache.org/jira/browse/IGNITE-24346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy updated IGNITE-24346:
---------------------------------------
    Description: 
Tx finish records stored in tx state storage have to be retained until all 
write intents corresponding to the finish record's transaction are cleaned up 
(either converted to normal tuple versions or removed). In the current 
implementation, we might erase such records too early when destroying a 
dropped table.

The suspected scenario is:
 # tx1 is started, it writes to tables A and B, then it gets committed; its 
commit partition is a partition of A
 # Cleanup of tx1 is deferred for some reason (for example, the cleanup 
component has a large backlog of work)
 # A is dropped
 # Once the moment of A's drop falls below the low watermark (LWM), storages of 
A's partitions get destroyed (including the commit partition of tx1)
 # Cleanup of tx1 is attempted, but the tx state storage is destroyed, so the 
cleanup cannot proceed -> write intents of tx1 will remain unresolved forever

We need to check whether this is a possible scenario and, if it's possible, 
make sure that we keep tx state storages available for cleanup activities until 
all their transactions get cleaned up, even if this means deferring table 
partition destruction.
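
To make the proposed behaviour concrete, here is a minimal sketch of deferring 
storage destruction until no transactions await cleanup. It is not the actual 
Ignite code: the class name DeferredTxStateStorageDestruction and all of its 
methods are hypothetical and exist only for illustration; a real 
implementation would also have to handle persistence, restarts and 
replication.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

/**
 * Hypothetical sketch (not an Ignite API): tracks transactions whose finish
 * records live in one tx state storage and defers destruction of that storage
 * until every such transaction has been cleaned up.
 */
class DeferredTxStateStorageDestruction {
    /** Transactions that are finished but whose write intents are not yet cleaned up. */
    private final Set<UUID> txsAwaitingCleanup = new HashSet<>();

    /** Set when the owning table (or zone) has been dropped and its drop moment is below LWM. */
    private boolean destroyRequested;

    /** Set when the storage has actually been destroyed. */
    private boolean destroyed;

    /** Registers a finished transaction whose commit partition uses this storage. */
    synchronized void onTxFinished(UUID txId) {
        if (destroyed) {
            throw new IllegalStateException("Tx state storage is already destroyed");
        }
        txsAwaitingCleanup.add(txId);
    }

    /** Marks the transaction as cleaned up; may trigger a deferred destruction. */
    synchronized void onTxCleanedUp(UUID txId) {
        txsAwaitingCleanup.remove(txId);
        destroyIfPossible();
    }

    /** Requests destruction; it happens immediately only if nothing awaits cleanup. */
    synchronized void requestDestroy() {
        destroyRequested = true;
        destroyIfPossible();
    }

    synchronized boolean isDestroyed() {
        return destroyed;
    }

    /** Destroys the storage only when destruction was requested and no cleanup is pending. */
    private void destroyIfPossible() {
        if (destroyRequested && txsAwaitingCleanup.isEmpty()) {
            destroyed = true; // A real implementation would release the storage here.
        }
    }
}
```

In the suspected scenario, requestDestroy() would be called in step 4 while 
tx1 is still awaiting cleanup, so the storage would survive until 
onTxCleanedUp(tx1) is called in step 5.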

For per-zone tx state storages, to which we are currently switching (see 
IGNITE-22621), the problem remains: a transaction might span multiple zones, 
and the zone hosting its commit partition might be dropped and destroyed before 
the transaction is cleaned up (IGNITE-24345).

> Do not destroy tx state storage while its content might be needed for write 
> intent resolution
> ---------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-24346
>                 URL: https://issues.apache.org/jira/browse/IGNITE-24346
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
