[ https://issues.apache.org/jira/browse/IGNITE-24436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Philipp Shergalis updated IGNITE-24436:
---------------------------------------
    Description:
Original problem is described in IGNITE-24368.

*What was done:* if the catalog version that was relevant at the time of the transaction is not available, indexes from the latest catalog version are used instead.

Could not reproduce the original scenario fast enough, so a simple unit test was created instead. The test could be written under the ticket above.

  was:
Original problem is described in IGNITE-24368.

A couple of observations:
# For write intents (WIs) of committed transactions, indexes are not relevant for cleanup (they are not used), so we should not obtain them at all
# For WIs of aborted transactions, failing to remove the tuple from the index will not cause any consistency problems; some garbage will simply remain in the index. The garbage will consume disk space and slow down reads via the index, but that's it

So the idea of a quick and dirty fix is to do the following when doing a write intent switch in PartitionReplicaListener and PartitionListener:
# In StorageUpdateHandler, make the last parameter (accepting a list of indexes) a Supplier
# If the transaction is committed, don't call the supplier
# In the supplier, call not the existing TableUtils#indexIdsAtRwTxBeginTs(), but a variant of it (it might be called indexIdsAtRwTxBeginTsOrEmpty()) which does not fail if there is no catalog version or if it doesn't contain indexes for the table, and instead returns an empty list

But first, an integration test has to be written to make sure that the analysis made in IGNITE-24368 was correct.
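As a rough illustration of the supplier-based idea in the list above, here is a minimal Java sketch. It is not the actual Ignite code: the `Handler` class, the `switchWriteIntents` method, and the map-based catalog are hypothetical stand-ins, and only the name `indexIdsAtRwTxBeginTsOrEmpty` comes from the suggestion in this ticket (its real signature would likely differ).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

final class WriteIntentSwitchSketch {
    /** Hypothetical stand-in for StorageUpdateHandler; not the real Ignite class. */
    static final class Handler {
        /** IDs of indexes from which the aborted tuple was removed. */
        final List<Integer> cleanedIndexIds = new ArrayList<>();

        /** Hypothetical method: the index list is passed lazily as a Supplier. */
        void switchWriteIntents(boolean commit, Supplier<List<Integer>> indexIds) {
            if (commit) {
                // Committed write intents do not use indexes: never call the supplier.
                return;
            }
            // Aborted: clean only the indexes that could be resolved (possibly none).
            cleanedIndexIds.addAll(indexIds.get());
        }
    }

    /**
     * Assumed shape of the proposed variant: returns an empty list instead of
     * failing when the catalog version is no longer available.
     * The catalog is modelled here as a plain map (an assumption).
     */
    static List<Integer> indexIdsAtRwTxBeginTsOrEmpty(
            Map<Integer, List<Integer>> indexIdsByCatalogVersion, int catalogVersion) {
        return indexIdsByCatalogVersion.getOrDefault(catalogVersion, List.of());
    }

    public static void main(String[] args) {
        // Only catalog version 2 survives; version 1 is assumed to be gone.
        Map<Integer, List<Integer>> catalog = Map.of(2, List.of(10, 11));
        Handler handler = new Handler();
        handler.switchWriteIntents(false, () -> indexIdsAtRwTxBeginTsOrEmpty(catalog, 1));
        System.out.println(handler.cleanedIndexIds);
    }
}
```

With the index list behind a supplier, the commit path never touches the catalog, and the abort path degrades to leaving garbage in the index rather than failing.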
The scenario (parameterized with whether a transaction is committed or rolled back) is:
# Start a cluster of 3 nodes with dataAvailabilityTime set to 1 second
# Make sure transaction cleanups do not happen (the interval between cleanups can probably be configured)
# Create a zone with 1 partition (not strictly necessary, the default of 25 would also work, but it could be easier to debug with just 1) and 3 replicas
# Create table A in the zone
# Start an *explicit* transaction, make a put in it, commit/rollback
# Create table B in the zone (so that the catalog version in which A was created is no longer the freshest version)
# Stop all nodes
# Wait for dataAvailabilityTime to pass (1 second)
# Start all 3 nodes and expect the start to fail

After the fix is made, it would be great to also add a test that makes sure that if this happens for a rolled-back transaction, after a restart we can still make a put of the same key as the one that was rolled back.


> Do not clean index on tx cleanup if no index info is available
> --------------------------------------------------------------
>
>                 Key: IGNITE-24436
>                 URL: https://issues.apache.org/jira/browse/IGNITE-24436
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Roman Puchkovskiy
>            Assignee: Philipp Shergalis
>            Priority: Major
>              Labels: ignite-3
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Original problem is described in IGNITE-24368.
> *What was done:* if the catalog version that was relevant at the time of the
> transaction is not available, indexes from the latest catalog version are
> used instead.
>
> Could not reproduce the original scenario fast enough, so a simple unit test
> was created instead. The test could be written under the ticket above.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)