[jira] [Updated] (IGNITE-24436) Do not clean index on tx cleanup if no index info is available

Roman Puchkovskiy (Jira) Fri, 07 Feb 2025 03:11:01 -0800


     [ 
https://issues.apache.org/jira/browse/IGNITE-24436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Roman Puchkovskiy updated IGNITE-24436:
---------------------------------------
    Description: 
The original problem is described in IGNITE-24368.

A couple of observations:
 # For write intents (WIs) of committed transactions, indexes are not used 
during cleanup, so we should not obtain them at all
 # For WIs of aborted transactions, leaving the tuple in the index does not 
cause any consistency problems; it only leaves some garbage in the index. The 
garbage consumes disk space and slows down reads via the index, but that's it

So the idea of a quick and dirty fix is to do the following when doing a write 
intent switch in PartitionReplicaListener and PartitionListener:
 # In StorageUpdateHandler, make the last parameter (which accepts a list of 
indexes) a Supplier
 # If the transaction is committed, don't call the supplier
 # In the supplier, call not the existing TableUtils#indexIdsAtRwTxBeginTs() 
but a variant of it (it might be called indexIdsAtRwTxBeginTsOrEmpty()) that 
does not fail if there is no catalog version or if that version doesn't 
contain indexes for the table, and returns an empty list instead
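
A minimal, self-contained sketch of the three steps above. It only illustrates 
the control flow and is not the actual Ignite code: the class 
LazyIndexLookupSketch, the method switchWriteIntent(), the Map used as a 
stand-in for the catalog, and both method signatures are assumptions made up 
for the illustration. Only the name indexIdsAtRwTxBeginTsOrEmpty comes from 
the proposal.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Illustration only: all types and signatures here are hypothetical. */
class LazyIndexLookupSketch {
    /**
     * Hypothetical "OrEmpty" variant. The catalog is modelled as a map from
     * table ID to index IDs; a null map stands for "no catalog version", and a
     * missing key stands for "no index info for the table". In both cases an
     * empty list is returned instead of failing.
     */
    static List<Integer> indexIdsAtRwTxBeginTsOrEmpty(Map<Integer, List<Integer>> catalog, int tableId) {
        if (catalog == null) {
            return List.of();
        }
        List<Integer> ids = catalog.get(tableId);
        return ids == null ? List.of() : ids;
    }

    /**
     * Hypothetical write intent switch. Returns the number of indexes that
     * would be cleaned. For a committed transaction the supplier is never
     * called, so missing index info cannot cause a failure.
     */
    static int switchWriteIntent(boolean committed, Supplier<List<Integer>> indexIds) {
        if (committed) {
            return 0; // Indexes are not used when committing a write intent.
        }
        // Aborted transaction: resolve indexes lazily; an empty list means
        // that some garbage stays in the index, which is acceptable.
        return indexIds.get().size();
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> catalog = Map.of(1, List.of(10, 11));
        Supplier<List<Integer>> known = () -> indexIdsAtRwTxBeginTsOrEmpty(catalog, 1);
        Supplier<List<Integer>> unknown = () -> indexIdsAtRwTxBeginTsOrEmpty(null, 1);

        System.out.println(switchWriteIntent(true, known));
        System.out.println(switchWriteIntent(false, known));
        System.out.println(switchWriteIntent(false, unknown));
    }
}
```

Passing a Supplier instead of an already computed list is what makes step 2 
possible: the potentially failing lookup is deferred until the abort path 
actually needs it.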

But first, an integration test has to be written to make sure that the 
analysis made in IGNITE-24368 was correct. The scenario is:
 # Start a cluster of 3 nodes with dataAvailabilityTime set to 1 second
 # Make sure transaction cleanups do not happen (the interval between cleanups 
can probably be configured)
 # Create a zone with 1 partition (not strictly necessary, the default of 25 
would also work, but it could be easier to debug with just 1) and 3 replicas
 # Create table A in the zone
 # Start an *explicit* transaction, make a put in it, commit the transaction
 # Create table B in the zone (so that the catalog version in which A was 
created is no longer the latest version)
 # Stop all nodes
 # Wait for dataAvailabilityTime to pass (1 second)
 # Start all 3 nodes and expect the start to fail

  was:
Original problem is described in IGNITE-24368.

A couple of observations:
 # For write intents (WIs) of committed transactions, indexes are actually not 
relevant for cleanup (they are not used), so we should not obtain them at all
 # For WIs of aborted transactions, if we don't remove the tuple from the 
index, this will not cause any consistency problems, just some garbage will 
remain in the index. The garbage will consume disk space and it will slow down 
reads via the index, but that's it

So the idea of a quick and dirty fix is to do the following when doing write 
intent switch in PartitionReplicaListener and PartitionListener:
 # In StorageUpdateHandler, make last parameter (accepting a list of indexes) a 
Supplier
 # If the transaction is committed, don't call the supplier
 # In the supplier, call not the existing TableUtils#indexIdsAtRwTxBeginTs(), 
but a variant of it (it might be called indexIdsAtRwTxBeginTsOrEmpty()) which 
will not fail if there is no catalog version or it doesn't contain indexes for 
the table, but it would just return an empty list

But first, an integration test has to be written to make sure that analysis 
made in IGNITE-24368 was correct. The scenario is:
 # Start a cluster of 3 nodes with dataAvailabilityTime set to 1 second
 # Make sure transaction cleanups do not happen (the interval between cleanups 
can be configured probably)
 # Create a zone with 1 partition (not necessarily, might be default 25, but 
could be easier to debug with just 1) and 3 replicas
 # Create table A in the zone
 # Start an *explicit* transaction, make a put in it, commit the transaction
 # Stop all nodes
 # Wait for dataAvailabilityTime to pass (1 second)
 # Start all 3 nodes and expect the start to fail


> Do not clean index on tx cleanup if no index info is available
> --------------------------------------------------------------
>
>                 Key: IGNITE-24436
>                 URL: https://issues.apache.org/jira/browse/IGNITE-24436
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
