[jira] [Updated] (IGNITE-24530) CPCC. Provide correct implementation for new affinity replication switch

Evgeny Stanilovsky (Jira) Tue, 18 Feb 2025 05:16:08 -0800


     [ 
https://issues.apache.org/jira/browse/IGNITE-24530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Evgeny Stanilovsky updated IGNITE-24530:
----------------------------------------
    Description: 
CPCC should not affect the execution of RW transactions. The proposed approach
is:
# Initiate a new affinity replicator (a new zone with a new affinity).
# After [1], get the catalog activation time (time T1).
# Wait for all transactions with beginTs < T1 to finish; improve the
IndexNodeFinishedRwTransactionsChecker implementation or reuse it.
# Change the zone state (time T2).
# Writes must be directed to stores according to transaction timestamps, under
the following conditions:
## transactions with beginTs < T1 are directed into the 'old' store/partition
## transactions with beginTs >= T1 are directed into both
# Call the task that copies already stored rows the store replicator. It
starts at T3 > T2. It needs to copy all rows that satisfy the predicate: T1 <
commitTs < T2. Thus transactions with startTs < T1, and consequently with
commitTs < T2, are replicated (it seems rows with unresolved intents need to be
filtered too). Rows with T1 <= startTs < T2 will be copied twice: through the
store replicator and through the affinity replicator (a bit of write
amplification here). Rows with startTs >= T2 will be replicated only through
the affinity replicator.
# If the tx coordinator fails, a tx can remain in-flight, i.e. its commit may
already be enlisted in the execution queue on the primary replica of the tx
commit partition. Such a tx can be committed *after* T2, and it won't be copied
through either the affinity or the store replicator. We should not allow
committing an RW transaction that started before T1 but tries to commit after
T2. It seems the same logic, but for index purposes, is described in [2]; see
also [3].
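
The timestamp rules from the list above can be sketched as three small
predicates. This is only an illustration under stated assumptions: the class
and method names are hypothetical and do not exist in Ignite, timestamps are
modelled as plain long values rather than the hybrid timestamps Ignite uses,
and a null commitTs stands for a row whose write intent is not yet resolved.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch of the switch rules; none of these names are Ignite APIs. */
final class AffinitySwitchRules {
    /** The 'old' store/partition and the one owned by the new affinity. */
    enum Store { OLD, NEW }

    private AffinitySwitchRules() {
    }

    /** Write routing: beginTs < T1 goes to the old store only, beginTs >= T1 goes to both. */
    static Set<Store> writeTargets(long beginTs, long t1) {
        return beginTs < t1 ? EnumSet.of(Store.OLD) : EnumSet.of(Store.OLD, Store.NEW);
    }

    /**
     * Store replicator predicate as written in the description: T1 < commitTs < T2.
     * Assumption: a null commitTs models an unresolved write intent, which is skipped.
     */
    static boolean storeReplicatorCopies(Long commitTs, long t1, long t2) {
        return commitTs != null && t1 < commitTs && commitTs < t2;
    }

    /** Commit guard: a tx that began before T1 must not be allowed to commit after T2. */
    static boolean commitAllowed(long beginTs, long commitTs, long t1, long t2) {
        return !(beginTs < t1 && commitTs > t2);
    }

    public static void main(String[] args) {
        long t1 = 100L;
        long t2 = 200L;

        // A tx that began before T1 writes only to the old store.
        System.out.println(writeTargets(50L, t1));
        // A tx that began at or after T1 writes to both stores.
        System.out.println(writeTargets(100L, t1));
        // A row committed between T1 and T2 is picked up by the store replicator.
        System.out.println(storeReplicatorCopies(150L, t1, t2));
        // An in-flight tx from before T1 that tries to commit after T2 is rejected.
        System.out.println(commitAllowed(50L, 250L, t1, t2));
    }
}
```

Keeping the rules as pure functions of (beginTs, commitTs, T1, T2) makes the
boundary cases easy to test separately from the replication machinery.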

[1] https://issues.apache.org/jira/browse/IGNITE-24442
[2] https://issues.apache.org/jira/browse/IGNITE-22990
[3] schemacompat.SchemaCompatibilityValidator

  was:
CPCC should not affect the execution of RW transactions. The proposed approach
is:
# Initiate a new affinity replicator (a new zone with a new affinity).
# After [1], get the catalog activation time (time T1).
# Wait for all transactions with beginTs < T1 to finish; need to start with
IndexNodeFinishedRwTransactionsChecker or reuse it.
# Change the zone state (time T2).
# Writes must be directed to stores according to transaction timestamps, under
the following conditions:
## transactions with beginTs < T1 are directed into the 'old' store/partition
## transactions with beginTs >= T1 are directed into both
# Call the task that copies already stored rows the store replicator. It
starts at T3 > T2. It needs to copy all rows that satisfy the predicate: T1 <
commitTs < T2. Thus transactions with startTs < T1, and consequently with
commitTs < T2, are replicated (it seems rows with unresolved intents need to be
filtered too). Rows with T1 <= startTs < T2 will be copied twice: through the
store replicator and through the affinity replicator (a bit of write
amplification here). Rows with startTs >= T2 will be replicated only through
the affinity replicator.
# If the tx coordinator fails, a tx can remain in-flight, i.e. its commit may
already be enlisted in the execution queue on the primary replica of the tx
commit partition. Such a tx can be committed *after* T2, and it won't be copied
through either the affinity or the store replicator. We should not allow
committing an RW transaction that started before T1 but tries to commit after
T2. It seems the same logic, but for index purposes, is described in [2]; see
also [3].

[1] https://issues.apache.org/jira/browse/IGNITE-24442
[2] https://issues.apache.org/jira/browse/IGNITE-22990
[3] schemacompat.SchemaCompatibilityValidator


> CPCC. Provide correct implementation for new affinity replication switch
> ------------------------------------------------------------------------
>
>                 Key: IGNITE-24530
>                 URL: https://issues.apache.org/jira/browse/IGNITE-24530
>             Project: Ignite
>          Issue Type: Task
>          Components: sql
>    Affects Versions: 3.0
>            Reporter: Evgeny Stanilovsky
>            Priority: Major
>              Labels: ignite-3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
