[ https://issues.apache.org/jira/browse/IGNITE-24530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Evgeny Stanilovsky updated IGNITE-24530:
----------------------------------------
    Description: 
CPCC should not have an impact on RW transactions execution. It is proposed to proceed as follows:
# Initiate a new affinity replicator (new zone with new affinity).
# After [1], get the catalog activation time (time T1).
# Wait for all txs with beginTs < T1 to finish; this needs to start from IndexNodeFinishedRwTransactionsChecker or reuse it.
# Change the zone state (time T2).
# Routing of transactions to stores, according to their timestamps, must meet the following conditions:
## txs with beginTs < T1 are directed into the 'old' store\partition
## txs with beginTs >= T1 are directed into both
# Let's call the task that handles already stored rows the store replicator. It starts at T3 > T2. It needs to copy all rows which satisfy the predicate: T1 < commitTs < T2. Thus transactions with startTs < T1, and consequently with commitTs < T2, are replicated (it seems rows with unresolved intents need to be filtered out too). Rows with T1 <= startTs < T2 will be copied twice: through the store replicator and through the affinity replicator (a bit of write amplification here). Rows with startTs >= T2 will be replicated only through the affinity replicator.
# If the tx coordinator fails, a tx can become in-flight, i.e. its commit can already be enlisted into the execution queue on the primary replica of the tx commit partition. Such a tx can be committed *after* T2, and it won't be copied through either the affinity or the store replicator. We should not allow committing an RW transaction which started before T1 but tries to commit after T2. It seems the same logic, but for index purposes, is described in [2]; check also [3].

[1] https://issues.apache.org/jira/browse/IGNITE-24442
[2] https://issues.apache.org/jira/browse/IGNITE-22990
[3] schemacompat.SchemaCompatibilityValidator

  was:
CPCC should not have an impact on RW transactions execution.
It is proposed to proceed as follows:
# Initiate a new affinity replicator (new zone with new affinity).
# Change the catalog version (time T1); it seems this will be done at the same time it was locked [1].
# Wait for all txs with beginTs < T1 to finish; this needs to start from IndexNodeFinishedRwTransactionsChecker or reuse it.
# Change the zone state (time T2).
# Routing of transactions to stores, according to their timestamps, must meet the following conditions:
## txs with beginTs < T1 are directed into the 'old' store\partition
## txs with beginTs >= T1 are directed into both
# Let's call the task that handles already stored rows the store replicator. It starts at T3 > T2. It needs to copy all rows which satisfy the predicate: T1 < commitTs < T2. Thus transactions with startTs < T1, and consequently with commitTs < T2, are replicated (it seems rows with unresolved intents need to be filtered out too). Rows with T1 <= startTs < T2 will be copied twice: through the store replicator and through the affinity replicator (a bit of write amplification here). Rows with startTs >= T2 will be replicated only through the affinity replicator.
# If the tx coordinator fails, a tx can become in-flight, i.e. its commit can already be enlisted into the execution queue on the primary replica of the tx commit partition. Such a tx can be committed *after* T2, and it won't be copied through either the affinity or the store replicator. We should not allow committing an RW transaction which started before T1 but tries to commit after T2. It seems the same logic, but for index purposes, is described in [2]; check also [3].

[1] https://issues.apache.org/jira/browse/IGNITE-24442
[2] https://issues.apache.org/jira/browse/IGNITE-22990
[3] schemacompat.SchemaCompatibilityValidator

> CPCC. 
Provide correct implementation for new affinity replication switch
> ------------------------------------------------------------------------
>
>                 Key: IGNITE-24530
>                 URL: https://issues.apache.org/jira/browse/IGNITE-24530
>             Project: Ignite
>          Issue Type: Task
>          Components: sql
>    Affects Versions: 3.0
>            Reporter: Evgeny Stanilovsky
>            Priority: Major
>              Labels: ignite-3
>
> CPCC should not have an impact on RW transactions execution. It is proposed to proceed as follows:
> # Initiate a new affinity replicator (new zone with new affinity).
> # After [1], get the catalog activation time (time T1).
> # Wait for all txs with beginTs < T1 to finish; this needs to start from IndexNodeFinishedRwTransactionsChecker or reuse it.
> # Change the zone state (time T2).
> # Routing of transactions to stores, according to their timestamps, must meet the following conditions:
> ## txs with beginTs < T1 are directed into the 'old' store\partition
> ## txs with beginTs >= T1 are directed into both
> # Let's call the task that handles already stored rows the store replicator. It starts at T3 > T2. It needs to copy all rows which satisfy the predicate: T1 < commitTs < T2. Thus transactions with startTs < T1, and consequently with commitTs < T2, are replicated (it seems rows with unresolved intents need to be filtered out too). Rows with T1 <= startTs < T2 will be copied twice: through the store replicator and through the affinity replicator (a bit of write amplification here). Rows with startTs >= T2 will be replicated only through the affinity replicator.
> # If the tx coordinator fails, a tx can become in-flight, i.e. its commit can already be enlisted into the execution queue on the primary replica of the tx commit partition. Such a tx can be committed *after* T2, and it won't be copied through either the affinity or the store replicator. We should not allow committing an RW transaction which started before T1 but tries to commit after T2. It seems the same logic, but for index purposes, is described in [2]; check also [3].
>
> [1] https://issues.apache.org/jira/browse/IGNITE-24442
> [2] https://issues.apache.org/jira/browse/IGNITE-22990
> [3] schemacompat.SchemaCompatibilityValidator



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
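
The timestamp rules from the description (write routing by beginTs, which replicator copies a row, and the commit guard for txs that started before T1) can be sketched as plain predicates. This is only an illustration under assumptions: the class, enum, and method names are hypothetical and are not Ignite APIs, timestamps are modelled as plain long values, "commit after T2" is read as commitTs > T2, and the row classification follows the description literally by startTs.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch of the T1/T2 rules; none of these names exist in Ignite. */
public final class CpccSwitchSketch {
    /** Store that receives a transaction's writes. */
    public enum Target { OLD_STORE, NEW_STORE }

    /** Mechanism that brings a committed row into the new store. */
    public enum CopiedBy { STORE_REPLICATOR, AFFINITY_REPLICATOR }

    private CpccSwitchSketch() {
    }

    /** Write routing: txs with beginTs < T1 go to the old store only, all others go to both. */
    public static Set<Target> routeWrite(long beginTs, long t1) {
        return beginTs < t1 ? EnumSet.of(Target.OLD_STORE) : EnumSet.allOf(Target.class);
    }

    /** Row classification by startTs, as stated in the description. */
    public static Set<CopiedBy> copiedBy(long startTs, long t1, long t2) {
        if (startTs < t1) {
            // Written to the old store only, so the store replicator has to copy it.
            return EnumSet.of(CopiedBy.STORE_REPLICATOR);
        }
        if (startTs < t2) {
            // Copied twice: the write amplification mentioned in the description.
            return EnumSet.allOf(CopiedBy.class);
        }
        return EnumSet.of(CopiedBy.AFFINITY_REPLICATOR);
    }

    /** Commit guard: reject a tx that started before T1 and tries to commit after T2. */
    public static boolean commitAllowed(long beginTs, long commitTs, long t1, long t2) {
        // Assumption: "after T2" is strict; the exact boundary handling is not specified.
        return !(beginTs < t1 && commitTs > t2);
    }

    public static void main(String[] args) {
        long t1 = 10;
        long t2 = 20;
        System.out.println("route(beginTs=5):  " + routeWrite(5, t1));
        System.out.println("route(beginTs=12): " + routeWrite(12, t1));
        System.out.println("copiedBy(startTs=15): " + copiedBy(15, t1, t2));
        System.out.println("commitAllowed(5, 25): " + commitAllowed(5, 25, t1, t2));
    }
}
```

The guard is the only predicate here that rejects work: without it, a tx that started before T1 writes only to the old store, and a commit after T2 falls outside what the store replicator copies, so neither replicator would bring its rows into the new store.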