[ https://issues.apache.org/jira/browse/IGNITE-24343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Lapin updated IGNITE-24343:
-------------------------------------
    Description:
h3. Motivation

Side note: we are not consistent in what commitPartitionId means. Usually we assume that it is tableId + partitionIndex; however, in the storage API, e.g. in PartitionDataStorage#addWrite, commitPartitionId itself means the partition index and comes along with a dedicated commitTableId. Below I will assume that commitPartitionId is a combination of table/zoneId and partitionIndex.
 * CommitPartition is the partition where the persisted txnState is stored. The state is atomically switched within the TxFinish transaction phase.
 * CommitPartitionId is used to evaluate writeIntents through the commitPartition path.
 * Write Intent (WI) resolution is the process that determines whether a WI should be considered as applied (either committed or aborted) or ignored because the corresponding transaction is in the PENDING state. Simply, for WI(txId):
 ** Tx(txId).isPending() -> ignore WI (return previous value, or null if there's no such value).
 ** Tx(txId).isAborted() -> ignore WI (return previous value, or null if there's no such value).
 ** Tx(txId).isCommited() -> consider WI as a regular value.
 * There are 3 different paths of WI resolution:
 ** Local path. Every node that participates in the tx flow has a volatile txnStateMap, so it is worth checking locally whether we already know the state of the corresponding transaction.
 ** If the local path is not applicable, we use the Coordinator Path by requesting the txn state from the corresponding coordinator.
 ** If the Coordinator Path is also not applicable, we use the Commit Partition Path. Hence it is required to know where the commit partition is.
 ** For more details please check [IEP-91|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211885498#IEP91:Transactionprotocol-ROtransactions.1]. Worth mentioning that the IEP is a bit outdated: both RO and RW transactions may see a WI and thus perform WI resolution.
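The resolution rules and the three-path fallback described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Ignite's actual implementation: the class WriteIntentResolutionSketch, the TxState enum, and the two Function-based path stubs are hypothetical names introduced here, and values are simplified to strings.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch of WI resolution; names do not mirror Ignite's real API. */
final class WriteIntentResolutionSketch {
    enum TxState { PENDING, ABORTED, COMMITTED }

    /** Local path: volatile per-node map of known transaction states. */
    private final Map<UUID, TxState> localTxnStateMap = new ConcurrentHashMap<>();

    /** Coordinator path stub: empty result means the path is not applicable. */
    private final Function<UUID, Optional<TxState>> coordinatorPath;

    /** Commit partition path stub: assumed to always know the persisted state. */
    private final Function<UUID, TxState> commitPartitionPath;

    WriteIntentResolutionSketch(
            Function<UUID, Optional<TxState>> coordinatorPath,
            Function<UUID, TxState> commitPartitionPath) {
        this.coordinatorPath = coordinatorPath;
        this.commitPartitionPath = commitPartitionPath;
    }

    /** Remembers a transaction state locally, as a tx participant would. */
    void recordLocalState(UUID txId, TxState state) {
        localTxnStateMap.put(txId, state);
    }

    /** Tries the local path, then the coordinator path, then the commit partition path. */
    TxState resolveTxState(UUID txId) {
        TxState local = localTxnStateMap.get(txId);
        if (local != null) {
            return local;
        }
        return coordinatorPath.apply(txId)
                .orElseGet(() -> commitPartitionPath.apply(txId));
    }

    /**
     * Returns the value a reader should see: the write intent only if its
     * transaction is committed, otherwise the previous value (possibly null).
     */
    String read(UUID txId, String writeIntentValue, String previousValue) {
        return resolveTxState(txId) == TxState.COMMITTED ? writeIntentValue : previousValue;
    }
}
```

In this sketch the commit partition path is the last resort, which is why a reader has to be able to locate the commit partition from the commitPartitionId stored with the write intent.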
Seems that we are done with [taxidermy|https://en.wikipedia.org/wiki/Taxonomy].

h3. Definition of Done
 * Within the collocation track we store zoneId + partitionIndex as commitPartitionId instead of tableId + partitionIndex.
 * WI resolution works on top of ZonePartitionId as commitPartitionId.

h3. Implementation Notes
 * As an entry point we may start with PartitionDataStorage#addWrite. Depending on whether it is the collocation track or the common one, it is possible to propagate either zoneId or tableId as an int without updating the internal storage structure.
 * That means that UpdateCommand.tablePartitionId should be adjusted or extended with zonePartitionId, which requires adjusting ReadWriteReplicaRequest.commitPartitionId in a similar manner, which in turn requires adjusting InternalTransaction.commitPartition. Long story short, we will find ourselves in methods like org.apache.ignite.internal.sql.engine.prepare.IgniteRelShuttle#enlist(org.apache.ignite.internal.sql.engine.rel.SourceAwareIgniteRel), where we retrieve tableId from the relation:
int tableId = rel.getTable().unwrap(IgniteTable.class).id();
It seems that we have two options here: either propagate zoneId from the very beginning of the stack, or add a Catalog dependency to ExecutionService and similar components in order to convert tableId to zoneId in an idempotent manner. That is, in theory it is possible to move tables between zones, so zoneId must be derived from tableId consistently, e.g. by requesting the catalog as of tx start time.
 * On the other hand, ReadResult has int commitTableId, which should be renamed to commitZoneId or some generalized option.
 * Corresponding changes in the WI flow should also be implemented.
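To make the idea of deriving the commit partition ID concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: CommitPartitionIdSketch, ZoneLookup, CommitPartitionId, and the collocationEnabled flag are hypothetical and do not correspond to Ignite's real classes or Catalog API. It only shows the shape of the choice between tableId and zoneId, with the zone resolved through a lookup that is assumed to be pinned to the transaction start time.

```java
import java.util.Objects;

/** Hypothetical sketch; type and method names are illustrative, not Ignite's API. */
final class CommitPartitionIdSketch {
    /** Stand-in for a catalog view fixed at tx start time: table ID to zone ID. */
    interface ZoneLookup {
        int zoneIdByTableId(int tableId);
    }

    /** Generalized ID: objectId is a zoneId in the collocation track, a tableId otherwise. */
    static final class CommitPartitionId {
        final int objectId;
        final int partitionIndex;

        CommitPartitionId(int objectId, int partitionIndex) {
            this.objectId = objectId;
            this.partitionIndex = partitionIndex;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof CommitPartitionId)) {
                return false;
            }
            CommitPartitionId that = (CommitPartitionId) o;
            return objectId == that.objectId && partitionIndex == that.partitionIndex;
        }

        @Override
        public int hashCode() {
            return Objects.hash(objectId, partitionIndex);
        }
    }

    /**
     * Builds the commit partition ID. Using the same tx-start catalog view for
     * every call within a transaction keeps the tableId-to-zoneId mapping stable
     * even if the table is later moved to another zone.
     */
    static CommitPartitionId resolve(
            boolean collocationEnabled,
            ZoneLookup catalogAtTxStart,
            int tableId,
            int partitionIndex) {
        int objectId = collocationEnabled
                ? catalogAtTxStart.zoneIdByTableId(tableId)
                : tableId;
        return new CommitPartitionId(objectId, partitionIndex);
    }
}
```

Because the object ID stays a plain int in both tracks, a storage-level method that already accepts an int table ID plus a partition index would not need a structural change in this sketch; only the meaning of the int differs.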
> Use ZonePartitionId as commitPartitionId
> ----------------------------------------
>
>                 Key: IGNITE-24343
>                 URL: https://issues.apache.org/jira/browse/IGNITE-24343
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Alexander Lapin
>            Priority: Major
>              Labels: ignite-3