[
https://issues.apache.org/jira/browse/IGNITE-27435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexey Scherbakov updated IGNITE-27435:
---------------------------------------
Description:
Currently, an explicit transaction is committed only after all of its write
inflights have completed.
This adds latency before the commit can begin, caused by waiting (see
TransactionInflights.ReadWriteTxContext#waitNoInflights).
This can be optimized in the following way:
1. Each inflight additionally tracks the corresponding key in the tx inflight
context. A key is removed from the set as soon as its inflight is replicated.
2. We introduce an additional persisted tx state: COMMITTING.
3. When a tx starts to commit, it synchronously replicates the COMMITTING state
instead of COMMITTED. The COMMITTING state additionally includes a list of keys
(the ack list) that are still in flight just before the commit starts.
4. The tx commit is acked to the user as soon as the COMMITTING state is
replicated and all inflights are finished. This is the "implicit commit".
5. As soon as the tx is "implicitly committed", the COMMITTED state is
replicated asynchronously. After this, the tx is considered "explicitly
committed".
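The five steps above can be sketched roughly as follows. This is only an
illustration under stated assumptions: CommitSketch, SketchTxState and the
replicateState callback are hypothetical names, not Ignite APIs, keys are plain
strings, at most one inflight per key is assumed, and failure handling is
omitted.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

// Hypothetical names throughout; this is not the Ignite implementation.
enum SketchTxState { PENDING, COMMITTING, COMMITTED }

class CommitSketch {
    // Step 1: key -> replication future of the write that is still in flight.
    // Assumes at most one inflight per key to keep the sketch short.
    private final Map<String, CompletableFuture<Void>> inflights = new ConcurrentHashMap<>();
    private volatile SketchTxState state = SketchTxState.PENDING;
    private volatile Set<String> ackList = Set.of();

    void write(String key, CompletableFuture<Void> replicated) {
        inflights.put(key, replicated);
        // The key leaves the set as soon as its inflight is replicated.
        replicated.thenRun(() -> inflights.remove(key, replicated));
    }

    // replicateState is an assumed hook that persists a tx state (plus ack list).
    CompletableFuture<Void> commit(
            BiFunction<SketchTxState, Set<String>, CompletableFuture<Void>> replicateState) {
        // Step 3: snapshot the keys still in flight; this becomes the ack list.
        Map<String, CompletableFuture<Void>> pending = Map.copyOf(inflights);
        ackList = pending.keySet();
        CompletableFuture<Void> committing = replicateState
                .apply(SketchTxState.COMMITTING, ackList)
                .thenRun(() -> state = SketchTxState.COMMITTING);

        // Step 4: implicit commit = COMMITTING replicated AND all inflights finished.
        CompletableFuture<Void> inflightsDone =
                CompletableFuture.allOf(pending.values().toArray(new CompletableFuture[0]));
        CompletableFuture<Void> implicitCommit =
                CompletableFuture.allOf(committing, inflightsDone);

        // Step 5: COMMITTED is replicated afterwards; the caller does not wait for it.
        implicitCommit
                .thenCompose(v -> replicateState.apply(SketchTxState.COMMITTED, Set.of()))
                .thenRun(() -> state = SketchTxState.COMMITTED);

        return implicitCommit; // the user is acked when this completes
    }

    SketchTxState state() { return state; }

    Set<String> ackList() { return ackList; }
}
```

In this sketch the latency gain comes from replicating COMMITTING in parallel
with the remaining inflights instead of waiting for them first.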
We need changes in the write intent resolution logic on the commit path.
When a write intent is encountered and the tx is in the COMMITTING state on the
commit partition path, we need to attempt to resolve the tx state.
If the coordinator is alive, we wait some time for the COMMITTED state - it
should arrive naturally via the COMMITTING->COMMITTED step.
If the coordinator is dead or the timeout has passed, we try to resolve the tx
state by validating successful replication of all keys from the ack list.
If multiple write intents are being resolved concurrently, the others wait for
the outcome of the first request.
A write intent remains unavailable until the tx state is resolved.
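A minimal sketch of this resolution path, under assumptions: the class name,
the committedArrived future, the coordinatorAlive flag and the isReplicated
check are all hypothetical stand-ins rather than Ignite APIs, and treating a
failed ack-list validation as "not committed" is an assumption that the
description does not spell out.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

// Hypothetical sketch; not the Ignite write intent resolution code.
class CommittingTxResolver {
    // One resolution attempt per tx: later write intents join the first request.
    // Entries are never evicted here; a real implementation would clean them up.
    private final ConcurrentHashMap<String, CompletableFuture<Boolean>> attempts =
            new ConcurrentHashMap<>();

    // Completes with true if the tx is resolved as committed, false otherwise.
    CompletableFuture<Boolean> resolve(
            String txId,
            Set<String> ackList,
            boolean coordinatorAlive,
            CompletableFuture<Void> committedArrived,
            long timeoutMillis,
            Predicate<String> isReplicated) {
        return attempts.computeIfAbsent(txId, id -> {
            if (!coordinatorAlive) {
                // Coordinator is dead: validate replication of every ack-list key.
                return CompletableFuture.completedFuture(
                        ackList.stream().allMatch(isReplicated));
            }
            // Coordinator is alive: wait a bounded time for COMMITTED to arrive,
            // then fall back to ack-list validation on timeout (or failure).
            return committedArrived
                    .thenApply(v -> true)
                    .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
                    .exceptionally(e -> ackList.stream().allMatch(isReplicated));
        });
    }
}
```

Readers of the write intent would wait on the returned future, which matches
the intent staying unavailable until the tx state is resolved.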
If the number of inflight keys hits a hard limit (defined by some configuration
parameter), this optimization is disabled.
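One way to picture the hard limit is sketched below. BoundedKeyTracker and
maxTrackedKeys are hypothetical names; the description does not name the
configuration parameter. Disabling the optimization for the rest of the tx once
the limit is exceeded, and triggering on "exceeds" rather than "reaches", are
assumptions.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of bounding the tracked inflight keys per tx.
class BoundedKeyTracker {
    private final int maxTrackedKeys; // assumed configuration parameter
    private final Set<String> keys = ConcurrentHashMap.newKeySet();
    private volatile boolean fastCommitEnabled = true;

    BoundedKeyTracker(int maxTrackedKeys) {
        this.maxTrackedKeys = maxTrackedKeys;
    }

    // Returns true while the tx can still use the COMMITTING fast path.
    boolean track(String key) {
        if (!fastCommitEnabled) {
            return false;
        }
        keys.add(key);
        if (keys.size() > maxTrackedKeys) {
            // Too many keys for an ack list: fall back to waiting for all
            // inflights before commit, as before the optimization.
            fastCommitEnabled = false;
            keys.clear();
        }
        return fastCommitEnabled;
    }

    // Called when the inflight for this key has been replicated.
    void replicated(String key) {
        keys.remove(key);
    }

    boolean fastCommitEnabled() {
        return fastCommitEnabled;
    }
}
```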
was:
Currently, an explicit transaction is committed only after all of its write
inflights have completed.
This adds latency before the commit can begin, caused by waiting (see
TransactionInflights.ReadWriteTxContext#waitNoInflights).
This can be optimized in the following way:
1. Each inflight additionally tracks the corresponding key in the tx inflight
context. A key is removed from the set as soon as its inflight is replicated.
2. We introduce an additional persisted tx state: COMMITTING.
3. When a tx starts to commit, it synchronously replicates the COMMITTING state
instead of COMMITTED. The COMMITTING state additionally includes a list of keys
(the ack list) that are still in flight just before the commit starts.
4. The tx commit is acked to the user as soon as the COMMITTING state is
replicated and all inflights are finished. This is the "implicit commit".
5. As soon as the tx is "implicitly committed", the COMMITTED state is
replicated asynchronously. After this, the tx is considered "explicitly
committed".
We need changes in tx recovery.
When a write intent is encountered and the tx is in the COMMITTING state on the
commit partition path, we need to attempt to resolve the tx state.
If the coordinator is alive, we wait some time for the COMMITTED state - it
should arrive naturally via the COMMITTING->COMMITTED step.
If the coordinator is dead or the timeout has passed, we try to resolve the tx
state by validating successful replication of all keys from the ack list.
If multiple write intents are being resolved concurrently, the others wait for
the outcome of the first request.
A write intent remains unavailable until the tx state is resolved.
If the number of inflight keys hits a hard limit (defined by some configuration
parameter), this optimization is disabled.
> Improve commit latency for explicit RW transactions
> ---------------------------------------------------
>
> Key: IGNITE-27435
> URL: https://issues.apache.org/jira/browse/IGNITE-27435
> Project: Ignite
> Issue Type: Improvement
> Reporter: Alexey Scherbakov
> Priority: Major
> Labels: ignite-3
> Fix For: 3.3