[
https://issues.apache.org/jira/browse/IGNITE-27435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexey Scherbakov updated IGNITE-27435:
---------------------------------------
Description:
Currently, an explicit transaction is committed only after all write inflights
have completed.
Waiting for them (see TransactionInflights.ReadWriteTxContext#waitNoInflights)
adds latency before the commit can begin.
This can be optimized in the following way:
1. Each inflight additionally tracks the corresponding key in the tx inflight
context. A key is removed from the set as soon as the inflight is replicated.
2. We introduce an additional persisted tx state: COMMITTING.
3. When a tx starts to commit, it synchronously replicates the COMMITTING state
instead of COMMITTED. The COMMITTING state additionally includes a list of keys
(ack list) that are still in flight just before the commit starts.
4. The tx commit is acked to the user as soon as the COMMITTING state is
replicated and all inflights are finished. This is the "implicit commit".
5. As soon as the tx is "implicitly committed", the COMMITTED state is
replicated asynchronously. After this, the tx is considered "explicitly
committed".
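The steps above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the actual Ignite code:
InflightKeyTracker, ImplicitCommit, the String keys, the replicateState
callback and the hardLimit parameter are all hypothetical names, and the
"synchronous" replication of COMMITTING is modelled as a future.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.function.BiFunction;

/** Hypothetical per-tx inflight context (not the real TransactionInflights API). */
final class InflightKeyTracker {
    // Key -> number of writes to that key which are not replicated yet.
    private final Map<String, Integer> inflightKeys = new LinkedHashMap<>();
    private final int hardLimit; // assumed configuration parameter

    InflightKeyTracker(int hardLimit) {
        this.hardLimit = hardLimit;
    }

    /** Step 1: remember the key when a write inflight starts. */
    synchronized void onWriteStarted(String key) {
        inflightKeys.merge(key, 1, Integer::sum);
    }

    /** Step 1: forget the key once its last outstanding write is replicated. */
    synchronized void onWriteReplicated(String key) {
        inflightKeys.computeIfPresent(key, (k, n) -> n == 1 ? null : n - 1);
    }

    /**
     * Step 3: snapshot of the keys still in flight (the ack list).
     * An empty Optional means the limit is exceeded and the caller should
     * fall back to waiting for all inflights before committing.
     * Whether the limit is inclusive is an assumption of this sketch.
     */
    synchronized Optional<List<String>> ackListForCommit() {
        if (inflightKeys.size() > hardLimit) {
            return Optional.empty();
        }
        return Optional.of(List.copyOf(inflightKeys.keySet()));
    }

    synchronized boolean noInflights() {
        return inflightKeys.isEmpty();
    }
}

final class ImplicitCommit {
    /**
     * Steps 3-5: replicate COMMITTING with the ack list, ack the user once it
     * is replicated and all inflights are done, then replicate COMMITTED
     * in the background.
     */
    static CompletableFuture<Void> commit(
            List<String> ackList,
            CompletableFuture<Void> inflightsDone,
            BiFunction<String, List<String>, CompletableFuture<Void>> replicateState) {
        CompletableFuture<Void> committing = replicateState.apply("COMMITTING", ackList);
        // "Implicit commit": COMMITTING is replicated and no inflights remain.
        CompletableFuture<Void> implicitCommit = CompletableFuture.allOf(committing, inflightsDone);
        // "Explicit commit" is not awaited by the user-facing future.
        implicitCommit.thenCompose(v -> replicateState.apply("COMMITTED", List.of()));
        return implicitCommit;
    }
}
```

The future returned by commit() completes at the "implicit commit" point, so
the user-visible latency no longer includes the replication of COMMITTED.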
We need changes in tx recovery. When an abandoned write intent is encountered,
the tx is in the COMMITTING state, and the corresponding key is in the ack list,
we need to check whether the write intent is fully replicated (by reading the
value from a majority and validating that it is the same write intent for the
given tx id). If not, the tx is rolled back.
If the number of inflight keys hits a hard limit (defined by a configuration
parameter), this optimization is disabled.
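One possible reading of the recovery check, as a minimal sketch. The class
and method names are hypothetical, and the way replica responses are
represented (a list holding the tx id that owns the write intent on each
replica, or null when there is none) is an assumption made for illustration.

```java
import java.util.List;
import java.util.Set;
import java.util.UUID;

/** Hypothetical recovery helper for a tx found in the COMMITTING state. */
final class CommittingRecovery {
    /**
     * True if a majority of replicas report a write intent for the key
     * that belongs to the given tx id.
     */
    static boolean fullyReplicated(UUID txId, List<UUID> intentOwners, int replicaCount) {
        int majority = replicaCount / 2 + 1;
        // A null entry (no write intent on that replica) never matches.
        long matching = intentOwners.stream().filter(txId::equals).count();
        return matching >= majority;
    }

    /** Decides whether the abandoned write intent forces a rollback of the tx. */
    static boolean mustRollBack(
            UUID txId, String key, Set<String> ackList,
            List<UUID> intentOwners, int replicaCount) {
        if (!ackList.contains(key)) {
            // The key was already replicated before the commit started,
            // so this check does not apply to it.
            return false;
        }
        return !fullyReplicated(txId, intentOwners, replicaCount);
    }
}
```

Keys outside the ack list are skipped because, by construction, they were
removed from the inflight set only after being replicated.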
was:
Currently, an explicit transaction is committed only after all write inflights
have completed.
This adds "tail" latency before the commit can begin.
This can be optimized in the following way:
1. Each inflight additionally tracks the corresponding key in the tx inflight
context. A key is removed from the set as soon as the inflight is replicated.
2. We introduce an additional persisted tx state: COMMITTING.
3. When a tx starts to commit, it synchronously replicates the COMMITTING state
instead of COMMITTED. The COMMITTING state additionally includes a list of keys
(ack list) that are still in flight just before the commit starts.
4. The tx commit is acked to the user as soon as the COMMITTING state is
replicated and all inflights are finished. This is the "implicit commit".
5. As soon as the tx is "implicitly committed", the COMMITTED state is
replicated asynchronously. After this, the tx is considered "explicitly
committed".
We need changes in tx recovery. When an abandoned write intent is encountered,
the tx is in the COMMITTING state, and the corresponding key is in the ack list,
we need to check whether the write intent is fully replicated. If not, the tx is
marked as rolled back.
If the number of inflight keys hits a hard limit (defined by a configuration
parameter), this optimization is disabled.
> Improve commit latency for explicit RW transactions
> ---------------------------------------------------
>
> Key: IGNITE-27435
> URL: https://issues.apache.org/jira/browse/IGNITE-27435
> Project: Ignite
> Issue Type: Improvement
> Reporter: Alexey Scherbakov
> Priority: Major
> Labels: ignite-3
> Fix For: 3.3
>
>
> Currently, an explicit transaction is committed only after all write
> inflights have completed.
> Waiting for them (see
> TransactionInflights.ReadWriteTxContext#waitNoInflights) adds latency before
> the commit can begin.
> This can be optimized in the following way:
> 1. Each inflight additionally tracks the corresponding key in the tx inflight
> context. A key is removed from the set as soon as the inflight is replicated.
> 2. We introduce an additional persisted tx state: COMMITTING.
> 3. When a tx starts to commit, it synchronously replicates the COMMITTING
> state instead of COMMITTED. The COMMITTING state additionally includes a list
> of keys (ack list) that are still in flight just before the commit starts.
> 4. The tx commit is acked to the user as soon as the COMMITTING state is
> replicated and all inflights are finished. This is the "implicit commit".
> 5. As soon as the tx is "implicitly committed", the COMMITTED state is
> replicated asynchronously. After this, the tx is considered "explicitly
> committed".
> We need changes in tx recovery. When an abandoned write intent is
> encountered, the tx is in the COMMITTING state, and the corresponding key is
> in the ack list, we need to check whether the write intent is fully replicated
> (by reading the value from a majority and validating that it is the same
> write intent for the given tx id). If not, the tx is rolled back.
> If the number of inflight keys hits a hard limit (defined by a configuration
> parameter), this optimization is disabled.