Dmitry Konstantinov created CASSANDRA-21019:
-----------------------------------------------
Summary: Memtable allocator: separate memory usage tracking and
limit checking
Key: CASSANDRA-21019
URL: https://issues.apache.org/jira/browse/CASSANDRA-21019
Project: Apache Cassandra
Issue Type: Improvement
Components: Local/Memtable
Reporter: Dmitry Konstantinov
Assignee: Dmitry Konstantinov

The following optimization idea has been suggested by @blambov in
CASSANDRA-20226:
{quote}
There's another option to consider here. The allocation mechanism does not need
to check the limit for individual cell writes. We could just as well track the
usage of a mutation in a single {{allocate}} call after it completes, or track
the allocations with a {{LongAdder}} without checking if the limit is hit, and
check if we need to wait for room before starting to apply a mutation.

We use the {{allocate}} code to decide:
- whether to initiate a flush, when the chosen memory limit is filled to some
ratio
- whether to pause accepting writes, when the chosen memory limit has been
exhausted

For the former use, there is absolutely no benefit in making these decisions at
the individual allocation level, as we will wait for the mutation to complete
anyway before flushing anything. For the latter, I'd argue that the
allocation-level tracking is actually hurting us. The reason for this is that
we can have the limit be hit at any time during the application of a mutation,
holding multiple locks (which necessitates the complexity of the {{isBlocking}}
mutation signal), a partial copy of the mutation already written to the
memtable structures, and a likely expanded version of the mutation to be
applied on heap, keeping hold of more total memory than we would if we allowed
the operation to continue.

If, instead, we check the allocation limits _before starting_ a mutation and,
once started, allow it to fully progress to completion, we can avoid this
situation at the cost of being somewhat late to notice that the limit has been
reached. This means that the limit will be breached, but this also happens as
it stands now because we will permit operations to run to completion if the
memtable they have been marked for is scheduled for a flush – which is
effectively the same thing as not having noticed the memory limit would be
breached by this mutation at the time when we decided to start it.
{quote}
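
To make the quoted idea concrete, here is a minimal sketch of how tracking and limit checking could be separated. It is an illustration under stated assumptions, not the existing memtable allocator code: the class {{MemtableUsageSketch}}, its methods, and the byte values are hypothetical names invented for this example. Per-cell allocations only add to a {{LongAdder}}, the limit is checked once before a mutation starts, and the flush decision is taken after the mutation completes.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: none of these names exist in Cassandra.
final class MemtableUsageSketch {
    private final LongAdder used = new LongAdder();
    private final long limitBytes;
    private final long flushThresholdBytes;

    MemtableUsageSketch(long limitBytes, double flushRatio) {
        this.limitBytes = limitBytes;
        this.flushThresholdBytes = (long) (limitBytes * flushRatio);
    }

    // Hot path: record the allocation, never check the limit here.
    void track(long bytes) { used.add(bytes); }

    // Called when memory is reclaimed, e.g. after a flush.
    void release(long bytes) { used.add(-bytes); }

    long usedBytes() { return used.sum(); }

    // Checked once, before a mutation starts.
    boolean hasRoom() { return used.sum() < limitBytes; }

    // Checked after a mutation completes, to decide whether to start a flush.
    boolean shouldFlush() { return used.sum() >= flushThresholdBytes; }

    // Admits the mutation only if there is room at the start; once admitted,
    // it runs to completion even if that pushes usage past the limit.
    // Returns false when the caller should wait for room instead.
    boolean tryApply(long[] cellSizes) {
        if (!hasRoom()) return false;
        for (long size : cellSizes) track(size);
        return true;
    }

    public static void main(String[] args) {
        MemtableUsageSketch m = new MemtableUsageSketch(100, 0.5);
        System.out.println(m.tryApply(new long[] {30, 30})); // true, usage is now 60
        System.out.println(m.shouldFlush());                 // true, 60 >= 50
        System.out.println(m.tryApply(new long[] {30, 30})); // true, usage overshoots to 120
        System.out.println(m.tryApply(new long[] {1}));      // false, no room at start
    }
}
```

The second mutation in the usage example is admitted at 60 bytes and finishes at 120, which is the "somewhat late to notice" overshoot described in the quote. A real implementation would also have to block and wake waiting writers, which this sketch leaves out.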