Dmitry Konstantinov created CASSANDRA-21019:
-----------------------------------------------

             Summary: Memtable allocator: separate memory usage tracking and limit checking
                 Key: CASSANDRA-21019
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21019
             Project: Apache Cassandra
          Issue Type: Improvement
          Components: Local/Memtable
            Reporter: Dmitry Konstantinov
            Assignee: Dmitry Konstantinov


The following optimization idea has been suggested by @blambov in 
CASSANDRA-20226:

{quote}

There's another option to consider here. The allocation mechanism does not need 
to check the limit for individual cell writes. We could just as well track the 
usage of a mutation in a single {{allocate}} call after it completes, or track 
the allocations with a {{LongAdder}} without checking if the limit is hit, and 
check if we need to wait for room before starting to apply a mutation.

We use the {{allocate}} code to decide:
 - whether to initiate a flush, when the chosen memory limit is filled to some 
ratio
 - whether to pause accepting writes, when the chosen memory limit has been 
exhausted

For the former use, there is absolutely no benefit in making these decisions at 
the individual allocation level, as we will wait for the mutation to complete 
anyway before flushing anything. For the latter, I'd argue that the 
allocation-level tracking is actually hurting us. The reason for this is that 
we can have the limit be hit at any time during the application of a mutation, 
holding multiple locks (which necessitates the complexity of the {{isBlocking}} 
mutation signal), a partial copy of the mutation already written to the 
memtable structures, and a likely expanded version of the mutation to be 
applied on heap, keeping hold of more total memory than we would if we allowed 
the operation to continue.

If, instead, we check the allocation limits _before starting_ a mutation and, 
once started, allow it to fully progress to completion, we can avoid this 
situation at the cost of being somewhat late to notice that the limit has been 
reached. This means that the limit will be breached, but this also happens as 
it stands now because we will permit operations to run to completion if the 
memtable they have been marked for is scheduled for a flush – which is 
effectively the same thing as not having noticed the memory limit would be 
breached by this mutation at the time when we decided to start it.

{quote}
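
To make the {{LongAdder}} variant concrete, here is a minimal sketch of tracking separated from limit checking. All names ({{MemoryTracker}}, {{track}}, {{hasRoom}}, {{shouldFlush}}) are hypothetical and do not correspond to the existing allocator classes; a real implementation would block or signal a flush where this sketch only returns a boolean.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch, not the existing Cassandra allocator API.
final class MemoryTracker {
    private final long limitBytes;     // assumed hard limit for the pool
    private final long flushThreshold; // assumed "filled to some ratio" trigger
    private final LongAdder used = new LongAdder();

    MemoryTracker(long limitBytes, double flushRatio) {
        this.limitBytes = limitBytes;
        this.flushThreshold = (long) (limitBytes * flushRatio);
    }

    // Hot path, called per allocation: records usage, never checks the limit.
    void track(long bytes) {
        used.add(bytes);
    }

    // Called when memory is reclaimed, e.g. after a flush.
    void release(long bytes) {
        used.add(-bytes);
    }

    long usedBytes() {
        return used.sum();
    }

    // Checked once, before a mutation starts and before any lock is taken.
    boolean hasRoom() {
        return used.sum() < limitBytes;
    }

    // Checked once, after a mutation has fully completed.
    boolean shouldFlush() {
        return used.sum() >= flushThreshold;
    }
}
```

With a limit of 100 bytes and 90 bytes in use, {{hasRoom()}} still returns true, so a mutation that then tracks 30 bytes runs to completion and leaves usage at 120. The overshoot is bounded by the mutations already in flight, and the next mutation sees {{hasRoom()}} return false before it acquires anything.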

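The other option mentioned in the quote, reporting a mutation's usage in a single {{allocate}} call after it completes, could look roughly like the sketch below. {{MutationAccounting}} and {{SharedUsage}} are hypothetical names, and the {{allocate}} method here is a stand-in for illustration, not the existing method's signature.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical per-mutation accumulator; confined to the writing thread,
// so it needs no synchronization.
final class MutationAccounting {
    private long bytes;

    // Called for each cell written; only touches thread-local state.
    void onCellAllocated(long size) {
        bytes += size;
    }

    long total() {
        return bytes;
    }
}

// Hypothetical shared counter, updated once per mutation.
final class SharedUsage {
    private final AtomicLong owned = new AtomicLong();

    // Stand-in for a single allocate call made after the mutation completes.
    // Returns the new total so the caller can decide whether to flush.
    long allocate(long size) {
        return owned.addAndGet(size);
    }

    long owned() {
        return owned.get();
    }
}
```

Under this scheme the shared counter sees one update per mutation instead of one per cell, at the cost of usage being invisible to other threads until the mutation finishes.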


--
This message was sent by Atlassian Jira
(v8.20.10#820010)
