Benedict Elliott Smith created CASSANDRA-19870:
--------------------------------------------------

             Summary: Accord: DefaultProgressLog
                 Key: CASSANDRA-19870
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19870
             Project: Cassandra
          Issue Type: Improvement
          Components: Accord
            Reporter: Benedict Elliott Smith


Redesign progress mechanisms to be memory efficient, use fewer messages and to 
resolve dependency chains promptly.

The SimpleProgressLog had a number of problems:

1. It polled for progress with no attempt to determine whether progress could 
realistically be made, so:
        - as the number of pending transactions grew, the proportion of useful 
work dropped (as many would be unable to make progress without earlier 
transactions completing)
        - each transaction in the chain could recover only on average 1/2 poll 
interval behind the last transaction to complete
2. It requested full transaction state from every replica on each attempt
3. It maintained a lot of in-memory state
4. Polling happened en-masse, allowing for little per-transaction control
We also separately maintained fairly expensive per-command listener state that 
negatively affected our command loading and caching.

The new DefaultProgressLog makes use of several new features: LocalListeners, 
RemoteListeners, Timers and Await messages.
- LocalListeners provide a memory-efficient collection for managing each 
CommandStore’s transaction listeners, with dedicated record keeping for 
inter-transaction relationships.
- RemoteListeners provide a mechanism for request/response pairs that may be 
separated by longer than the normal Cassandra message timeout, and require 
minimal state on sender and recipient. This permits replicas to cheaply update 
their local state machine as soon as distributed information becomes available.

The DefaultProgressLog tracks each transaction with separate timers to handle 
per-transaction scheduling, backoff etc, and a succinct state machine. To 
reduce overhead correspondence is preferentially limited to a handful of 
replicas, and limited to the home shard where appropriate.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to