[jira] [Created] (CASSANDRA-20878) Improve Accord Observability

Benedict Elliott Smith (Jira) Tue, 02 Sep 2025 10:43:27 -0700

Benedict Elliott Smith created CASSANDRA-20878:
--------------------------------------------------


             Summary: Improve Accord Observability
                 Key: CASSANDRA-20878
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20878
             Project: Apache Cassandra
          Issue Type: Improvement
          Components: Accord
            Reporter: Benedict Elliott Smith


    Improve Observability:
     - Track all active Coordinations
     - Refactor Replica/Coordinator metrics and report Coordinator 
exhausted/preempted/timeout
     - DurabilityQueue metrics and visibility
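
As a rough illustration of the first two items, here is a minimal sketch of a registry that tracks in-flight coordinations and counts how each one ended. Every name in it (CoordinationRegistrySketch, Outcome, the String ids) is hypothetical and chosen for illustration; it is not taken from the Accord code base and only assumes that coordinations have an id and a terminal outcome.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Illustrative only: hypothetical names, not Accord's actual classes. */
final class CoordinationRegistrySketch
{
    /** Terminal outcomes; the last three mirror the ones named in the ticket. */
    enum Outcome { SUCCESS, EXHAUSTED, PREEMPTED, TIMEOUT }

    // Hypothetical coordination id -> start time in nanoseconds.
    private final Map<String, Long> active = new ConcurrentHashMap<>();
    // Fully populated in the constructor and only read afterwards.
    private final Map<Outcome, LongAdder> outcomes = new EnumMap<>(Outcome.class);

    CoordinationRegistrySketch()
    {
        for (Outcome outcome : Outcome.values())
            outcomes.put(outcome, new LongAdder());
    }

    /** Register a coordination as in flight. */
    void start(String id)
    {
        active.put(id, System.nanoTime());
    }

    /** Record the outcome once; finishing an unknown or already finished id is ignored. */
    void finish(String id, Outcome outcome)
    {
        if (active.remove(id) != null)
            outcomes.get(outcome).increment();
    }

    int activeCount()
    {
        return active.size();
    }

    long count(Outcome outcome)
    {
        return outcomes.get(outcome).sum();
    }
}
```

Removing the entry before incrementing means a coordination that reports twice is only counted once, which keeps the per-outcome counters consistent with the number of coordinations started.
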
    Also Fix:
     - WaitingState can cause a distributed stall when asked to wait for 
CanApply if not yet PreCommitted; track a separate querying state and advance 
this to the next achievable state rather than the desired final state
     - Stalled coordinators should not prevent recovery
     - Edge case with fetch unable to make progress when pre-bootstrap and all 
peers have GC'd
     - Dependency initialisation for sync points across certain ownership 
changes
     - SyncPoint propagation may not include all of the epochs required on the 
receiving node for ranges it has lost but not closed, and the receiving node 
does not validate them
     - Stable tracker accounting with LocalExecute
     - Do not prune non-durable APPLIED, as it must be reported in dependencies 
until durably applied (so as not to break recovery)
     - Ensure we cannot race with replies when initiating Coordination
     - ProgressLog does not guarantee to clear home or waiting states when 
erased or invalidated by compaction
     - WaitingState on non-home shard cannot guarantee progress once home shard 
is Erased
     - WaitingOnSync handles retired ranges incorrectly
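
The WaitingState item above describes querying for the next achievable state instead of the desired final state. A minimal sketch of that idea follows, assuming progress can be modelled as a simple ordered ladder and that "next achievable" means one step beyond what is already known. The Progress enum and nextQueryTarget are hypothetical simplifications, not Accord's real state machine.

```java
/** Illustrative only: a hypothetical, simplified ladder of states. */
final class QueryingStateSketch
{
    /** Ordered from least to most advanced; the real set of states is richer. */
    enum Progress { NOT_PRE_COMMITTED, PRE_COMMITTED, STABLE, CAN_APPLY }

    /**
     * Choose the state to query peers for next: one step past what is known
     * locally, capped at the state the caller ultimately wants.
     */
    static Progress nextQueryTarget(Progress known, Progress desired)
    {
        if (known.compareTo(desired) >= 0)
            return desired; // nothing further to wait for
        return Progress.values()[known.ordinal() + 1];
    }
}
```

Under these assumptions, a caller that wants CAN_APPLY while only NOT_PRE_COMMITTED is known would first query for PRE_COMMITTED, so each query targets something that can be satisfied without first reaching the final state.
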
    Also Improve:
     - Standardise failure accounting, use null to represent single reply 
timeouts
     - BurnTest record/replay to/from file
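
One way to read "use null to represent single reply timeouts" is sketched below: a per-coordination tally whose failure callback takes a Throwable, with null standing for a reply that timed out. FailureTallySketch and its methods are hypothetical names introduced for illustration; the actual accounting in Accord may differ.

```java
/** Illustrative only: hypothetical names, not Accord's actual classes. */
final class FailureTallySketch
{
    private int timeouts;
    private int otherFailures;

    /** Record one reply-level failure; null is taken to mean the reply timed out. */
    void onFailure(Throwable failure)
    {
        if (failure == null)
            timeouts++;
        else
            otherFailures++;
    }

    int timeouts()
    {
        return timeouts;
    }

    int otherFailures()
    {
        return otherFailures;
    }

    int total()
    {
        return timeouts + otherFailures;
    }
}
```

Treating null as the timeout marker avoids allocating an exception for every timed-out reply, at the cost of requiring every consumer of the callback to handle null explicitly.
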



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

