[ 
https://issues.apache.org/jira/browse/CASSANDRA-21018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tobias Bohn updated CASSANDRA-21018:
------------------------------------
    Description: 
h2. *Summary*

In Apache Cassandra *4.1.10*, an {{AssertionError}} occurs reproducibly when 
executing a {{DELETE}} statement that produces a *range tombstone on the 
clustering key*.
The failure happens in the write path while the mutation is applied on 
replicas, specifically in {{ByteBufferCloner.clone()}}, invoked from 
{{RangeTombstoneList.clone()}}.

The problem persists:
 * after {{nodetool scrub}} on all affected tables,
 * after deleting all hints,
 * even after completely *dropping and recreating* the keyspaces and tables.

It only appears when executing a 
{{DELETE ... WHERE <partition key> = ? AND <clustering key> >= ?}}.
Removing this DELETE from the workload makes the issue disappear completely, 
even under heavy concurrent writes.

This suggests a *bug in the handling of range tombstones* in the write path of 
Cassandra 4.1.x.
h2. *Environment*
 * Cassandra version: *4.1.10*
 * 3-node cluster, RF = 3
 * No hints pending
 * Reproducible with clean keyspace & empty datasets
 * Client performs parallel INSERTs and one DELETE with a clustering key range

h2. *Table Schema*
{code:sql}
CREATE TABLE kkav3.ranked_kka_products (
  left text,
  rank int,
  kka_factor double,
  kka_factor_boosted double,
  left_factor double,
  left_factor_boosted double,
  left_picks int,
  right text,
  right_factor double,
  right_factor_boosted double,
  right_picks int,
  shared_picks int,
  shared_picks_boosted int,
  calculation_period int,
  last_update bigint,
  PRIMARY KEY ((left), rank)
) WITH CLUSTERING ORDER BY (rank ASC);
{code}
h2. *Queries that trigger the bug*

Works fine alone:
{code:sql}
INSERT INTO kkav3.ranked_kka_products (...) VALUES (...);{code}
Triggers the bug (range tombstone on clustering key):
{code:sql}
DELETE FROM kkav3.ranked_kka_products WHERE left = ? AND rank >= ?;{code}
As soon as multiple clients insert rows and execute the above DELETE 
concurrently, Cassandra nodes throw an AssertionError inside the mutation 
processing path.
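
The workload described above can be sketched roughly as follows. This is a minimal sketch, not the reporter's client: the column subset in the INSERT, the row counts, the {{run_workload}} helper and the {{RecordingSession}} stand-in are all assumptions made for illustration, and it has not been verified that this exact pattern triggers the assertion. The {{%s}} placeholders are the form the DataStax Python driver uses for non-prepared statements.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Statements modelled on the report. The INSERT column list is an assumption:
# the report elides it, so a small subset of the schema's columns is used here.
INSERT_CQL = (
    "INSERT INTO kkav3.ranked_kka_products (left, rank, right, last_update) "
    "VALUES (%s, %s, %s, %s)"
)
DELETE_CQL = (
    "DELETE FROM kkav3.ranked_kka_products WHERE left = %s AND rank >= %s"
)


class RecordingSession:
    """Hypothetical stand-in for a driver session: records calls, sends nothing."""

    def __init__(self):
        self.calls = []
        self._lock = threading.Lock()

    def execute(self, query, parameters=None):
        with self._lock:
            self.calls.append((query, parameters))


def run_workload(session, partitions, rows_per_partition=50, keep=10, workers=8):
    """Insert rows per partition, then range-delete every rank >= keep.

    Partitions are processed concurrently, so INSERTs and range DELETEs from
    different workers interleave on the cluster.
    """

    def one_partition(left):
        for rank in range(rows_per_partition):
            session.execute(INSERT_CQL, (left, rank, f"item-{rank}", 0))
        # This statement is the one that produces the range tombstone.
        session.execute(DELETE_CQL, (left, keep))
        return left

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(one_partition, partitions))


if __name__ == "__main__":
    dry_run = RecordingSession()
    run_workload(dry_run, ["a", "b", "c"], rows_per_partition=5, keep=2)
    print(len(dry_run.calls), "statements recorded")
```

To run it against a real cluster, a session obtained from the driver (for example via {{cassandra.cluster.Cluster(...).connect()}}) could be passed in place of {{RecordingSession}}, since only {{execute(query, parameters)}} is used.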
h2. *Stacktrace*
{code:java}
ERROR [MutationStage-6] 2025-11-13 11:48:24,335 JVMStabilityInspector.java:68 - Exception in thread Thread[MutationStage-6,10,SharedPool]
java.lang.RuntimeException: java.lang.AssertionError
    at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:108)
    at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
    at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430)
    at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:142)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.lang.AssertionError: null
    at org.apache.cassandra.utils.memory.ByteBufferCloner.clone(ByteBufferCloner.java:73)
    at org.apache.cassandra.db.RangeTombstoneList.clone(RangeTombstoneList.java:131)
    at org.apache.cassandra.db.RangeTombstoneList.clone(RangeTombstoneList.java:120)
    at org.apache.cassandra.db.MutableDeletionInfo.clone(MutableDeletionInfo.java:91)
    at org.apache.cassandra.db.MutableDeletionInfo.clone(MutableDeletionInfo.java:33)
    at org.apache.cassandra.db.partitions.AtomicBTreePartition.addAllWithSizeDeltaInternal(AtomicBTreePartition.java:132)
    at org.apache.cassandra.db.partitions.AtomicBTreePartition.addAllWithSizeDelta(AtomicBTreePartition.java:192)
    at org.apache.cassandra.db.memtable.SkipListMemtable.put(SkipListMemtable.java:135)
    at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1424)
    at org.apache.cassandra.db.CassandraTableWriteHandler.write(CassandraTableWriteHandler.java:40)
    at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:672)
    at org.apache.cassandra.db.Keyspace.applyFuture(Keyspace.java:489)
    at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:228)
    at org.apache.cassandra.hints.Hint.applyFuture(Hint.java:109)
    at org.apache.cassandra.hints.HintVerbHandler.doVerb(HintVerbHandler.java:116)
    at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
    at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
    ... 6 common frames omitted
{code}
 
h2. *Observed Behavior*
 * Node logs show *AssertionError* during mutation apply.
 * Tracing reports failing mutations with {{FAILURE_RSP}}.
 * Only operations involving the range delete ({{rank >= ?}}) cause the 
assertion.
 * Removing the DELETE eliminates the issue entirely.

h2. *Expected Behavior*
 * Range deletes on clustering key ({{DELETE ... AND rank >= ?}}) should be 
processed safely.
 * No assertion failures during normal write path operations.
 * Tombstone handling in {{RangeTombstoneList.clone()}} should not produce null 
buffer conditions.
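
For reference, the intended effect of such a range delete can be illustrated with a toy model. This is only a sketch of the semantics under simplifying assumptions (a single partition, integer timestamps, one open-ended range per tombstone); the {{RangeTombstone}} class and {{visible_rows}} helper are hypothetical names and do not mirror Cassandra's internal {{RangeTombstoneList}}.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeTombstone:
    """Toy model of the tombstone written by DELETE ... AND rank >= ?."""

    start_rank: int  # inclusive lower bound taken from "rank >= ?"
    deleted_at: int  # write timestamp of the DELETE

    def shadows(self, rank, written_at):
        # A row is hidden when it lies inside the range and was not written
        # after the deletion (on equal timestamps the deletion wins).
        return rank >= self.start_rank and written_at <= self.deleted_at


def visible_rows(rows, tombstones):
    """Return the ranks still readable; rows maps rank -> write timestamp."""
    return sorted(
        rank
        for rank, written_at in rows.items()
        if not any(t.shadows(rank, written_at) for t in tombstones)
    )


if __name__ == "__main__":
    rows = {0: 10, 1: 10, 2: 10, 3: 30}
    print(visible_rows(rows, [RangeTombstone(start_rank=2, deleted_at=20)]))
```

In this model, rank 2 is hidden because it was written before the delete, while rank 3 stays visible because it was written afterwards; rows below the bound are never affected.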


> AssertionError in ByteBufferCloner.clone due to Range Tombstone on clustering 
> key DELETE (4.1.10)
> -------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-21018
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21018
>             Project: Apache Cassandra
>          Issue Type: Bug
>            Reporter: Tobias Bohn
>            Priority: Normal


