[ 
https://issues.apache.org/jira/browse/CASSANDRA-20646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abe Ratnofsky updated CASSANDRA-20646:
--------------------------------------
    Description: 
During repair validation and streaming, persistent memtables are not flushed 
but instead produce untracked SSTables that only exist in the context of that 
repair session. In 
org.apache.cassandra.db.ColumnFamilyStore#writeMemtableRanges, there is 
currently no synchronization of memtable contents with ongoing writes, which 
could cause a streamed SSTable to contain a torn view of a memtable’s data and 
metadata. For example, a new column could be present in the data but missing 
from the ColumnsCollector metadata, which would make the SSTable effectively 
invalid.

This impacts Mutation Tracking because an SSTable’s CoordinatorLogBoundaries 
must be representative of all MutationIds included in the SSTable, and if a 
persistent memtable streams an SSTable with torn data and metadata, this 
invariant may be broken.

To fix this, enforce an OpOrder as flush does, waiting for all pending writes 
to complete before calling getFlushSet.
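
As a rough illustration of the ordering described above, here is a minimal, self-contained sketch. It does not use Cassandra's OpOrder class or writeMemtableRanges; SimpleOpOrder, Snapshot, snapshotWithBarrier and the column name "new_col" are hypothetical stand-ins invented for this example. A writer updates data and metadata in two separate steps, and the snapshot path issues a barrier and waits for that in-flight write before reading. Writes that begin after the barrier are out of scope for this sketch and would need separate handling.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.CountDownLatch;

public class OpOrderSketch {

    /** Hypothetical, simplified stand-in for an operation-ordering primitive. */
    static final class SimpleOpOrder {
        private long epoch = 0;
        private final Map<Long, Integer> inFlight = new HashMap<>();

        /** Registers a write in the current epoch and returns that epoch. */
        synchronized long start() {
            inFlight.merge(epoch, 1, Integer::sum);
            return epoch;
        }

        /** Marks a write previously registered via start() as finished. */
        synchronized void finish(long startedIn) {
            int remaining = inFlight.get(startedIn) - 1;
            if (remaining == 0) inFlight.remove(startedIn);
            else inFlight.put(startedIn, remaining);
            notifyAll();
        }

        /** Closes the current epoch; writes started afterwards are not waited on. */
        synchronized long issueBarrier() {
            return epoch++;
        }

        /** Blocks until every write started at or before the barrier epoch is done. */
        synchronized void await(long barrierEpoch) throws InterruptedException {
            while (inFlight.keySet().stream().anyMatch(e -> e <= barrierEpoch)) {
                wait();
            }
        }
    }

    /** Immutable copy of the toy memtable's data and metadata columns. */
    static final class Snapshot {
        final Set<String> dataColumns;
        final Set<String> metadataColumns;

        Snapshot(Set<String> dataColumns, Set<String> metadataColumns) {
            this.dataColumns = dataColumns;
            this.metadataColumns = metadataColumns;
        }
    }

    /** Starts a two-step write, then snapshots only after the barrier has cleared. */
    static Snapshot snapshotWithBarrier() {
        SimpleOpOrder order = new SimpleOpOrder();
        Set<String> data = new ConcurrentSkipListSet<>();
        Set<String> meta = new ConcurrentSkipListSet<>();
        CountDownLatch dataWritten = new CountDownLatch(1);

        Thread writer = new Thread(() -> {
            long group = order.start();
            try {
                data.add("new_col");      // step 1: column appears in the data
                dataWritten.countDown();  // the snapshot may begin now, mid-write
                Thread.sleep(50);         // widen the window between the two steps
                meta.add("new_col");      // step 2: column recorded in the metadata
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                order.finish(group);
            }
        });
        writer.start();

        try {
            dataWritten.await();                  // the write is now in flight
            long barrier = order.issueBarrier();
            order.await(barrier);                 // wait for the pending write
            return new Snapshot(new TreeSet<>(data), new TreeSet<>(meta));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Snapshot s = snapshotWithBarrier();
        System.out.println("data=" + s.dataColumns + " metadata=" + s.metadataColumns);
    }
}
```

Without the issueBarrier/await pair, the copy could be taken between the two steps of the write, which is the torn view this issue describes.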

This impacts CEP-45 (Mutation Tracking), which records which mutation IDs are 
present in an SSTable via StatsMetadata in order to determine when the SSTable 
can be marked as repaired. Enabling mutation tracking together with persistent 
memtables is currently rejected, because repair may stream an effectively 
corrupted SSTable due to the lack of synchronization here.

  was:
During repair validation and streaming, persistent memtables are not flushed 
but instead produce untracked SSTables that only exist in the context of that 
repair session. In 
org.apache.cassandra.db.ColumnFamilyStore#writeMemtableRanges, there is 
currently no synchronization of memtable contents with ongoing writes, which 
could cause a streamed SSTable to contain a torn view of a memtable’s data and 
metadata. For example, a new column could be present in data but not added to 
ColumnsCollector metadata, which would essentially be an invalid SSTable.

This impacts Mutation Tracking because an SSTable’s CoordinatorLogBoundaries 
must be representative of all MutationIds included in the SSTable, and if a 
persistent memtable streams an SSTable with torn data and metadata, this 
invariant may be broken.

To fix, enforce an OpOrder like flush does, waiting for all pending writes to 
complete before calling getFlushSet.


> Persistent memtables do not enforce OpOrder during repair
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-20646
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20646
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Local/Memtable
>            Reporter: Abe Ratnofsky
>            Priority: Normal



