Cassandra devs,

I have a question about the implementation of PartitionUpdate.singleRowUpdate(), in particular the choice to use EncodingStats.NO_STATS when building the resulting PartitionUpdate. Is there a functional reason for that -- i.e., is it safe to modify it to use an EncodingStats built from deletionInfo, row, and staticRow?
Context: under 3.0.17, we have a table using TWCS and a secondary index. We've been having a problem with the sstables for the index lingering essentially forever, despite the correlated sstables for the parent table being removed pretty much when we expect them to.

We traced the problem to the use of EncodingStats.NO_STATS in singleRowUpdate(), which is being used to create the index updates when we write to the parent table. It appears that NO_STATS is making Cassandra think the memtables for the index have data from September 2015 in them, which in turn prevents it from dropping expired sstables (all of which are much more recent than that) for the index.

Experimentally, modifying singleRowUpdate() to build an EncodingStats from its inputs (plus the MutableDeletionInfo it creates) seems to fix the problem. We don't have any insight into why the existing logic uses NO_STATS, however, so we don't know if this change is really safe.

Does it sound like we're on the right track? (Also: I'm sure we'd be happy to open an issue and submit a patch if this sounds like it would be useful generally.)

Thanks,
SK
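
P.S. In case it helps to see the mechanism spelled out, here is a toy model of how we understand the problem. It is not Cassandra source: the class, the method names, and the exact placeholder date are hypothetical stand-ins. The only assumptions are that a memtable keeps the minimum timestamp over the updates applied to it, and that an expired sstable is dropped only if its newest timestamp is older than that minimum.

```java
import java.time.Instant;

/**
 * Toy model (hypothetical, not Cassandra code) of how a placeholder
 * minimum timestamp in a memtable can keep expired sstables alive.
 */
class ExpiredSSTableModel {
    // Assumed stand-in for the timestamp carried by NO_STATS: an instant in
    // September 2015, in microseconds. The exact day is an illustrative guess.
    static final long PLACEHOLDER_EPOCH_MICROS = micros("2015-09-22T00:00:00Z");

    /** Converts an ISO-8601 instant to microseconds since the Unix epoch. */
    static long micros(String isoInstant) {
        return Instant.parse(isoInstant).toEpochMilli() * 1000L;
    }

    /** A memtable's minimum timestamp is the min over every update merged into it. */
    static long mergeMin(long memtableMinMicros, long updateMinMicros) {
        return Math.min(memtableMinMicros, updateMinMicros);
    }

    /** An expired sstable is droppable only if all of its data predates the memtable's oldest data. */
    static boolean canDrop(long sstableMaxMicros, long memtableMinMicros) {
        return sstableMaxMicros < memtableMinMicros;
    }

    public static void main(String[] args) {
        long expiredSSTableMax = micros("2018-06-01T00:00:00Z"); // newest cell in an expired index sstable
        long actualWriteTime = micros("2018-09-01T00:00:00Z");   // timestamp of the row being indexed

        // Index update built with the placeholder stats: the memtable looks like it holds 2015 data.
        long minWithPlaceholder = mergeMin(Long.MAX_VALUE, PLACEHOLDER_EPOCH_MICROS);
        // Index update built with stats collected from the row itself.
        long minWithRealStats = mergeMin(Long.MAX_VALUE, actualWriteTime);

        System.out.println("droppable with placeholder stats: " + canDrop(expiredSSTableMax, minWithPlaceholder));
        System.out.println("droppable with collected stats:   " + canDrop(expiredSSTableMax, minWithRealStats));
    }
}
```

With the placeholder, the memtable's minimum is pinned to 2015, so no sstable written since then can ever satisfy the check. With stats taken from the row, the minimum tracks real write times and the expired sstable becomes droppable.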