Cassandra devs,

I have a question about the implementation of
PartitionUpdate.singleRowUpdate(), in particular the choice to use
EncodingStats.NO_STATS when building the resulting PartitionUpdate.  Is
there a functional reason for that -- i.e., is it safe to modify it to
use an EncodingStats built from deletionInfo, row, and staticRow?

Context: under 3.0.17, we have a table using TWCS and a secondary index.
We've been having a problem with the sstables for the index lingering
essentially forever, even though the corresponding sstables for the
parent table are removed about when we expect them to be.  We traced the
problem to the use of EncodingStats.NO_STATS in singleRowUpdate(), which
is being used to create the index updates when we write to the parent
table.  It appears that NO_STATS is making Cassandra think the memtables
for the index have data from September 2015 in them, which in turn
prevents it from dropping expired sstables (all of which are much more
recent than that) for the index.
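
To make the suspected mechanism concrete, here is a simplified model of
it in Java.  This is not Cassandra's code: the class and method names
are hypothetical, the exact placeholder date is an assumption (we only
know it falls in September 2015), and the real expiry check looks at
more than this.  It only illustrates how one placeholder timestamp in a
memtable can block dropping much newer sstables.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.List;

public class ExpiredSSTableSketch {
    // ASSUMPTION: stand-in for the placeholder minimum timestamp carried by
    // NO_STATS, in microseconds. We only know it is a date in September 2015.
    public static final long PLACEHOLDER_MIN_TIMESTAMP =
        Instant.parse("2015-09-22T00:00:00Z").toEpochMilli() * 1000L;

    // Hypothetical helper: the minimum timestamp a memtable reports is the
    // smallest minimum timestamp among the updates it has absorbed.
    public static long memtableMinTimestamp(List<Long> updateMinTimestamps) {
        long min = Long.MAX_VALUE;
        for (long ts : updateMinTimestamps) {
            min = Math.min(min, ts);
        }
        return min;
    }

    // Hypothetical, simplified check: an expired sstable can be dropped whole
    // only if all of its data is older than anything the memtable may hold.
    public static boolean canDropExpired(long sstableMaxTimestamp, long memtableMinTimestamp) {
        return sstableMaxTimestamp < memtableMinTimestamp;
    }

    private static long micros(String iso) {
        return Instant.parse(iso).toEpochMilli() * 1000L;
    }

    public static void main(String[] args) {
        long sstableMax = micros("2018-10-01T00:00:00Z"); // newest data in an expired sstable
        long realWrite = micros("2018-10-20T00:00:00Z");  // timestamp of a current write

        // Update reports its real timestamp: the old sstable can go.
        long accurate = memtableMinTimestamp(Arrays.asList(realWrite));
        System.out.println("accurate stats, droppable: " + canDropExpired(sstableMax, accurate));

        // Update reports the placeholder: the memtable looks older than the sstable.
        long placeholder = memtableMinTimestamp(Arrays.asList(PLACEHOLDER_MIN_TIMESTAMP));
        System.out.println("placeholder stats, droppable: " + canDropExpired(sstableMax, placeholder));
    }
}
```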

Experimentally, modifying singleRowUpdate() to build an EncodingStats
from its inputs (plus the MutableDeletionInfo it creates) seems to fix
the problem.  We don't have any insight into why the existing logic uses
NO_STATS, however, so we don't know if this change is really safe.  Does
it sound like we're on the right track?  (Also: I'm sure we'd be happy
to open an issue and submit a patch if this sounds like it would be
useful generally.)
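
For reference, the shape of the change is roughly the following.  This
is a self-contained sketch, not our patch and not Cassandra's API: the
types, the sentinel value, and the method name are all hypothetical, and
it tracks only the minimum timestamp (the real stats also cover deletion
times and TTLs).  The idea is to take the minimum over the partition
deletion, the row, and the static row, and fall back to the placeholder
only when none of them carries a timestamp.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.List;

public class StatsFromInputsSketch {
    // Hypothetical sentinel meaning "this input has no timestamp".
    public static final long NO_TIMESTAMP = Long.MIN_VALUE;

    // ASSUMPTION: stand-in for the placeholder used when nothing was collected.
    public static final long PLACEHOLDER_MIN_TIMESTAMP =
        Instant.parse("2015-09-22T00:00:00Z").toEpochMilli() * 1000L;

    // Collect the minimum timestamp across a partition-level deletion and the
    // timestamps found in the row and the static row, skipping unset values.
    public static long collectMinTimestamp(long partitionDeletionTimestamp,
                                           List<Long> rowTimestamps,
                                           List<Long> staticRowTimestamps) {
        long min = Long.MAX_VALUE;
        boolean seen = false;

        if (partitionDeletionTimestamp != NO_TIMESTAMP) {
            min = Math.min(min, partitionDeletionTimestamp);
            seen = true;
        }
        for (List<Long> timestamps : Arrays.asList(rowTimestamps, staticRowTimestamps)) {
            for (long ts : timestamps) {
                if (ts == NO_TIMESTAMP) {
                    continue; // e.g. a row without liveness info
                }
                min = Math.min(min, ts);
                seen = true;
            }
        }
        // Only fall back to the placeholder when no input had a timestamp.
        return seen ? min : PLACEHOLDER_MIN_TIMESTAMP;
    }

    public static void main(String[] args) {
        long fromInputs = collectMinTimestamp(
            NO_TIMESTAMP, Arrays.asList(5_000L, 3_000L), Arrays.<Long>asList());
        System.out.println("min timestamp from inputs: " + fromInputs);
    }
}
```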

Thanks,
SK