Re: Why does recompacting a table with no changes or tombstones result in work?

Robert Coli Mon, 20 Oct 2014 12:03:48 -0700

On Mon, Oct 20, 2014 at 10:24 AM, Redmumba <redmu...@gmail.com> wrote:


> I ran into an interesting issue--when I run compaction on a table that is
> already compacted, it still, well... compacts.  The table's TTL is set to
> 0, there are no deletes or other writes to these tables, and I confirmed
> (on disk) that there was only a single set of files.  There are no changes
> pending in memory for this table, either.
>
> Why is compaction being performed on a table that has no changes?  Are
> there other reasons for compaction to be re-run, or does the compact
> command to nodetool just blindly do what it's told?
>

The latter. The compaction process doesn't "know" that the file is already
compacted, and it doesn't "know" that there have been no writes since it was
compacted. It also doesn't know whether the file contains TTLed columns, or
deletes with timestamps that were in the future at the time of the first
compaction, which a second compaction would now turn into tombstones. Nor
does it know whether the file holds tombstones that have since aged past
gc_grace_seconds and can now be removed, etc...
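
To illustrate why the result of a compaction depends on when it runs, here
is a toy model in Python. It is only a sketch under stated assumptions, not
Cassandra's actual code: the Cell class, the compact() function and the
numbers are all made up for illustration, and real purging has further
conditions (for example, overlap with other SSTables) that are ignored here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for a column in an SSTable (times in seconds)."""
    name: str
    write_time: int
    ttl: Optional[int] = None        # None means the cell never expires
    is_tombstone: bool = False
    deleted_at: Optional[int] = None  # only set for tombstones


def compact(cells: List[Cell], now: int, gc_grace_seconds: int) -> List[Cell]:
    """Rewrite a single 'file' of cells as of time `now` (toy model)."""
    out = []
    for c in cells:
        # An expired TTLed cell is rewritten as a tombstone.
        if not c.is_tombstone and c.ttl is not None and now >= c.write_time + c.ttl:
            c = Cell(c.name, c.write_time, is_tombstone=True,
                     deleted_at=c.write_time + c.ttl)
        # A tombstone older than gc_grace_seconds is dropped entirely.
        if c.is_tombstone and now >= c.deleted_at + gc_grace_seconds:
            continue
        out.append(c)
    return out


cells = [Cell("a", write_time=0), Cell("b", write_time=0, ttl=100)]

# Same input, no new writes, but three different points in time.
first = compact(cells, now=50, gc_grace_seconds=1000)     # nothing expired yet
second = compact(first, now=150, gc_grace_seconds=1000)   # "b" has expired
third = compact(second, now=1200, gc_grace_seconds=1000)  # tombstone is past grace

for label, result in (("first", first), ("second", second), ("third", third)):
    print(label, result)
```

In this model the only way to find out whether a pass changes anything is to
read every cell and check it against the current time, which is the work the
second compaction ends up doing.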

In trunk Cassandra, SSTables are marked "repaired", which prevents them
from being repaired again. In theory, maybe (?) a similar technique could be
used for compaction, but the Cassandra dev team quite rationally does not
optimize for the case where the dataset is static...

=Rob
