On Thu, May 15, 2014 at 10:17 AM, Redmumba <redmu...@gmail.com> wrote:

> Is this possible to do safely?  The data in the oldest sstable is always
> guaranteed to be the oldest data, so that is not my concern--my main
> concern is whether or not we can even do this, and also how we can notify
> Cassandra that an sstable has been removed underneath it.
>
> tl;dr: Can I routinely remove the oldest sstable to free up disk space,
> without causing stability drops in Cassandra?
>

tl;dr: no.

There is no mechanism for telling a running Cassandra process that you no
longer consider an SSTable to be "live". It would probably be fairly trivial
to add a JMX call that did this, but I presume the project would not merge
it, especially because the SSTable would be marked "live" again if you
restarted, until/unless CASSANDRA-6756 [1] is resolved in some way.

There are also likely cases where brute-force removal of the data in the
oldest SSTable file (tombstones, for example) will lead to unexpected
results while querying or during compaction.

Generally, Cassandra wants to manage the SSTables in the data directory
itself. It does not want you to do so while the server is running. If you
delete an SSTable that Cassandra has an open file handle to, the file's disk
space will not actually be released until Cassandra closes that handle,
which will only occur at node shutdown or post-compaction.
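
To illustrate the open-file-handle point, here is a minimal Python sketch of
what happens on a POSIX filesystem when a file is unlinked while a process
still holds it open. It is not Cassandra code; the function name and the
"-Data.db" suffix are hypothetical stand-ins, and it assumes Linux or
another POSIX system (on Windows the unlink would normally fail instead).

```python
import os
import tempfile
from typing import Tuple


def delete_while_open(payload: bytes) -> Tuple[bool, bytes]:
    """Unlink a file while a handle to it is still open.

    Returns (path_still_exists, bytes_read_through_the_open_handle).
    """
    # Hypothetical stand-in for an SSTable data file.
    fd, path = tempfile.mkstemp(suffix="-Data.db")
    with os.fdopen(fd, "r+b") as handle:
        handle.write(payload)
        handle.flush()

        # Removes the directory entry only; the data blocks stay allocated
        # for as long as some process keeps the file open.
        os.unlink(path)
        path_still_exists = os.path.exists(path)

        # The open handle can still read everything that was written.
        handle.seek(0)
        data = handle.read()
    # Only here, once the handle is closed, can the space be reclaimed.
    return path_still_exists, data


if __name__ == "__main__":
    print(delete_while_open(b"old rows"))
```

The name disappears from the directory listing right away, but the holder of
the handle keeps reading the same bytes, which is why an rm under a running
node does not give you the disk space back.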

You could always stop the node, remove the SSTable, and restart the node.
But you are almost certainly better off using the 2.0/2.1-era features for
cases like this, which rely on TTL to drop SSTables on the floor when they
are entirely full of expired data. Another recent thread discusses some of
these features; I am not personally clear on exactly which cases like yours
they cover.
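
As a rough picture of what "entirely full of expired data" means, here is a
toy Python model. It is only a conceptual sketch: Cell, ToySSTable and
droppable are hypothetical names, not Cassandra classes, and the real server
applies additional conditions that this deliberately ignores.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cell:
    write_time: int  # seconds since epoch when the cell was written
    ttl: int         # time-to-live in seconds

    def expires_at(self) -> int:
        return self.write_time + self.ttl


@dataclass
class ToySSTable:
    name: str
    cells: List[Cell]

    def fully_expired(self, now: int) -> bool:
        # Droppable as a whole only if every single cell has expired.
        return all(cell.expires_at() <= now for cell in self.cells)


def droppable(tables: List[ToySSTable], now: int) -> List[str]:
    """Names of the toy SSTables that contain nothing but expired cells."""
    return [t.name for t in tables if t.fully_expired(now)]


if __name__ == "__main__":
    tables = [
        ToySSTable("old", [Cell(0, 100), Cell(10, 100)]),
        ToySSTable("mixed", [Cell(0, 100), Cell(500, 100)]),
    ]
    print(droppable(tables, now=200))
```

In this model a single unexpired cell keeps the whole file around, so the
approach fits best when data with similar expiry times ends up in the same
SSTable.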

=Rob
[1] https://issues.apache.org/jira/browse/CASSANDRA-6756
