On Thu, May 15, 2014 at 10:17 AM, Redmumba <redmu...@gmail.com> wrote:
> Is this possible to do safely? The data in the oldest sstable is always
> guaranteed to be the oldest data, so that is not my concern--my main
> concern is whether or not we can even do this, and also how we can notify
> Cassandra that an sstable has been removed underneath it.
>
> tl;dr: Can I routinely remove the oldest sstable to free up disk space,
> without causing stability drops in Cassandra?
>

tl;dr : no.

There is no mechanism by which to inform a running Cassandra process that you consider an SSTable to no longer be "live". It would probably be pretty trivial to add a JMX call which did this, but I presume the project would not merge it, especially because the SSTable would be marked "live" again if you restarted, until/unless CASSANDRA-6756 [1] is resolved in some way.

There are also likely cases where brute-force removal of data in the oldest SSTable file (tombstones, for example) will lead to unexpected results while querying or during compaction.

Generally, Cassandra wants to manage SSTables in the data directory. It does not want you to do so while the server is running.

If you delete an SSTable which Cassandra has an open file handle to, it will not actually be deleted (and its disk space will not be freed) until Cassandra no longer has an open file handle to it, which will only occur at node shutdown or post-compaction.

You could always stop the node, remove the SSTable, and restart the node. But you are almost certainly better off using the 2.0/2.1-era features for cases like this, which rely on TTL to drop SSTables on the floor when they are entirely full of expired data. There's another recent thread which discusses some of these features; I am not personally clear on exactly which cases like yours they cover.

=Rob

[1] https://issues.apache.org/jira/browse/CASSANDRA-6756
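
P.S. To illustrate the open-file-handle point above, here is a minimal Python sketch, assuming a POSIX filesystem (e.g. Linux). It has nothing to do with Cassandra's own code; the function name `unlink_while_open` and the temp file are made up for illustration. It only shows that unlinking a file which a process still has open removes the directory entry, while the data stays reachable (and on disk) until the handle is closed.

```python
import os
import tempfile


def unlink_while_open(payload=b"pretend sstable bytes"):
    """Delete a still-open file and report what the process can see afterwards.

    Assumes POSIX unlink semantics; on Windows the unlink would normally fail.
    """
    fd, path = tempfile.mkstemp()  # hypothetical stand-in for an SSTable file
    try:
        os.write(fd, payload)
        os.unlink(path)  # removes the name from the directory only
        gone_from_directory = not os.path.exists(path)
        os.lseek(fd, 0, os.SEEK_SET)
        data_via_handle = os.read(fd, len(payload))  # still readable via the fd
    finally:
        os.close(fd)  # only after this can the kernel reclaim the blocks
    return gone_from_directory, data_via_handle


if __name__ == "__main__":
    gone, data = unlink_while_open()
    print("gone from directory:", gone)
    print("still readable through open handle:", data == b"pretend sstable bytes")
```

This is why an `rm` underneath a running node does not give you the disk space back right away: the name is gone, but the blocks are held until the process lets go of the file.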