To free up disk space, you might also check occasionally for snapshots that 
you no longer need.  Certain operations create snapshots (point-in-time 
backups of SSTables) in the 
/var/lib/cassandra/data/<keyspace_name>/snapshots directory (the default 
location).

If you are absolutely sure that you no longer need a particular snapshot of the 
sstables, you can reclaim a decent amount of space that way.

I haven't followed all of the other GC discussion in this thread, but that's 
one way to reclaim some space.
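
To see whether snapshots are worth cleaning up, here is a rough sketch that 
adds up the file sizes under every "snapshots" directory below the data 
directory. The function name snapshot_usage is made up for illustration, and 
the default path assumes the stock data directory mentioned above. Snapshots 
are hard links to SSTable files, so the totals are an upper bound: deleting a 
snapshot frees space only for files that the live data set no longer 
references.

```python
import os


def snapshot_usage(data_dir="/var/lib/cassandra/data"):
    """Return {snapshots_dir: total_bytes} for each 'snapshots' dir under data_dir.

    data_dir is an assumption (the default location); adjust it to match
    the data directory configured for your node.
    """
    usage = {}
    for root, dirs, _files in os.walk(data_dir):
        if os.path.basename(root) == "snapshots":
            total = 0
            # Sum every file in every snapshot below this directory.
            for sub_root, _sub_dirs, files in os.walk(root):
                for name in files:
                    total += os.lstat(os.path.join(sub_root, name)).st_size
            usage[root] = total
            dirs[:] = []  # already counted; do not descend again
    return usage


if __name__ == "__main__":
    for path, size in sorted(snapshot_usage().items()):
        print(f"{size / 1024 ** 2:10.1f} MiB  {path}")
```

If the totals are large and you are certain the snapshots are not needed, 
nodetool clearsnapshot is the usual way to remove them, rather than deleting 
the files by hand.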

On May 26, 2011, at 1:09 PM, Konstantin Naryshkin wrote:

> I have a basic understanding of how Cassandra handles the file system 
> (flushes in Memtables out to SSTables, SSTables get compacted) and I 
> understand that old files are only deleted when a node is restarted, when 
> Java does a GC, or when Cassandra feels like it is running out of space.
> 
> My question is, is there some way for us to hurry the process along? We have 
> data that we do a lot of inserts into and then delete several 
> hours later. We would like it if we could free up disk space (since our 
> disks, though large, are shared with other applications). So far, the action 
> sequence to accomplish this is:
> nodetool flush -> nodetool repair -> nodetool compact -> ??
> 
> Is there a way for me to make (or even gently suggest to) Cassandra that it 
> may be a good time to free up some space?
