On Wed, Jun 18, 2014 at 10:56 AM, Brian Tarbox <tar...@cabotresearch.com> wrote:
> I have a column family that only stores the last 5 days worth of some
> data...and yet I have files in the data directory for this CF that are 3
> weeks old.
>

Are you using TTL? If so:

https://issues.apache.org/jira/browse/CASSANDRA-6654

Are you using size-tiered or leveled compaction?

> I have six bunches of these file groups, each with a different nnnn
> value...and with timestamps of each of the last five days...plus one group
> from 3 weeks ago...which makes me wonder if that group somehow should have
> been deleted but were not.
>
> The files are tens or hundreds of gigs so deleting would be good, unless
> it's really bad!
>

Data files can't be deleted from the data dir while Cassandra is running, but
it should be fine (if probably technically unsupported) to delete them with
Cassandra stopped.

In most cases you don't want to do so, because you might un-mask deleted
rows or cause unexpected consistency characteristics. In your case, you know
that no data in files created 3 weeks ago can possibly have any value, so it
is safe to delete them.

=Rob
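
For reference, a minimal sketch of how one might list the files in a CF's data directory that are older than the retention window before deciding what to remove. It only reports files by modification time and deletes nothing. The function name `find_stale_files` and its parameters are hypothetical, and it assumes file mtime is a good enough proxy for when the data was written.

```python
import os
import time


def find_stale_files(data_dir, max_age_days, now=None):
    """Return (path, age_in_days) for regular files older than max_age_days.

    Illustrative helper only: it inspects modification times and never
    deletes anything. `now` can be passed in to make the result repeatable.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    stale = []
    for name in sorted(os.listdir(data_dir)):
        path = os.path.join(data_dir, name)
        if not os.path.isfile(path):
            continue  # skip snapshot/backup subdirectories
        mtime = os.path.getmtime(path)
        if mtime < cutoff:
            stale.append((path, (now - mtime) / 86400))
    return stale
```

Calling `find_stale_files(cf_dir, 5)` on the CF's data directory (path is whatever your install uses) would return the candidates; any actual removal would still be done by hand with Cassandra stopped, as described above.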