Hello John,

Below are the situations where a snapshot is automatically taken on 1.2:

1. During compaction, if the snapshot_before_compaction yaml option is set
to true - in this case, the snapshot name will be <timestamp>-compact-<cf>.
2. When you drop or truncate a CF, if the auto_snapshot yaml option is set
to true (the default) - in this case, the snapshot name will be the
timestamp of the operation.
3. During nodetool scrub, unless you pass the "-no-snapshot" parameter - in
this case, the snapshot name will be pre-scrub-<timestamp>.
4. During nodetool repair (sequential, the default), unless the -par option
is specified - in this case, the snapshot name will be the UUID that
identifies the repair session (on all nodes that participate in the repair).
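
If it helps to map the directories you're seeing on disk back to these
cases, here's a minimal sketch that classifies snapshot directories by the
naming patterns above. It assumes the default 1.2 layout of
<data_dir>/<keyspace>/<cf>/snapshots/<name> and the default
/var/lib/cassandra/data path - both are assumptions, so adjust them to your
data_file_directories setting:

    #!/usr/bin/env python
    # Sketch: classify snapshot directories by the naming conventions above.
    # DATA_DIR is an assumption - point it at your data_file_directories value.
    import os
    import re

    DATA_DIR = "/var/lib/cassandra/data"  # assumption

    UUID_RE = re.compile(r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$", re.I)

    def classify(name):
        if "-compact-" in name:
            return "snapshot_before_compaction"
        if name.startswith("pre-scrub-"):
            return "nodetool scrub"
        if UUID_RE.match(name):
            return "sequential repair session"
        if name.isdigit():
            return "auto_snapshot (drop/truncate)"
        return "manual or other"

    for keyspace in sorted(os.listdir(DATA_DIR)):
        ks_path = os.path.join(DATA_DIR, keyspace)
        if not os.path.isdir(ks_path):
            continue
        for cf in sorted(os.listdir(ks_path)):
            snap_dir = os.path.join(ks_path, cf, "snapshots")
            if not os.path.isdir(snap_dir):
                continue
            for snap in sorted(os.listdir(snap_dir)):
                print("%s/%s %s -> %s" % (keyspace, cf, snap, classify(snap)))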

If a repair hangs or otherwise doesn't terminate successfully, the snapshot
will not be cleaned up and you must remove it manually.
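
To clean one up, you can use nodetool clearsnapshot with the repair session
UUID as the snapshot tag (e.g. "nodetool clearsnapshot -t
<repair-session-uuid> <keyspace>") rather than deleting the snapshot
directories by hand - but please double-check the clearsnapshot options on
your nodetool version first, since the exact -t syntax is an assumption on
my part for 1.2.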

We're working on a patch for Cassandra 2.1+ to automatically clean up repair
snapshots during the next node initialization, which should alleviate this a
bit (https://issues.apache.org/jira/browse/CASSANDRA-7357). For now, I
suggest you set up alarms to detect this anomalous situation.
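
On the alarm side, here's a minimal sketch of the kind of per-node check you
could run from cron (DATA_DIR and the threshold are assumptions - adapt them
and wire the output into whatever alerting you already use):

    #!/usr/bin/env python
    # Sketch: exit non-zero when snapshots on this node exceed a size
    # threshold, so cron/monitoring can raise an alert. Values are assumptions.
    import os
    import sys

    DATA_DIR = "/var/lib/cassandra/data"  # assumption
    THRESHOLD_GB = 10.0                   # assumption

    def dir_size(path):
        # Total size in bytes of all files under path.
        total = 0
        for root, _, files in os.walk(path):
            for f in files:
                fp = os.path.join(root, f)
                if os.path.isfile(fp):
                    total += os.path.getsize(fp)
        return total

    total_bytes = 0
    for root, dirs, _ in os.walk(DATA_DIR):
        if os.path.basename(root) == "snapshots":
            total_bytes += dir_size(root)
            dirs[:] = []  # already counted; don't descend further

    total_gb = total_bytes / (1024.0 ** 3)
    print("snapshot usage on this node: %.1f GB" % total_gb)
    if total_gb > THRESHOLD_GB:
        sys.exit(1)  # non-zero exit so the caller can alert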

Cheers,

Paulo


2015-07-16 23:36 GMT-03:00 John Wong <gokoproj...@gmail.com>:

> Hi all
>
> Quick questions.
>
> I was auditing disk usage of my cluster (cassandra 1.2.19). I found there
> was a node with 27G worth of snapshots in OpsCenter data directory. I don't
> remember doing any snapshots...
>
> I do run nodetool repair -pr every night, so they might be created by the
> repair process. But how come they are not removed? Most importantly, every
> node has different snapshot taken.... most of the nodes have one or two
> opscenter snapshots taken, but one node has a dozen total of 27G taken and
> left behind.
>
> 1. how can I identify how a snapshot was taken?
> 2. why does every node have different snapshots taken?
> 3. if snapshots are taken during repair, should they be deleted
> automatically?
>
> Thanks.
>
> John
>
