Hello John,

Below are the situations where a snapshot is automatically taken on 1.2:
1. During compactions, if the snapshot_before_compaction yaml option is set to true. In this case the snapshot name will be <timestamp>-compact-<cf>.

2. When you drop or truncate a CF, if the auto_snapshot yaml option is set to true (the default). In this case the snapshot name will be the timestamp of the operation.

3. nodetool scrub, unless you pass the "-no-snapshot" parameter. In this case the snapshot name will be pre-scrub-<timestamp>.

4. nodetool repair (sequential, the default), unless the -par option is specified. In this case the snapshot name will be the UUID that identifies the repair session, on every node that participates in the repair.

If a repair hangs, or for some reason doesn't terminate successfully, the snapshot will not be cleaned up and you must remove it manually. We're working on a patch for Cassandra 2.1+ to automatically clean up repair snapshots during the next node initialization, which should alleviate this a bit (https://issues.apache.org/jira/browse/CASSANDRA-7357). Right now I suggest you set up alarms to detect this anomalous situation; there is a rough sketch of one after your quoted message below, along with examples of the yaml options and of how to list and clear snapshots by tag.

Cheers,

Paulo

2015-07-16 23:36 GMT-03:00 John Wong <gokoproj...@gmail.com>:

> Hi all
>
> Quick questions.
>
> I was auditing disk usage of my cluster (cassandra 1.2.19). I found there
> was a node with 27G worth of snapshots in the OpsCenter data directory. I
> don't remember doing any snapshots...
>
> I do run nodetool repair -pr every night, so they might be created by the
> repair process. But how come they are not removed? Most importantly, every
> node has different snapshots taken... most of the nodes have one or two
> OpsCenter snapshots, but one node has a dozen totaling 27G, taken and left
> behind.
>
> 1. How can I identify how a snapshot was taken?
> 2. Why does every node have different snapshots taken?
> 3. If snapshots are taken during repair, should they be deleted
> automatically?
>
> Thanks.
>
> John
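P.S. A few concrete examples to go with the above (paths, thresholds and addresses below are placeholders, not values read from your cluster). The two yaml options live in cassandra.yaml; these are the 1.2 defaults as far as I recall, so double-check your own file:

    # cassandra.yaml -- 1.2 defaults as far as I recall, verify on your nodes
    auto_snapshot: true                 # snapshot a CF before drop/truncate
    snapshot_before_compaction: false   # snapshot every CF before each compaction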
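On your question 1: the snapshot tag is the last directory component under each CF's snapshots/ directory, so the naming conventions above tell you which operation created it (a bare timestamp, <timestamp>-compact-<cf>, pre-scrub-<timestamp>, or a repair session UUID). Assuming the default data directory /var/lib/cassandra/data (adjust for your install), something like this lists every snapshot with its on-disk size:

    # list each snapshot tag directory together with its size
    find /var/lib/cassandra/data -type d -path '*/snapshots/*' -prune -exec du -sh {} \;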
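Once you know a tag is stale (for example the UUID of a repair that never finished), clearsnapshot removes it; if I remember right, the -t option works this way on 1.2 as well:

    # remove every snapshot with the given tag (<snapshot-tag> is a placeholder)
    nodetool clearsnapshot -t <snapshot-tag>

    # or wipe all snapshots on the node
    nodetool clearsnapshot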
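And a very rough sketch of the alarm I mentioned, suitable for cron. The 10 GB threshold and the ops@example.com address are made up, use whatever fits your nodes:

    #!/bin/sh
    # alert when leftover snapshots on this node exceed a threshold
    THRESHOLD_KB=$((10 * 1024 * 1024))   # 10 GB expressed in KB (placeholder)
    USED_KB=$(find /var/lib/cassandra/data -type d -path '*/snapshots/*' -prune \
        -exec du -sk {} \; | awk '{sum += $1} END {print sum + 0}')
    if [ "$USED_KB" -gt "$THRESHOLD_KB" ]; then
        echo "snapshots are using ${USED_KB} KB on $(hostname)" \
            | mail -s "cassandra snapshot disk alarm" ops@example.com
    fi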