There is a setting in the cassandra.yaml file that controls this.

# Whether or not a snapshot is taken of the data before keyspace truncation
# or dropping of column families. The STRONGLY advised default of true 
# should be used to provide data safety. If you set this flag to false, you will
# lose data on truncation or drop.
auto_snapshot: true
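
To see how much disk space those automatic snapshots are holding, something like the following rough sketch may help. It assumes the on-disk layout used since 1.1, <data_dir>/<keyspace>/<column_family>/snapshots/<snapshot_name>/, and the function name snapshot_usage is only illustrative, not part of any Cassandra tooling. Snapshots are hard links to SSTable files, so after a truncate removes the live files the snapshot is what keeps the space in use.

```python
import os


def snapshot_usage(data_dir):
    """Return {snapshot_directory: total_bytes} for snapshots under data_dir.

    Assumes the <keyspace>/<column_family>/snapshots/<name>/ layout.
    """
    usage = {}
    for root, dirs, _files in os.walk(data_dir):
        if os.path.basename(root) != "snapshots":
            continue
        for name in dirs:
            snap = os.path.join(root, name)
            total = 0
            # Sum the size of every file inside this snapshot directory.
            for sub, _subdirs, files in os.walk(snap):
                for fname in files:
                    total += os.path.getsize(os.path.join(sub, fname))
            usage[snap] = total
        # Snapshot contents were already counted; do not descend again.
        dirs[:] = []
    return usage
```

Calling snapshot_usage("/var/lib/cassandra/data") (the path is an assumption; use whatever data_file_directories points to) lists each snapshot with its size. If you keep auto_snapshot enabled, running "nodetool clearsnapshot" after the daily truncate should remove the snapshots and release the space.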


----- Original Message -----
From: "Víctor Hugo Oliveira Molinar" <vhmoli...@gmail.com>
To: user@cassandra.apache.org
Sent: Tuesday, March 19, 2013 11:50:35 AM
Subject: Truncate behaviour

Hello guys! 
I'm researching the behaviour of truncate operations in Cassandra. 


Reading the official wiki page (http://wiki.apache.org/cassandra/API), we can 
understand it as: 

"Removes all the rows from the given column family." 


And reading the DataStax page 
(http://www.datastax.com/docs/1.0/references/cql/TRUNCATE), we can understand it 
as: 
"A TRUNCATE statement results in the immediate, irreversible removal of all 
data in the named column family." 


But I think an important point about truncate operations is missing. 
At least in version 1.2.0, whenever I run a truncate operation, C* 
automatically creates a snapshot of the column family, resulting in 'fake' 
free disk space: the space is not actually released. 

I say 'fake free disk space' deliberately, because I only noticed it 
when the machine's disk usage was already high. 




- Is creating a snapshot of each CF before a truncate operation a safety 
behaviour of C*? 
- In my scenario I need to purge my column family data every day. 
I thought truncate could handle that, based on the docs, but it doesn't. 
Since I don't want to delete those snapshots manually, I'd like to know if 
there is a safe and practical way to perform a daily purge of this CF's data. 



Thanks in advance!
