I believe this is caused by two things (and sorry if I go into too much detail):

1) There is http://wiki.apache.org/cassandra/FAQ#i_deleted_what_gives. That is, Cassandra has to wait GCGraceSeconds before physically removing deleted columns, and by default this is 10 days. For "normal" columns (not the expiring ones of the patch), this is mitigated by the fact that only a marker saying the column has been deleted is kept. That is, if you have 1K columns each holding a 50MB blob and you delete them, after the first compaction the blobs are gone but the columns are not. So you end up with your 1K columns, but they are now small. Those columns will really be removed only during a compaction that occurs at least GCGraceSeconds after the deletion. But for ExpiringColumn there is 2).

2) For expired columns, the value is not deleted until the whole column is removed. That is, even though the column no longer shows up in a request, nothing gets deleted until GCGraceSeconds after the column's expiration.

Theoretically, what is done for deleted columns could be done for expiring columns: when a column has expired, its value could be removed while the column itself is kept as a marker. However, this is technically a bit tricky. The natural place to do it would be when the column is serialized to disk. But the size of the serialized column has to be known before the actual serialization, during row indexing. So a column that expires between the indexing and its serialization would screw us up. There are ways to get around that, but they are not without drawbacks, and since I have been able to live with this so far, I've left this 'optimisation' for later. If you cannot live with that for now, feel free to let me know.

--
Sylvain

On Wed, Mar 17, 2010 at 10:28 PM, Weijun Li <weiju...@gmail.com> wrote:
> I'm testing the ExpiringColumn patch in 0.6-beta2, inserted 26GB data with
> TTL, after columns have expired I use get_slice to verify that no columns
> can be retrieved.
> When I run "nodetool compact" I think all data should be
> gone. But the problem is:
>
> 1) After the first nodetool-comact, Cassandra duplicate data files to
> data-377* and then nothing happened. Total files size become 52GB. Some 0
> bytes *.Compacted files got generated.
> 2) After the second nodetool-compact, Cassandra again generated data-378*.
> Now I got 77GB data file that contains no valid columns. (See the list at
> the end)
> 3) Now I decided to run nodetool-clean and it ended up with 50GB data files
> like:
> total 53717104
> -rw-rw-r-- 1 cassandra cassandra           0 Mar 17 17:25 data-378-Compacted
> -rw-rw-r-- 1 cassandra cassandra 25563592504 Mar 17 16:25 data-378-Data.db
> -rw-rw-r-- 1 cassandra cassandra    54326245 Mar 17 16:25 data-378-Filter.db
> -rw-rw-r-- 1 cassandra cassandra  1871937928 Mar 17 16:25 data-378-Index.db
> -rw-rw-r-- 1 cassandra cassandra 25563592504 Mar 17 17:25 data-379-Data.db
> -rw-rw-r-- 1 cassandra cassandra    27163165 Mar 17 17:25 data-379-Filter.db
> -rw-rw-r-- 1 cassandra cassandra  1871937928 Mar 17 17:25 data-379-Index.db
>
> Any idea about what's going on here? I guess cleanup will remove all columns
> and don't belong this node but compact will remove all deleted columns then
> merge small files into a big one. What exactly are the differences between
> cleanup and compact?
>
> -Weijun
>
> total 80615576
> -rw-rw-r-- 1 cassandra cassandra           0 Mar 17 15:27 data-327-Compacted
> -rw-rw-r-- 1 cassandra cassandra 21013367426 Mar 16 17:43 data-327-Data.db
> -rw-rw-r-- 1 cassandra cassandra    44660005 Mar 16 17:43 data-327-Filter.db
> -rw-rw-r-- 1 cassandra cassandra  1538760208 Mar 16 17:43 data-327-Index.db
> -rw-rw-r-- 1 cassandra cassandra           0 Mar 17 15:27 data-363-Compacted
> -rw-rw-r-- 1 cassandra cassandra  2767150915 Mar 16 17:46 data-363-Data.db
> -rw-rw-r-- 1 cassandra cassandra     5890885 Mar 16 17:46 data-363-Filter.db
> -rw-rw-r-- 1 cassandra cassandra   202590655 Mar 16 17:46 data-363-Index.db
> -rw-rw-r-- 1 cassandra cassandra           0 Mar 17 15:27 data-370-Compacted
> -rw-rw-r-- 1 cassandra cassandra  1383745492 Mar 16 17:47 data-370-Data.db
> -rw-rw-r-- 1 cassandra cassandra     2947045 Mar 16 17:47 data-370-Filter.db
> -rw-rw-r-- 1 cassandra cassandra   101350867 Mar 16 17:47 data-370-Index.db
> -rw-rw-r-- 1 cassandra cassandra           0 Mar 17 15:27 data-375-Compacted
> -rw-rw-r-- 1 cassandra cassandra   345870869 Mar 16 17:50 data-375-Data.db
> -rw-rw-r-- 1 cassandra cassandra      736405 Mar 16 17:50 data-375-Filter.db
> -rw-rw-r-- 1 cassandra cassandra    25315970 Mar 16 17:50 data-375-Index.db
> -rw-rw-r-- 1 cassandra cassandra           0 Mar 17 15:27 data-376-Compacted
> -rw-rw-r-- 1 cassandra cassandra    53457802 Mar 16 18:52 data-376-Data.db
> -rw-rw-r-- 1 cassandra cassandra      113853 Mar 16 18:52 data-376-Filter.db
> -rw-rw-r-- 1 cassandra cassandra     3920228 Mar 16 18:52 data-376-Index.db
> -rw-rw-r-- 1 cassandra cassandra           0 Mar 17 16:25 data-377-Compacted
> -rw-rw-r-- 1 cassandra cassandra 25563592504 Mar 17 15:27 data-377-Data.db
> -rw-rw-r-- 1 cassandra cassandra    54327685 Mar 17 15:27 data-377-Filter.db
> -rw-rw-r-- 1 cassandra cassandra  1871937928 Mar 17 15:27 data-377-Index.db
> -rw-rw-r-- 1 cassandra cassandra 25563592504 Mar 17 16:25 data-378-Data.db
> -rw-rw-r-- 1 cassandra cassandra    54326245 Mar 17 16:25 data-378-Filter.db
> -rw-rw-r-- 1 cassandra cassandra  1871937928 Mar 17 16:25 data-378-Index.db
>
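
P.S. To make the two cases from the reply at the top concrete, here is a rough sketch of what a compaction keeps in each case. This is only a toy model and not the actual Cassandra code or the patch: the Column class, its fields (deleted_at, expires_at) and the compact() function are names made up for illustration, and the only figure taken from the thread is the 10-day GCGraceSeconds default.

```python
from dataclasses import dataclass
from typing import Optional

# Default grace period mentioned in the thread: 10 days, in seconds.
GC_GRACE_SECONDS = 10 * 24 * 3600


@dataclass
class Column:
    # Hypothetical toy model of a column; not Cassandra's real class.
    name: str
    value: bytes
    deleted_at: Optional[int] = None  # set when the column was explicitly deleted
    expires_at: Optional[int] = None  # set when the column was written with a TTL


def compact(col: Column, now: int) -> Optional[Column]:
    """Return what a compaction running at time `now` keeps (None = purged)."""
    if col.deleted_at is not None:
        if now >= col.deleted_at + GC_GRACE_SECONDS:
            return None  # grace period over: the marker itself is dropped
        # Case 1: the value is dropped right away, only a small marker stays.
        return Column(col.name, b"", deleted_at=col.deleted_at)
    if col.expires_at is not None and now >= col.expires_at:
        if now >= col.expires_at + GC_GRACE_SECONDS:
            return None  # grace period over: the whole column is dropped
        # Case 2: expired, but the full value is still rewritten to disk.
        return col
    return col  # live column: kept unchanged


day = 24 * 3600
blob = b"x" * 1024
deleted = Column("a", blob, deleted_at=0)
expired = Column("b", blob, expires_at=0)

print(len(compact(deleted, day).value))  # 0: the blob is already gone
print(len(compact(expired, day).value))  # 1024: the blob is still there
print(compact(expired, 11 * day))        # None: purged after the grace period
```

Under this model, compacting expired data inside the grace period rewrites data files of about the same size, which matches the data-377/378/379 files in the listings above.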