You will only have tombstones in your data if you issue deletes.

What you are seeing is an artifact of the fundamental way Cassandra stores 
data. Once data is written to disk it is never modified. If you overwrite a 
column value that has already been committed to disk, the old value is not 
changed. Instead the new value is held in memory and some time later it is 
written to a new file (more info here: 
http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/).
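If you want to watch that happening, here is a rough sketch (I'm assuming 
your CF is called Test in a keyspace called Keyspace1 and a stock data 
directory; all the names are placeholders for your setup):

        # push the current memtable to disk as a brand new SSTable
        nodetool -h localhost flush Keyspace1 Test

        # SSTable count and Space used keep growing between compactions,
        # even though the logical data set does not
        nodetool -h localhost cfstats | grep -A 4 'Column Family: Test'
        ls /var/lib/cassandra/data/Keyspace1/ | grep Test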

Compaction not only kersplats data that has been deleted, it kapows data that 
has been overwritten. (See this link for a dramatic first-person re-creation 
of compaction removing an overwritten value: http://goo.gl/4TrB6)
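(To answer the GC question below: needing a GC is not a smoking gun for 
tombstones. In 0.8 the pre-compaction SSTable files are only deleted from 
disk once the JVM GC collects Cassandra's references to them, which is why 
the space on disk drops after you force a GC. The major compaction itself is 
just the command you already ran; Keyspace1 is a placeholder:)

        # merge all SSTables for the CF into one, dropping overwritten values
        nodetool -h localhost compact Keyspace1 Test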
 
By overwriting all the data so often you are somewhat fighting against the 
server, but there are some things you can try (I am assuming 0.8.6; some 
general background: http://www.datastax.com/docs/0.8/operations/tuning):

* reduce the min_compaction_threshold on the CF so that data on disk gets 
compacted more frequently (see the cassandra-cli sketch after this list).
* look at the logs to see why / when memtables are being flushed (a grep 
sketch follows the list); look for lines like
 
        INFO [ScheduledTasks:1] 2011-10-02 22:32:20,092 ColumnFamilyStore.java (line 1128) Enqueuing flush of Memtable-NoCache_Ascending@921142878(2175000/13267958 serialized/live bytes, 43500 ops)

        or

        WARN [ScheduledTasks:1] 2011-10-02 22:32:20,084 GCInspector.java (line 143) Heap is 0.778906484049155 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically

* The memtable will be flushed to disk for one of three reasons:
        * The heap is too full and Cassandra wants to free memory
        * It has passed the memtable_operations CF threshold for changes; 
increase this value to flush less often
        * It has passed the memtable_throughput CF threshold for throughput; 
increase this value to flush less often
        (both thresholds can be changed per CF, see the sketch after this 
list; background: 
http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/)

* if possible, reduce the amount of overwrites.
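For the threshold changes above, a minimal cassandra-cli sketch (connect 
with cassandra-cli -h localhost; the values are only examples to tune 
against your workload, and Keyspace1 is again a placeholder):

        use Keyspace1;
        -- compact on-disk data more often (the default is 4)
        update column family Test with min_compaction_threshold = 2;
        -- flush less often; memtable_operations is in millions of ops
        -- and memtable_throughput is in MB (0.8 attribute names)
        update column family Test with memtable_operations = 0.5;
        update column family Test with memtable_throughput = 128;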
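And a quick way to pull the flush-related lines out of the log (assuming the 
stock log location; adjust the path for your install):

        # why/when memtables were flushed, plus any heap-pressure warnings
        grep -E 'Enqueuing flush|GCInspector' /var/log/cassandra/system.log | tail -20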

Hope that helps. 

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 6/10/2011, at 2:42 PM, Derek Andree wrote:

> We have a very hot CF which we use essentially as a durable memory cache for 
> our application.  It is about 70MBytes in size after being fully populated.  
> We completely overwrite this entire CF every few minutes (not delete).  Our 
> hope was that the CF would stay around 70MB in size, but it grows to multiple 
> Gigabytes in size rather quickly (less than an hour).  I've heard that doing 
> major compactions using nodetool is no longer recommended, but when we force 
> a compaction on this CF using nodetool compact, then perform GC, size on disk 
> shrinks to the expected 70MB.
> 
> I'm wondering if we are doing something wrong here, we thought we were 
> avoiding tombstones since we are just overwriting each column using the same 
> keys.  Is the fact that we have to do a GC to get the size on disk to shrink 
> significantly a smoking gun that we have a bunch of tombstones?
> 
> We've row cached the entire CF to make reads really fast, and writes are 
> definitely fast enough, it's this growing disk space that has us concerned.
> 
> Here's the output from nodetool cfstats for the CF in question (hrm, I just 
> noticed that we still have a key cache for this CF which is rather dumb):
> 
>               Column Family: Test
>               SSTable count: 4
>               Space used (live): 309767193
>               Space used (total): 926926841
>               Number of Keys (estimate): 275456
>               Memtable Columns Count: 37510
>               Memtable Data Size: 15020598
>               Memtable Switch Count: 22
>               Read Count: 4827496
>               Read Latency: 0.010 ms.
>               Write Count: 1615946
>               Write Latency: 0.095 ms.
>               Pending Tasks: 0
>               Key cache capacity: 150000
>               Key cache size: 55762
>               Key cache hit rate: 0.030557854052177317
>               Row cache capacity: 150000
>               Row cache size: 68752
>               Row cache hit rate: 1.0
>               Compacted row minimum size: 925
>               Compacted row maximum size: 1109
>               Compacted row mean size: 1109
> 
> 
> Any insight appreciated.
> 
> Thanks,
> -Derek
> 
