Thank you very much for the swift answer. I have one more question about the second part. Can the method count non-overlapping keys as overlapping? As far as I understand, it uses the min and max tokens and the column count, and the min and max tokens of different sstables can be very close to each other if random keys are used.
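To make the question concrete, here is a minimal sketch of what I mean. It is not the Cassandra code: the class and method names are hypothetical, tokens are modelled as plain longs, and it counts tokens exactly instead of using the sstable statistics. It only shows that two sstables with unrelated random keys can still cover almost the same [min, max] token range, so a range-based estimate would treat nearly every key as overlapping.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration only; not Cassandra's implementation.
final class TokenRangeOverlapSketch {

    // Fraction of tokens in `own` that fall outside the [min, max] token
    // range spanned by `other`. Both arrays must be non-empty.
    static double remainingKeysRatio(long[] own, long[] other) {
        long otherMin = Arrays.stream(other).min().getAsLong();
        long otherMax = Arrays.stream(other).max().getAsLong();
        long outside = Arrays.stream(own)
                .filter(t -> t < otherMin || t > otherMax)
                .count();
        return (double) outside / own.length;
    }

    // Assumption: random keys hash to tokens spread evenly over the long range.
    static long[] randomTokens(Random rng, int n) {
        return rng.longs(n).toArray();
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        long[] sstableA = randomTokens(rng, 10_000);
        long[] sstableB = randomTokens(rng, 10_000);
        // The two arrays are drawn independently, yet B's range covers
        // nearly all of A's tokens, so the printed ratio is close to 0.
        System.out.println(remainingKeysRatio(sstableA, sstableB));
    }
}
```

With only a handful of tokens the effect is the same: tokens {10, 20, 30} checked against a range of [0, 100] give a remaining ratio of 0, even if the other sstable holds just the two keys at 0 and 100.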
In my use case I generate a GUID for each key and send a single write request.

Cem

On Tue, May 21, 2013 at 11:13 PM, Yuki Morishita <mor.y...@gmail.com> wrote:

> > Why does Cassandra single table compaction skips the keys that are in
> > the other sstables?
>
> because we don't want to resurrect deleted columns. Say, sstable A has
> the column with timestamp 1, and sstable B has the same column which
> deleted at timestamp 2. Then if we purge that column only from sstable
> B, we would see the column with timestamp 1 again.
>
> > I also dont understand why we have this line in worthDroppingTombstones
> > method
>
> What the method is trying to do is to "guess" how many columns that
> are not in the rows that don't overlap, without actually going through
> every rows in the sstable. We have statistics like column count
> histogram, min and max row token for every sstables, we use those in
> the method to estimate how many columns the two sstables overlap.
> You may have remainingColumnsRatio of 0 when the two sstables overlap
> almost entirely.
>
>
> On Tue, May 21, 2013 at 3:43 PM, cem <cayiro...@gmail.com> wrote:
> > Hi all,
> >
> > I have a question about ticket
> > https://issues.apache.org/jira/browse/CASSANDRA-3442
> >
> > Why does Cassandra single table compaction skips the keys that are in the
> > other sstables? Please correct if I am wrong.
> >
> > I also dont understand why we have this line in worthDroppingTombstones
> > method:
> >
> > double remainingColumnsRatio = ((double) columns) /
> > (sstable.getEstimatedColumnCount().count() *
> > sstable.getEstimatedColumnCount().mean());
> >
> > remainingColumnsRatio is always 0 in my case and the droppableRatio is
> > 0.9. Cassandra skips all sstables which are already expired.
> >
> > This line was introduced by
> > https://issues.apache.org/jira/browse/CASSANDRA-4022.
> >
> > Best Regards,
> > Cem
> >
>
>
> --
> Yuki Morishita
> t:yukim (http://twitter.com/yukim)