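You are right, JVM GC is for memory. Cassandra uses a small trick here called *PhantomReference*: a phantom reference to the in-memory sstable object is enqueued by the JVM once GC has collected that object, and the actual file deletion is done when that PhantomReference is processed.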
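To make the trick concrete, here is a minimal sketch of the pattern in plain Java. It is not Cassandra's actual code: `SSTableHandle`, `FileDeletingReference`, `markForDeletion`, `drainQueue` and `newTempSSTable` are hypothetical names chosen for illustration, and only `PhantomReference` and `ReferenceQueue` come from the JDK.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class PhantomDeletionSketch {
    /** Hypothetical stand-in for the in-memory object that represents an sstable. */
    static final class SSTableHandle {
        final File file;
        SSTableHandle(File file) { this.file = file; }
    }

    /** Remembers which file to delete; it must not keep a strong reference to the handle. */
    static final class FileDeletingReference extends PhantomReference<SSTableHandle> {
        private final File file;
        FileDeletingReference(SSTableHandle handle, ReferenceQueue<SSTableHandle> queue) {
            super(handle, queue);
            this.file = handle.file;
        }
        boolean cleanup() { return file.delete(); }
    }

    static final ReferenceQueue<SSTableHandle> QUEUE = new ReferenceQueue<>();
    // The references themselves must stay strongly reachable until they are processed.
    static final Set<FileDeletingReference> PENDING = ConcurrentHashMap.newKeySet();

    /** Called when compaction has made a file obsolete ("marked as deleted"). */
    static FileDeletingReference markForDeletion(SSTableHandle handle) {
        FileDeletingReference ref = new FileDeletingReference(handle, QUEUE);
        PENDING.add(ref);
        return ref;
    }

    /** Deletes the file behind every reference enqueued so far; returns how many were deleted. */
    static int drainQueue() {
        int deleted = 0;
        Reference<? extends SSTableHandle> r;
        while ((r = QUEUE.poll()) != null) {
            FileDeletingReference ref = (FileDeletingReference) r;
            PENDING.remove(ref);
            if (ref.cleanup()) deleted++;
        }
        return deleted;
    }

    /** Helper that creates an empty temporary file standing in for an old sstable. */
    static File newTempSSTable() {
        try {
            return File.createTempFile("old-sstable", ".db");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        File f = newTempSSTable();
        SSTableHandle handle = new SSTableHandle(f);
        markForDeletion(handle);
        handle = null;     // drop the last strong reference to the handle
        System.gc();       // only a hint; the JVM may or may not collect now
        Thread.sleep(200); // give the collector a moment to enqueue the reference
        System.out.println("deleted " + drainQueue() + " file(s), file still exists: " + f.exists());
    }
}
```

Because `System.gc()` is only a request, the file may or may not be gone when `main` finishes. That matches what was said earlier in the thread: the obsolete files disappear when a full GC happens or when the node restarts, not at the moment compaction finishes.
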
2010/12/2 Ying Tang <ivytang0...@gmail.com>

> @Chen Xinli
> "and mark old sstables as deleted which will be deleted while jvm gc."
> An SSTable is on the hard disk, so how could JVM GC delete it? JVM GC is
> in charge of the use of space in memory.
>
> @Nick
> The GC in Cassandra doesn't refer to JVM GC? Is this kind of GC
> Cassandra's own GC, intended to remove the unused files on the hard disk?
>
> On Wed, Dec 1, 2010 at 10:54 PM, Chen Xinli <chen.d...@gmail.com> wrote:
>
>> 2010/12/1 Ying Tang <ivytang0...@gmail.com>
>>
>>> I'm confused, please ignore the mail above.
>>> Here is my confusion:
>>> as of 0.6.6/0.7, minor compaction and major compaction can both
>>> clean out rows tagged with tombstones and generate a new sstable
>>> without tombstones.
>>
>> This is right.
>>
>>> And the tombstones remain in memory, waiting to be removed by JVM GC.
>>> Am I right?
>>
>> No! Compactions merge several old sstables into one and mark the old
>> sstables as deleted; those are then deleted during JVM GC.
>> SSTables are files on the hard disk, nothing to do with memory. You'd
>> better have a look at Google's Bigtable paper.
>>
>>> On Wed, Dec 1, 2010 at 9:10 PM, Ying Tang <ivytang0...@gmail.com> wrote:
>>>
>>>> 1. So as of 0.6.6/0.7, minor compaction and major compaction can both
>>>> clean out rows tagged with tombstones, but this kind of clean-out
>>>> doesn't mean removing them from the disk permanently.
>>>> The real removal is done by the JVM GC?
>>>> 2. The intent of compaction is to merge multiple sstables into one,
>>>> clean out the tombstones, and put the rows without tombstones into a
>>>> new ordered sstable?
>>>>
>>>> On Wed, Dec 1, 2010 at 7:30 PM, Sylvain Lebresne <sylv...@yakaz.com> wrote:
>>>>
>>>>> On Wed, Dec 1, 2010 at 12:11 PM, Ying Tang <ivytang0...@gmail.com> wrote:
>>>>> > And I have another question: what's the difference between minor
>>>>> > compaction and major compaction?
>>>>>
>>>>> A major compaction is a compaction that compacts *all* the SSTables of a
>>>>> given column family (compaction compacts one CF at a time).
>>>>>
>>>>> Before https://issues.apache.org/jira/browse/CASSANDRA-1074 (introduced in
>>>>> 0.6.6 and recent 0.7 betas/rcs), major compactions were the only ones that
>>>>> removed the tombstones (see
>>>>> http://wiki.apache.org/cassandra/DistributedDeletes), and this is the
>>>>> reason major compaction exists. Now, with #1074, minor compactions should
>>>>> remove most if not all tombstones, so major compactions are no longer
>>>>> useful, or much less so (it may depend on your workload, though, as minor
>>>>> compactions can't always delete the tombstones).
>>>>>
>>>>> --
>>>>> Sylvain
>>>>>
>>>>> > On 12/1/10, Chen Xinli <chen.d...@gmail.com> wrote:
>>>>> >> 2010/12/1 Ying Tang <ivytang0...@gmail.com>
>>>>> >>
>>>>> >>> Every time Cassandra creates a new sstable, it will call
>>>>> >>> CompactionManager.submitMinorIfNeeded? And if the number of memtables
>>>>> >>> is beyond MinimumCompactionThreshold, the minor compaction will be
>>>>> >>> called.
>>>>> >>> There is also a method named CompactionManager.submitMajor, and the
>>>>> >>> call relationship is:
>>>>> >>>
>>>>> >>> NodeCmd --> NodeProbe --> StorageService.forceTableCompaction -->
>>>>> >>> Table.forceCompaction --> CompactionManager.performMajor -->
>>>>> >>> CompactionManager.submitMajor
>>>>> >>>
>>>>> >>> ColumnFamilyStore.forceMajorCompaction -->
>>>>> >>> CompactionManager.performMajor --> CompactionManager.submitMajor
>>>>> >>>
>>>>> >>> HintedHandOffManager --> CompactionManager.submitMajor
>>>>> >>>
>>>>> >>> So I have 3 questions:
>>>>> >>> 1. Once a new sstable has been created,
>>>>> >>> CompactionManager.submitMinorIfNeeded will be called, and
>>>>> >>> minorCompaction may be called.
>>>>> >>> But when will majorCompaction be called? Only from NodeCmd?
>>>>> >>
>>>>> >> Yes, majorCompaction must be called manually from NodeCmd.
>>>>> >>
>>>>> >>> 2. What work do minorCompaction and majorCompaction do?
>>>>> >>> Will minorCompaction delete the data that has been marked as deleted?
>>>>> >>> And how about major compaction?
>>>>> >>
>>>>> >> Compaction only marks sstables as deleted. The deletion is done when
>>>>> >> there is a full GC, or when the node is restarted.
>>>>> >>
>>>>> >>> 3. When is GC called? Every time compaction is called?
>>>>> >>
>>>>> >> GC has nothing to do with compaction; you may be confusing the two
>>>>> >> concepts.
>>>>> >>
>>>>> >>> --
>>>>> >>> Best regards,
>>>>> >>>
>>>>> >>> Ivy Tang
>>>>> >>
>>>>> >> --
>>>>> >> Best Regards,
>>>>> >> Chen Xinli

--
Best Regards,
Chen Xinli