He has around 10G of data, so it should not be bad. This problem arises only if you have a lot of data.
On Thu, Jul 11, 2013 at 2:10 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Thu, Jul 11, 2013 at 1:52 PM, sankalp kohli <kohlisank...@gmail.com> wrote:
>
>> Scrub will keep the file size the same. You need to move all sstables to
>> be L0. The way to do this is to remove the json file which has the level
>> information.
>
> This will work, but I believe it is subject to this?
>
> "./src/java/org/apache/cassandra/db/compaction/LeveledManifest.java", line
> 228 of 577:
>
> // LevelDB gives each level a score of how much data it contains
> vs its ideal amount, and
> // compacts the level with the highest score. But this falls apart
> spectacularly once you
> // get behind. Consider this set of levels:
> // L0: 988 [ideal: 4]
> // L1: 117 [ideal: 10]
> // L2: 12 [ideal: 100]
> //
> // The problem is that L0 has a much higher score (almost 250)
> than L1 (11), so what we'll
> // do is compact a batch of MAX_COMPACTING_L0 sstables with all
> 117 L1 sstables, and put the
> // result (say, 120 sstables) in L1. Then we'll compact the next
> batch of MAX_COMPACTING_L0,
> // and so forth. So we spend most of our i/o rewriting the L1
> data with each batch.
> //
> // If we could just do *all* L0 a single time with L1, that would
> be ideal. But we can't
> // -- see the javadoc for MAX_COMPACTING_L0.
> //
> // LevelDB's way around this is to simply block writes if L0
> compaction falls behind.
> // We don't have that luxury.
> //
> // So instead, we force compacting higher levels first. This may
> not minimize the number
> // of reads done as quickly in the short term, but it minimizes
> the i/o needed to compact
> // optimally, which gives us a long-term win.
>
> Ideal would be something like a major compaction for LCS which allows the
> end user to change the resulting SSTable sizes without forcing everything
> back to L0.
>
> =Rob
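For illustration, here is a minimal sketch (not Cassandra's actual code) of the LevelDB-style scoring the quoted comment describes: each level's score is its size divided by its ideal size, and a naive compactor picks the highest-scoring level. The sizes and ideals are the example numbers from the comment.

```python
# Each level maps to (sstable count, ideal count), taken from the
# example in the LeveledManifest.java comment above.
levels = {
    "L0": (988, 4),
    "L1": (117, 10),
    "L2": (12, 100),
}

def score(size, ideal):
    # LevelDB-style score: how far over its ideal size a level is.
    return size / ideal

scores = {name: score(size, ideal) for name, (size, ideal) in levels.items()}

# L0 scores 247 ("almost 250"), dwarfing L1's 11.7, so a pure
# "highest score first" policy keeps picking L0 batches and rewrites
# all of L1 with each one -- the pathology the comment describes.
worst = max(scores, key=scores.get)
print(worst, round(scores[worst]))  # prints "L0 247"
```

This is why Cassandra instead forces higher levels to compact first once it falls behind, rather than always chasing the top score.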