I'm still wondering how to choose the SSTable size under LCS. The default is 5MB, people used to configure it at 10MB, and now you configure it at 128MB. What are the benefits or drawbacks of a very small size (say 5MB) versus a big size (like 128MB)?
Alain

2013/3/8 Al Tobey <a...@ooyala.com>

> We saw exactly the same thing as Wei Zhu: more than 100k tables in a
> directory causing all kinds of issues. We're running 128MiB SSTables with
> LCS and have disabled compaction throttling. 128MiB was chosen to get file
> counts under control and reduce the number of files C* has to manage and
> search. I just looked, and a ~250GiB node is using about 10,000 files,
> which is quite manageable. This configuration is running smoothly in
> production under mixed read/write load.
>
> We're on RAID0 across 6 15k drives per machine. When we migrated data to
> this cluster we were pushing well over 26k inserts/s with CL_QUORUM. With
> compaction throttling enabled at any rate, it just couldn't keep up. With
> throttling off, it runs smoothly and does not appear to have an impact on
> our applications, so we always leave it off, even in EC2. An 8GiB heap is
> too small for this config on 1.1. YMMV.
>
> -Al Tobey
>
> On Thu, Feb 14, 2013 at 12:51 PM, Wei Zhu <wz1...@yahoo.com> wrote:
>
>> I haven't tried to switch compaction strategy. We started with LCS.
>>
>> For us, after massive data imports (5000 writes/second for 6 days), the
>> first repair is painful since there is quite some data inconsistency. For
>> 150G nodes, repair brought in about 30G and created thousands of pending
>> compactions. It took almost a day to clear those. Just be prepared: LCS is
>> really slow in 1.1.X. System performance degrades during that time since
>> reads can go to more SSTables; we see 20 SSTable lookups for one read.
>> (We tried everything we could and couldn't speed it up. I think it's
>> single threaded, and it's not recommended to turn on multithreaded
>> compaction. We even tried that, and it didn't help.) There is parallel LCS
>> in 1.2, which is supposed to alleviate the pain.
>> Haven't upgraded yet, hope it works :)
>>
>> http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2
>>
>> Since our cluster is not write intensive (only 100 writes/second), I don't
>> see any pending compactions during regular operation.
>>
>> One thing worth mentioning is the size of the SSTable. The default is 5M,
>> which is kind of small for a 200G (all in one CF) data set, and we are on
>> SSD. That comes to more than 150K files in one directory (200G/5M = 40K
>> SSTables, and each SSTable creates 4 files on disk). You might want to
>> watch that and decide the SSTable size.
>>
>> By the way, there is no concept of major compaction for LCS. Just for
>> fun, you can look at a file called $CFName.json in your data directory;
>> it tells you the SSTable distribution among the different levels.
>>
>> -Wei
>>
>> ------------------------------
>> *From:* Charles Brophy <cbro...@zulily.com>
>> *To:* user@cassandra.apache.org
>> *Sent:* Thursday, February 14, 2013 8:29 AM
>> *Subject:* Re: Size Tiered -> Leveled Compaction
>>
>> I second these questions: we've been looking into changing some of our
>> CFs to use leveled compaction as well. If anybody here has the wisdom to
>> answer them, it would be a wonderful help.
>>
>> Thanks
>> Charles
>>
>> On Wed, Feb 13, 2013 at 7:50 AM, Mike <mthero...@yahoo.com> wrote:
>>
>> Hello,
>>
>> I'm investigating the transition of some of our column families from Size
>> Tiered -> Leveled Compaction. I believe we have some high-read-load column
>> families that would benefit tremendously.
>>
>> I've stood up a test DB node to investigate the transition. I
>> successfully altered the column family, and I immediately noticed a large
>> number (1000+) of pending compaction tasks become available, but no
>> compactions get executed.
>>
>> I tried running "nodetool sstableupgrade" on the column family, and the
>> compaction tasks don't move.
>>
>> I also notice no changes to the size and distribution of the existing
>> SSTables.
>>
>> I then ran a major compaction on the column family. All pending
>> compaction tasks got run, and the SSTables have a distribution that I
>> would expect from LeveledCompaction (lots and lots of 10MB files).
>>
>> Couple of questions:
>>
>> 1) Is a major compaction required to transition from size-tiered to
>> leveled compaction?
>> 2) Are major compactions as much of a concern for LeveledCompaction as
>> they are for Size Tiered?
>>
>> All the documentation I found concerning transitioning from Size Tiered
>> to Leveled compaction discusses the alter table CQL command, but I haven't
>> found much on what else needs to be done after the schema change.
>>
>> I did these tests with Cassandra 1.1.9.
>>
>> Thanks,
>> -Mike
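
For anyone weighing SSTable sizes, Wei's back-of-envelope arithmetic above (200G / 5M = 40K SSTables, 4 files each) can be generalized with a rough sketch like the one below. The function name and the default of 4 files per SSTable are assumptions taken from Wei's message, not anything Cassandra exposes, and the real number of component files depends on version and settings, so treat the output as an order-of-magnitude estimate only.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def estimate_sstable_files(data_bytes, sstable_bytes, files_per_sstable=4):
    """Rough estimate of (sstable_count, file_count) for one column family.

    files_per_sstable=4 is an assumption borrowed from the thread; the
    actual number of on-disk components varies by Cassandra version.
    """
    # Integer ceiling division: a partially filled SSTable still counts.
    sstables = -(-data_bytes // sstable_bytes)
    return sstables, sstables * files_per_sstable


if __name__ == "__main__":
    # Compare the three sizes mentioned in the thread for a 200 GiB data set.
    for size_mib in (5, 10, 128):
        sstables, files = estimate_sstable_files(200 * GIB, size_mib * MIB)
        print(f"{size_mib:>3} MiB SSTables: ~{sstables} SSTables, ~{files} files")
```

Under the same assumption, a 250GiB node at 128MiB works out to roughly 8,000 files, which is in the same ballpark as the ~10,000 files Al reports.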