At the DataStax Cassandra Summit 2013 last week, Al Tobey from Ooyala recommended setting sstable_size_in_mb to 256MB unless you have a fairly small data set. The talk was "Extreme Cassandra Optimization," and it was superbly informative; I highly recommend it once DataStax gets the videos online.
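
For a rough sense of why the setting matters for file counts, here is a back-of-the-envelope sketch in Python. It assumes every sstable is filled to exactly sstable_size_in_mb, which real tables only approximate. The 100GB data size and the files_per_sstable value of 6 are placeholders I made up for illustration, not measurements; the number of on-disk component files per sstable depends on your Cassandra version and settings, so check your own data directory.

```python
import math


def estimate_sstable_count(data_size_mb: float, sstable_size_mb: float) -> int:
    """Approximate sstable count if every sstable is filled to the target size."""
    if sstable_size_mb <= 0:
        raise ValueError("sstable_size_mb must be positive")
    return math.ceil(data_size_mb / sstable_size_mb)


def estimate_file_count(data_size_mb: float, sstable_size_mb: float,
                        files_per_sstable: int) -> int:
    """Approximate files in the column family directory.

    files_per_sstable is an assumption supplied by the caller; it is not
    a value taken from Cassandra itself.
    """
    return estimate_sstable_count(data_size_mb, sstable_size_mb) * files_per_sstable


if __name__ == "__main__":
    data_size_mb = 100 * 1024   # hypothetical: 100GB of data on one node
    files_per_sstable = 6       # hypothetical placeholder, see note above
    for size_mb in (5, 50, 100, 256):
        sstables = estimate_sstable_count(data_size_mb, size_mb)
        files = estimate_file_count(data_size_mb, size_mb, files_per_sstable)
        print(f"sstable_size_in_mb={size_mb}: ~{sstables} sstables, ~{files} files")
```

The file count scales inversely with the sstable size, so going from 5MB to 256MB cuts it by roughly a factor of 50 under these assumptions.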

On Mon, Jun 17, 2013 at 1:35 AM, Wei Zhu <wz1...@yahoo.com> wrote:

> Correction, the largest I heard is 256MB SSTable size.
>
> ------------------------------
> *From: *"Wei Zhu" <wz1...@yahoo.com>
> *To: *user@cassandra.apache.org
> *Sent: *Sunday, June 16, 2013 10:28:25 PM
> *Subject: *Re: Large number of files for Leveled Compaction
>
> default value of 5MB is way too small in practice. Too many files in one
> directory is not a good thing. It's not clear what should be a good number.
> I have heard people are using 50MB, 75MB, even 100MB. Do your own test to
> find a "right" number.
>
> -Wei
>
> ------------------------------
> *From: *"Franc Carter" <franc.car...@sirca.org.au>
> *To: *user@cassandra.apache.org
> *Sent: *Sunday, June 16, 2013 10:15:22 PM
> *Subject: *Re: Large number of files for Leveled Compaction
>
> On Mon, Jun 17, 2013 at 2:59 PM, Manoj Mainali <mainalima...@gmail.com> wrote:
>
>> Not in the case of LeveledCompaction. Only SizeTieredCompaction merges
>> smaller sstables into large ones. With the LeveledCompaction, the sstables
>> are always of fixed size but they are grouped into different levels.
>>
>> You can refer to this page
>> http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra on
>> details of how LeveledCompaction works.
>>
>
> Yes, but it seems I've misinterpreted that page ;-(
>
> I took this paragraph
>
>> In figure 3, new sstables are added to the first level, L0, and
>> immediately compacted with the sstables in L1 (blue). When L1 fills up,
>> extra sstables are promoted to L2 (violet). Subsequent sstables generated
>> in L1 will be compacted with the sstables in L2 with which they overlap. As
>> more data is added, leveled compaction results in a situation like the one
>> shown in figure 4.
>
> to mean that once a level fills up it gets compacted into a higher level
>
> cheers
>
>> Cheers
>> Manoj
>>
>> On Mon, Jun 17, 2013 at 1:54 PM, Franc Carter
>> <franc.car...@sirca.org.au> wrote:
>>
>>> On Mon, Jun 17, 2013 at 2:47 PM, Manoj Mainali
>>> <mainalima...@gmail.com> wrote:
>>>
>>>> With LeveledCompaction, each sstable size is fixed and is defined by
>>>> sstable_size_in_mb in the compaction configuration of CF definition and
>>>> default value is 5MB. In your case, you may not have defined your own value,
>>>> that is why each of your sstables is 5MB. And if your dataset is huge, you
>>>> will see a lot of sstables.
>>>>
>>>
>>> Ok, seems like I do have (at least) an incomplete understanding. I
>>> realise that the minimum size is 5MB, but I thought compaction would merge
>>> these into a smaller number of larger sstables ?
>>>
>>> thanks
>>>
>>>> Cheers
>>>>
>>>> Manoj
>>>>
>>>> On Fri, Jun 7, 2013 at 1:44 PM, Franc Carter
>>>> <franc.car...@sirca.org.au> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> We are trialling Cassandra-1.2(.4) with Leveled compaction as it looks
>>>>> like it may be a win for us.
>>>>>
>>>>> The first step of testing was to push a fairly large slab of data into
>>>>> the Column Family - we did this much faster (> x100) than we would in a
>>>>> production environment. This has left the Column Family with about 140,000
>>>>> files in the Column Family directory which seems way too high. On two of
>>>>> the nodes the CompactionStats show 2 outstanding tasks and on a third node
>>>>> there are over 13,000 outstanding tasks. However from looking at the log
>>>>> activity it looks like compaction has finished on all nodes.
>>>>>
>>>>> Is this number of files expected/normal ?
>>>>>
>>>>> cheers
>>>>>
>>>>> --
>>>>>
>>>>> *Franc Carter* | Systems architect | Sirca Ltd
>>>>> <marc.zianideferra...@sirca.org.au>
>>>>>
>>>>> franc.car...@sirca.org.au | www.sirca.org.au
>>>>>
>>>>> Tel: +61 2 8355 2514
>>>>>
>>>>> Level 4, 55 Harrington St, The Rocks NSW 2000
>>>>>
>>>>> PO Box H58, Australia Square, Sydney NSW 1215