On Mon, Jun 17, 2013 at 3:37 PM, Franc Carter <franc.car...@sirca.org.au> wrote:

> On Mon, Jun 17, 2013 at 3:28 PM, Wei Zhu <wz1...@yahoo.com> wrote:
>
>> default value of 5MB is way too small in practice. Too many files in one
>> directory is not a good thing. It's not clear what should be a good number.
>> I have heard people are using 50MB, 75MB, even 100MB. Do your own test to
>> find a "right" number.
>>
>
> Interesting - 50MB is the low end of what people are using - 5MB is a lot
> lower. I'll try a 50MB set
>

Oops, forgot to ask - is there a way to get Cassandra to rebuild the
sstables as bigger once I have updated the column family definition?

thanks

>
> cheers
>
>> -Wei
>>
>> ------------------------------
>> *From: *"Franc Carter" <franc.car...@sirca.org.au>
>> *To: *user@cassandra.apache.org
>> *Sent: *Sunday, June 16, 2013 10:15:22 PM
>> *Subject: *Re: Large number of files for Leveled Compaction
>>
>> On Mon, Jun 17, 2013 at 2:59 PM, Manoj Mainali <mainalima...@gmail.com> wrote:
>>
>>> Not in the case of LeveledCompaction. Only SizeTieredCompaction merges
>>> smaller sstables into large ones. With the LeveledCompaction, the sstables
>>> are always of fixed size but they are grouped into different levels.
>>>
>>> You can refer to this page
>>> http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra on
>>> details of how LeveledCompaction works.
>>>
>>
>> Yes, but it seems I've misinterpreted that page ;-(
>>
>> I took this paragraph
>>
>>> In figure 3, new sstables are added to the first level, L0, and
>>> immediately compacted with the sstables in L1 (blue). When L1 fills up,
>>> extra sstables are promoted to L2 (violet). Subsequent sstables generated
>>> in L1 will be compacted with the sstables in L2 with which they overlap. As
>>> more data is added, leveled compaction results in a situation like the one
>>> shown in figure 4.
>>
>> to mean that once a level fills up it gets compacted into a higher level
>>
>> cheers
>>
>>> Cheers
>>> Manoj
>>>
>>> On Mon, Jun 17, 2013 at 1:54 PM, Franc Carter <franc.car...@sirca.org.au> wrote:
>>>
>>>> On Mon, Jun 17, 2013 at 2:47 PM, Manoj Mainali <mainalima...@gmail.com> wrote:
>>>>
>>>>> With LeveledCompaction, each sstable size is fixed and is defined by
>>>>> sstable_size_in_mb in the compaction configuration of CF definition and
>>>>> default value is 5MB. In your case, you may not have defined your own
>>>>> value, that is why each of your sstables is 5MB. And if your dataset is
>>>>> huge, you will see a lot of sstable counts.
>>>>>
>>>>
>>>> Ok, seems like I do have (at least) an incomplete understanding. I
>>>> realise that the minimum size is 5MB, but I thought compaction would merge
>>>> these into a smaller number of larger sstables?
>>>>
>>>> thanks
>>>>
>>>>> Cheers
>>>>>
>>>>> Manoj
>>>>>
>>>>> On Fri, Jun 7, 2013 at 1:44 PM, Franc Carter <franc.car...@sirca.org.au> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We are trialling Cassandra-1.2(.4) with Leveled compaction as it
>>>>>> looks like it may be a win for us.
>>>>>>
>>>>>> The first step of testing was to push a fairly large slab of data
>>>>>> into the Column Family - we did this much faster (> x100) than we would
>>>>>> in a production environment. This has left the Column Family with about
>>>>>> 140,000 files in the Column Family directory which seems way too high. On
>>>>>> two of the nodes the CompactionStats show 2 outstanding tasks and on a
>>>>>> third node there are over 13,000 outstanding tasks. However from looking
>>>>>> at the log activity it looks like compaction has finished on all nodes.
>>>>>>
>>>>>> Is this number of files expected/normal?
>>>>>>
>>>>>> cheers
>>>>>>
>>>>>> --
>>>>>>
>>>>>> *Franc Carter* | Systems architect | Sirca Ltd
>>>>>> <marc.zianideferra...@sirca.org.au>
>>>>>>
>>>>>> franc.car...@sirca.org.au | www.sirca.org.au
>>>>>>
>>>>>> Tel: +61 2 8355 2514
>>>>>>
>>>>>> Level 4, 55 Harrington St, The Rocks NSW 2000
>>>>>>
>>>>>> PO Box H58, Australia Square, Sydney NSW 1215

--

*Franc Carter* | Systems architect | Sirca Ltd
<marc.zianideferra...@sirca.org.au>

franc.car...@sirca.org.au | www.sirca.org.au

Tel: +61 2 8355 2514

Level 4, 55 Harrington St, The Rocks NSW 2000

PO Box H58, Australia Square, Sydney NSW 1215
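
For anyone following the thread, here is a rough sketch of why a small sstable_size_in_mb gives so many sstables under LeveledCompaction. It is only an illustration under stated assumptions, not a description of Cassandra internals: the function names, the 100,000 MB data volume and the files_per_sstable parameter are all hypothetical, and the 10x growth per level is taken from the DataStax blog post linked in the thread.

```python
import math


def estimate_sstables(data_mb, sstable_size_mb=5, files_per_sstable=1):
    """Rough sstable and file counts for a data volume (hypothetical helper).

    files_per_sstable is an assumed knob for the number of on-disk files
    written per sstable; set it to whatever you observe on your nodes.
    """
    sstables = math.ceil(data_mb / sstable_size_mb)
    return sstables, sstables * files_per_sstable


def fill_levels(sstable_count, fanout=10):
    """Spread fixed-size sstables over L1, L2, ... (assumed model).

    Assumes L1 holds `fanout` sstables and each further level holds
    `fanout` times more than the one before it.
    """
    levels = []
    capacity = fanout
    remaining = sstable_count
    while remaining > 0:
        used = min(remaining, capacity)
        levels.append(used)
        remaining -= used
        capacity *= fanout
    return levels


if __name__ == "__main__":
    # Hypothetical 100,000 MB of data at the 5MB default versus 50MB.
    for size_mb in (5, 50):
        count, _ = estimate_sstables(100_000, sstable_size_mb=size_mb)
        print(size_mb, "MB sstables:", count, "levels:", fill_levels(count))
```

Under these assumptions, going from 5MB to 50MB cuts the sstable count by a factor of ten for the same data, which matches the advice earlier in the thread to test a larger value.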