Nikolai,

Are you sure about 1.26Gb? That doesn't look right - 5195 sstables with a 256Mb sstable size...
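A quick back-of-the-envelope check against the cfstats below (plain shell
arithmetic, nothing Cassandra-specific):

    # 5195 sstables at the fixed 256Mb sstable size:
    echo '5195 * 256 / 1024 / 1024' | bc -l               # ~1.27 (TiB)
    # "Space used (live), bytes" from the cfstats:
    echo '1266060391852 / 1024 / 1024 / 1024' | bc -l     # ~1179 GiB, i.e. ~1.15 TiB

Both point at roughly 1.2-1.3 terabytes per node, so I suspect "1.26Gb" was
meant to be 1.26Tb.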
Andrei

On Mon, Nov 24, 2014 at 5:09 PM, Nikolai Grigoriev <ngrigor...@gmail.com> wrote:
> Jean-Armel,
>
> I have only two large tables; the rest are super-small. In the test cluster
> of 15 nodes the largest table has about 110M rows. Its total size is about
> 1.26Gb per node (total disk space used per node for that CF). It has about
> 5K sstables per node - the sstable size is 256Mb. cfstats on a "healthy"
> node looks like this:
>
> Read Count: 8973748
> Read Latency: 16.130059053251774 ms.
> Write Count: 32099455
> Write Latency: 1.6124713938912671 ms.
> Pending Tasks: 0
> Table: wm_contacts
> SSTable count: 5195
> SSTables in each level: [27/4, 11/10, 104/100, 1053/1000, 4000, 0, 0, 0, 0]
> Space used (live), bytes: 1266060391852
> Space used (total), bytes: 1266144170869
> SSTable Compression Ratio: 0.32604853410787327
> Number of keys (estimate): 25696000
> Memtable cell count: 71402
> Memtable data size, bytes: 26938402
> Memtable switch count: 9489
> Local read count: 8973748
> Local read latency: 17.696 ms
> Local write count: 32099471
> Local write latency: 1.732 ms
> Pending tasks: 0
> Bloom filter false positives: 32248
> Bloom filter false ratio: 0.50685
> Bloom filter space used, bytes: 20744432
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 3379391
> Compacted partition mean bytes: 172660
> Average live cells per slice (last five minutes): 495.0
> Average tombstones per slice (last five minutes): 0.0
>
> Another table of similar structure (same number of rows) is about 4x
> smaller. That table does not suffer from these issues - it compacts well
> and efficiently.
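For reference, the bracketed "SSTables in each level" line reads as
count/limit per level when a level is over its target: [27/4, 11/10,
104/100, 1053/1000, 4000, ...] means L0 holds 27 sstables against a healthy
handful, L1 11 against 10, up to L3 at 1053 against 1000; L4's 4000 is
below its limit, so no slash is shown. If your nodetool accepts a
keyspace.table filter, something like this keeps an eye on it while
compactions churn (keyspace name is a placeholder):

    # Watch the per-level sstable counts every minute:
    watch -n 60 "nodetool cfstats mykeyspace.wm_contacts | \
        grep -E 'SSTable count|SSTables in each level'"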
>
> On Mon, Nov 24, 2014 at 2:30 AM, Jean-Armel Luce <jaluc...@gmail.com> wrote:
>>
>> Hi Nikolai,
>>
>> Could you please clarify a little what you call "a large amount of data"?
>>
>> How many tables?
>> How many rows in your largest table?
>> How many GB in your largest table?
>> How many GB per node?
>>
>> Thanks.
>>
>> 2014-11-24 8:27 GMT+01:00 Jean-Armel Luce <jaluc...@gmail.com>:
>>>
>>> Hi Nikolai,
>>>
>>> Thanks for that information.
>>>
>>> Could you please clarify a little what you call "
>>>
>>> 2014-11-24 4:37 GMT+01:00 Nikolai Grigoriev <ngrigor...@gmail.com>:
>>>>
>>>> Just to clarify - when I was talking about a large amount of data I
>>>> really meant a large amount of data per node in a single CF (table).
>>>> LCS does not seem to like it when it gets thousands of sstables
>>>> (makes 4-5 levels).
>>>>
>>>> When bootstrapping a new node you'd better enable the option from
>>>> CASSANDRA-6621 (the one that disables STCS in L0). But it will still
>>>> be a mess - I have a node that I bootstrapped ~2 weeks ago. Initially
>>>> it had 7.5K pending compactions; now it has almost stabilized at 4.6K
>>>> and does not go down. The number of sstables in L0 is over 11K and it
>>>> is only slowly building the upper levels. The total number of sstables
>>>> is 4x the normal amount. I am not entirely sure this node will ever
>>>> get back to normal life. And believe me - this is not because of I/O:
>>>> I have SSDs everywhere and 16 physical cores, and the machine is
>>>> barely using 1-3 cores most of the time. The problem is that allowing
>>>> the STCS fallback is not a good option either - it will quickly result
>>>> in a few 200Gb+ sstables in my configuration, and those sstables will
>>>> never be compacted. Plus, it will require close to 2x disk space on
>>>> EVERY disk in my JBOD configuration... this will kill the node sooner
>>>> or later. This all happens because after bootstrap all sstables end up
>>>> in L0, and the process then moves them, slowly, to the other levels.
>>>> If you have write traffic to that CF, the number of sstables in L0
>>>> will grow quickly - as is happening in my case now.
>>>>
>>>> Once something like https://issues.apache.org/jira/browse/CASSANDRA-8301
>>>> is implemented, things may get better.
>>>>
>>>> On Sun, Nov 23, 2014 at 4:53 AM, Andrei Ivanov <aiva...@iponweb.net>
>>>> wrote:
>>>>>
>>>>> Stephane,
>>>>>
>>>>> We have a somewhat similar C* load profile, hence some comments in
>>>>> addition to Nikolai's answer.
>>>>> 1. Fallback to STCS - you can actually disable it.
>>>>> 2. Based on our experience, if you have a lot of data per node, LCS
>>>>> may work just fine. That is, until the moment you decide to join
>>>>> another node - chances are that the newly added node will not be able
>>>>> to compact what it gets from the old nodes. In your case, if you
>>>>> switch strategy, the same thing may happen. This is all due to the
>>>>> limitations mentioned by Nikolai.
>>>>>
>>>>> Andrei
>>>>>
>>>>> On Sun, Nov 23, 2014 at 8:51 AM, Servando Muñoz G. <smg...@gmail.com>
>>>>> wrote:
>>>>> > ABUSE
>>>>> >
>>>>> > I DO NOT WANT ANY MORE MAILS, I AM FROM MEXICO
>>>>> >
>>>>> > From: Nikolai Grigoriev [mailto:ngrigor...@gmail.com]
>>>>> > Sent: Saturday, November 22, 2014 07:13 PM
>>>>> > To: user@cassandra.apache.org
>>>>> > Subject: Re: Compaction Strategy guidance
>>>>> > Importance: High
>>>>> >
>>>>> > Stephane,
>>>>> >
>>>>> > Like everything good, LCS comes at a certain price.
>>>>> >
>>>>> > LCS will put the most load on your I/O system (if you use spindles
>>>>> > you may need to be careful about that) and on CPU. Also, LCS (by
>>>>> > default) may fall back to STCS if it is falling behind (which is
>>>>> > very possible with heavy write activity), and this will result in
>>>>> > higher disk space usage. LCS also has a certain limitation I
>>>>> > discovered lately: sometimes it may not be able to use all of your
>>>>> > node's resources (an algorithmic limitation), which reduces the
>>>>> > overall compaction throughput. This may happen if you have a large
>>>>> > column family with lots of data per node. STCS does not have this
>>>>> > limitation.
>>>>> >
>>>>> > By the way, the primary goal of LCS is to reduce the number of
>>>>> > sstables C* has to look at to find your data. With LCS functioning
>>>>> > properly, this number will most likely be between 1 and 3 for most
>>>>> > reads. But if you do few reads and are not concerned about latency
>>>>> > today, LCS will most likely only save you some disk space.
>>>>> >
>>>>> > On Sat, Nov 22, 2014 at 6:25 PM, Stephane Legay
>>>>> > <sle...@looplogic.com> wrote:
>>>>> >
>>>>> > Hi there,
>>>>> >
>>>>> > Use case:
>>>>> >
>>>>> > - Heavy write app, few reads.
>>>>> > - Lots of updates of rows / columns.
>>>>> > - Current performance is fine, for both writes and reads.
>>>>> > - Currently using SizeTieredCompactionStrategy.
>>>>> >
>>>>> > We're trying to limit the amount of storage used during compaction.
>>>>> > Should we switch to LeveledCompactionStrategy?
>>>>> >
>>>>> > Thanks
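To make the suggestions above concrete, this is roughly what the switch and
the CASSANDRA-6621 option look like (keyspace/table names are placeholders,
and the JVM flag is quoted from memory - verify it against your version):

    # Switch one table to LCS; 'sstable_size_in_mb' shown at the 256Mb
    # value used in this thread (the default is 160Mb):
    cqlsh -e "ALTER TABLE myks.mytable WITH compaction = {
        'class': 'LeveledCompactionStrategy',
        'sstable_size_in_mb': 256 };"

    # The CASSANDRA-6621 option that disables the STCS fallback in L0 is
    # a JVM system property, e.g. in cassandra-env.sh:
    JVM_OPTS="$JVM_OPTS -Dcassandra.disable_stcs_in_l0=true"

Note that after the ALTER every existing sstable has to be re-leveled
through L0, so expect a pending-compaction pile-up similar to the
post-bootstrap situation Nikolai describes above.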
>>>>> >
>>>>> > --
>>>>> > Nikolai Grigoriev
>>>>
>>>> --
>>>> Nikolai Grigoriev
>
> --
> Nikolai Grigoriev