Hi,

I overlooked that discussion. Indeed, right now that column family gets a very heavy write load and only minimal reads (but there are reads). I ran compactionstats several times over the last days, and sometimes a small batch of sstables gets compacted, about 10-15. After that there are still more than 3300 sstables smaller than 100kB on the node I posted the numbers from. Another node has far more sstables.
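
For anyone who wants to reproduce the counting: a quick way to tally the small sstables on disk, assuming the default data directory layout (the path, keyspace and table names below are placeholders):

find /var/lib/cassandra/data/<keyspace>/<table>-*/ -name '*-Data.db' -size -100k | wc -l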

I took Tyler's suggestion and issued

ALTER TABLE <tablename> WITH compaction = {'class': 'SizeTieredCompactionStrategy',
    'min_threshold': '4', 'max_threshold': '32', 'cold_reads_to_omit': 0.0};

to the cluster right away and will report what happens after that - the weekend is coming, so there is plenty of time.
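
To double-check that the new options actually took effect, the table definition can be inspected from cqlsh (keyspace and table names are placeholders); the compaction options show up in the table's WITH clause:

DESCRIBE TABLE <keyspace>.<tablename>;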

Short update while writing this - I took a look at compactionstats and now there are pending tasks on every node. So it seems to have at least some effect.
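
To watch this across the whole cluster, a simple shell loop works (the hostnames below are placeholders):

for h in node1 node2 node3; do echo "== $h"; nodetool -h "$h" compactionstats; done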

Maybe someone will find this while googling; I got some background information about that parameter from here:
http://www.datastax.com/dev/blog/optimizations-around-cold-sstables

Thanks so far, and I wish you all a nice weekend,
Roland

On 16.01.2015 at 15:13, Eric Stevens wrote:
There's another thread going on right now in this list about compactions not happening when they seemingly should. Tyler Hobbs postulates a bug and a workaround for it, so maybe try that out, and if it fixes anything for you, certainly let him know. The bug Tyler postulates is triggered when you have a write-heavy, zero-read workload, and if you're testing data loading, maybe you're triggering that.

Also, it's probably a long shot, but make sure that your SSTable counts haven't gone down since you last looked. If you're load testing your cluster and throwing big bursts of writes at it, you can temporarily fall behind on compaction (accumulating a large number of sstables); if you stop writes, you can catch back up again. Maybe you looked at table counts during loading, and looked at compactionstats during a quiet time.
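
(For reference, one way to track the live sstable count over time, with placeholder keyspace and table names:)

watch -n 60 'nodetool cfstats <keyspace>.<table> | grep "SSTable count"'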

On Fri, Jan 16, 2015 at 12:37 AM, Jan Kesten <j.kes...@enercast.de> wrote:

    Hi Eric and all,

    I almost expected this kind of answer. I did a nodetool
    compactionstats already to see if those sstables are being
    compacted, but on all nodes there are 0 outstanding compactions
    (right now in the morning, not running any tests on this cluster).

    The reported read latency is about 1-3ms, even on nodes which
    have many sstables (the new high score is ~18k sstables). The
    99th percentile is about 30-40 micros, with an SSTable count of
    about 80-90 (if I got the docs right, that is the number of
    sstables accessed per read; I think this changed from 2.0 to 2.1,
    as I only see it on the testing cluster).

    It looks to me like compactions were not triggered. I tried a
    nodetool compact on one node overnight - but that crashed the
    entire node.

    Roland

    On 15.01.2015 at 19:14, Eric Stevens wrote:
    Yes, many sstables can have a huge negative impact on read
    performance, and will also create memory pressure on that node.

    There are a lot of things which can produce this effect, and it
    also strongly suggests you're falling behind on compaction in
    general (check nodetool compactionstats, you should have <5
    outstanding/pending, preferably 0-1).  To see whether and how
    much it is impacting your read performance, check nodetool
    cfstats <keyspace.table> and nodetool cfhistograms <keyspace>
    <table>.
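
    (Spelled out, with placeholder names - the "SSTables" column of
    cfhistograms shows how many sstables a read had to touch at each
    percentile:)

    nodetool compactionstats
    nodetool cfstats <keyspace>.<table>
    nodetool cfhistograms <keyspace> <table>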


    On Thu, Jan 15, 2015 at 2:11 AM, Roland Etzenhammer
    <r.etzenham...@t-online.de> wrote:

        Hi,

        I'm testing around with cassandra quite a bit, using 2.1.2
        which I know has some major issues, but it is a test
        environment. After some bulk loading, testing with
        incremental repairs and running out of heap once, I found
        that I now have quite a large number of sstables which are
        really small:

        size bucket    count    cumulative
        <1k                0         0.0%
        <10k            2780        76.8%
        <100k           3392        93.7%
        <1000k          3461        95.6%
        <10000k         3471        95.9%
        <100000k        3517        97.1%
        <1000000k       3596        99.3%
        all             3621       100.0%

        76.8% of all sstables in this particular column family are
        smaller than 10kB, and 93.7% are smaller than 100kB.
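
        (A histogram like the above can be produced with a small
        shell loop; the data path, keyspace and table names are
        placeholders and assume the default directory layout:)

        for s in 1k 10k 100k 1000k 10000k 100000k 1000000k; do
          n=$(find /var/lib/cassandra/data/<keyspace>/<table>-*/ -name '*-Data.db' -size -"$s" | wc -l)
          echo "<$s: $n"
        done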

        Just for my understanding - does that impact performance? And
        is there any way to reduce the number of sstables? A full run
        of nodetool compact takes a really long time (more than one
        day).
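
        (As an aside, a major compaction can be limited to a single
        table, which may shorten the run; names are placeholders:)

        nodetool compact <keyspace> <table>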

        Thanks for any input,
        Roland



    --
    i.A. Jan Kesten
    Systemadministration
    enercast GmbH
    http://www.enercast.de


