Hi,
I overlooked that discussion. Indeed, right now that column family has
a very heavy write load and only very minimal reads (but there are
reads). I ran compactionstats several times over the last days, and
sometimes a small bunch of sstables gets compacted, about 10-15. After
that there are still more than 3300 sstables smaller than 100kB on the
node I posted the numbers from. Another node has far more sstables.
I took Tyler's suggestion and issued
ALTER TABLE <tablename> WITH compaction = {'class':
'SizeTieredCompactionStrategy', 'min_threshold': '4', 'max_threshold': '32',
'cold_reads_to_omit': 0.0};
to the cluster right away and will report what happens after that -
the weekend is coming, so there is plenty of time.
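For anyone wanting to double-check that the change took effect, the
schema can be inspected again from cqlsh (keyspace and table names are
placeholders):

DESCRIBE TABLE <keyspace>.<tablename>;

The compaction map in the output should now include
'cold_reads_to_omit': '0.0'.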
A short update while writing this - I took a look at the compaction
stats, and now there are pending tasks on every node. So it seems to
have at least some effect.
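If anyone wants to watch the backlog drain the same way, a trivial
loop over the nodes does it (the hostnames below are placeholders):

for h in node1 node2 node3; do
    echo "== $h =="
    ssh "$h" nodetool compactionstats
done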
Maybe someone will find this while googling - I got some background
information about that parameter from here:
http://www.datastax.com/dev/blog/optimizations-around-cold-sstables
Thanks so far, and I wish you all a nice weekend,
Roland
On 16.01.2015 at 15:13, Eric Stevens wrote:
There's another thread going on right now on this list about
compactions not happening when they seemingly should. Tyler Hobbs
postulates a bug and a workaround for it, so maybe try that out, and
if it fixes anything for you, certainly let him know. The bug Tyler
postulates is triggered when you have a write-heavy, zero-read
workload, and if you're testing data loading, maybe you're triggering
that.
Also, it's probably a long shot, but make sure that your SSTable
counts haven't gone down since you last looked. If you're load testing
your cluster and throwing big bursts of writes at it, you can
temporarily fall behind on compaction (accumulating a large number of
sstables), and if you stop writes, you can catch back up again as a
result - maybe you looked at table counts during loading, and at
compactionstats during a quiet time.
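One way to rule that out is to sample the count repeatedly rather
than once - something like this (keyspace and table names are
placeholders), left running through both load and quiet periods:

while true; do
    date
    nodetool cfstats my_keyspace.my_table | grep 'SSTable count'
    sleep 300
done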
On Fri, Jan 16, 2015 at 12:37 AM, Jan Kesten <j.kes...@enercast.de> wrote:
Hi Eric and all,
I almost expected this kind of answer. I already did a nodetool
compactionstats to see if those sstables are being compacted, but on
all nodes there are 0 outstanding compactions (right now in the
morning, not running any tests on this cluster).
The reported read latency is about 1-3ms, also on nodes which have
many sstables (the new high score is ~18k sstables). The 99th
percentile is about 30-40 micros, with a cell count of about 80-90
(if I got the docs right, these are the number of sstables accessed;
I think that changed from 2.0 to 2.1, as I see this only on the
testing cluster).
It looks to me like compactions were not triggered. I tried a
nodetool compact on one node overnight - but that crashed the entire
node.
Roland
On 15.01.2015 at 19:14, Eric Stevens wrote:
Yes, many sstables can have a huge negative impact on read
performance, and will also create memory pressure on that node.
There are a lot of things which can produce this effect, and it
also strongly suggests you're falling behind on compaction in
general (check nodetool compactionstats; you should have <5
outstanding/pending, preferably 0-1). To see whether and how
much it is impacting your read performance, check nodetool
cfstats <keyspace.table> and nodetool cfhistograms <keyspace>
<table>.
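Concretely, the invocations look like this (names are placeholders);
the interesting bits should be the "SSTable count" line in cfstats
and the SSTables column in cfhistograms (sstables touched per read):

nodetool cfstats my_keyspace.my_table
nodetool cfhistograms my_keyspace my_table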
On Thu, Jan 15, 2015 at 2:11 AM, Roland Etzenhammer <r.etzenham...@t-online.de> wrote:
Hi,
I'm testing around with Cassandra a fair bit, using 2.1.2, which I
know has some major issues, but it is a test environment. After some
bulk loading, testing with incremental repairs, and running out of
heap once, I found that I now have a quite large number of sstables
which are really small:
size        count   cumulative
<1k             0     0.0%
<10k         2780    76.8%
<100k        3392    93.7%
<1000k       3461    95.6%
<10000k      3471    95.9%
<100000k     3517    97.1%
<1000000k    3596    99.3%
all          3621   100.0%
76.8% of all sstables in this particular column family are smaller
than 10kB, and 93.7% are smaller than 100kB.
Just for my understanding - does that impact performance? And is
there any way to reduce the number of sstables? A full run of
nodetool compact runs for a really long time (more than 1 day).
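In case anyone wants to reproduce such a breakdown, the buckets can
be tallied straight from the data directory, one find per size limit
(the path below is the default location; adjust as needed):

find /var/lib/cassandra/data/<keyspace>/<table> -name '*-Data.db' -size -10k | wc -l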
Thanks for any input,
Roland