Hi,

Your SSTables are probably falling out of page cache on the smaller nodes, and your slow disks are killing your latencies.

Check to see if this is the case with pcstat: https://github.com/tobert/pcstat
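For example, something along these lines (the path is only a guess based on the default data directory; point it at the views table's data files on one of the 32G nodes and on one of the 64G nodes and compare the cached percentages):

    # run as a user that can read the Cassandra data files
    pcstat /var/lib/cassandra/data/views/views-*/*-Data.db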
All the best,

Sebastián Estévez
Solutions Architect | DataStax | sebastian.este...@datastax.com

On Tue, Nov 17, 2015 at 1:33 PM, Antoine Bonavita <anto...@stickyads.tv> wrote:

> Hello,
>
> As I have not heard from anybody on the list, I guess I did not provide
> the right kind of information or did not ask the right question.
>
> The things I forgot to mention in my previous email:
> * I checked the logs and didn't notice anything out of the ordinary.
>   Memtable flushes occur every few minutes.
> * Compaction has been set to allow only one compaction at a time.
>   Compaction throughput is the default.
>
> My question is really: where should I look to investigate deeper?
> I did a lot of reading and watching DataStax videos over the past week
> and I don't understand what could explain this behavior.
>
> Or maybe my expectations are too high. But I was under the impression
> that this kind of workload (heavy writes) is the sweet spot for
> Cassandra and that a node should be able to sustain 10K writes per
> second without breaking a sweat.
>
> Any help is appreciated, as is any direction on what I should do to
> get help.
>
> Thanks,
>
> Antoine.
>
>
> On 11/16/2015 10:04 AM, Antoine Bonavita wrote:
>
>> Hello,
>>
>> We have a performance problem when trying to ramp up Cassandra (as a
>> Mongo replacement) on a very specific use case. We store a blob
>> indexed by a key and expire it after a few days:
>>
>> CREATE TABLE views.views (
>>     viewkey text PRIMARY KEY,
>>     value blob
>> ) WITH bloom_filter_fp_chance = 0.01
>>     AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
>>     AND comment = ''
>>     AND compaction = {'max_sstable_age_days': '10',
>>         'class': 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy'}
>>     AND compression = {'sstable_compression':
>>         'org.apache.cassandra.io.compress.LZ4Compressor'}
>>     AND dclocal_read_repair_chance = 0.1
>>     AND default_time_to_live = 432000
>>     AND gc_grace_seconds = 172800
>>     AND max_index_interval = 2048
>>     AND memtable_flush_period_in_ms = 0
>>     AND min_index_interval = 128
>>     AND read_repair_chance = 0.0
>>     AND speculative_retry = '99.0PERCENTILE';
>>
>> Our workload is mostly writes (approx. 96 writes for 4 reads). Each
>> value is about 3 kB. Reads are mostly for "fresh" data (i.e. data
>> that was written recently).
>>
>> I have a 4-node cluster with spinning disks and a replication factor
>> of 3. For some historical reason 2 of the machines have 32G of RAM
>> and the other 2 have 64G.
>>
>> This is for the context.
>>
>> Now, when I use this cluster at about 600 writes per second per node
>> everything is fine, but when I try to ramp it up (1200 writes per
>> second per node) the read latencies stay fine on the 64G machines but
>> start going crazy on the 32G machines. Looking at disk IOPS, this is
>> clearly related:
>> * On the 32G machines, read IOPS go from 200 to 1400.
>> * On the 64G machines, read IOPS go from 10 to 20.
>>
>> So I thought this was related to the memtable being flushed "too
>> early" on the 32G machines. I increased memtable_heap_space_in_mb to
>> 4G on the 32G machines but it did not change anything.
>>
>> At this point I'm kind of lost and could use any help in understanding
>> why I'm generating so many read IOPS on the 32G machines compared to
>> the 64G ones, and why it goes crazy (x7) when I merely double the
>> load.
>>
>> Thanks,
>>
>> A.
>>
>
> --
> Antoine Bonavita (anto...@stickyads.tv) - CTO StickyADS.tv
> Tel: +33 6 34 33 47 36 / +33 9 50 68 21 32
> NEW YORK | LONDON | HAMBURG | PARIS | MONTPELLIER | MILAN | MADRID
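On the "where should I look to investigate deeper" question: besides pcstat, it may be worth comparing what Cassandra itself reports on a 32G node vs. a 64G node while you ramp up. A rough sketch (these are the stock nodetool commands from the 2.x series; newer versions call them tablehistograms/tablestats, and the keyspace/table names are taken from your schema above):

    # read latency percentiles and "SSTables per read" for the table
    nodetool cfhistograms views views

    # SSTable count, bloom filter false positives, and other table-level stats
    nodetool cfstats views.views

If the 32G nodes show noticeably more SSTables touched per read at 1200 writes/s, that would line up with the page-cache theory.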