300 GB is a lot of data for cloud machines (especially given their
generally weaker performance). If you are unhappy with performance,
why not scale the cluster out to more servers? With that much data you
are usually contending with the physics of spinning disks. Three nodes
with replication factor 3 also means all of your data is replicated to
every node, so you're not getting any scale-out benefit.
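
A rough sketch of how I would sanity-check that before adding nodes
(this assumes the data sits on the default ephemeral spinning disks of
the m1.xlarge; adjust for your setup):

  # Run on each node: sustained high await/%util on the data volume
  # means reads are disk-bound rather than CPU- or GC-bound.
  iostat -mx 5 3
  # A long backlog of pending compactions also points at I/O.
  nodetool compactionstats

  # Back-of-the-envelope: with RF=3 the cluster holds ~3 x 300 GB in
  # total, so at N nodes each node carries roughly 900/N GB, e.g.
  # ~150 GB per node once you grow to 6 nodes.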

On Fri, Feb 8, 2013 at 8:03 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
> Hi,
>
> I am seeing high latencies (the OpsCenter homepage shows an average of
> about 30-60 ms), which destabilizes my front servers: queries stack up
> waiting for C* to answer. This is the 1.1.6 C* cluster:
>
> 10.208.45.173   eu-west     1b          Up     Normal  297.02 GB
> 100.00%             0
> 10.208.40.6      eu-west     1b          Up     Normal  292.91 GB
> 100.00%             56713727820156407428984779325531226112
> 10.208.47.135   eu-west     1b          Up     Normal  307.96 GB
> 100.00%             113427455640312814857969558651062452224
>
> I run 3 AWS m1.xlarge nodes with mostly the default Datastax AMI node
> configuration, except for MAX_HEAP_SIZE="8G" and HEAP_NEWSIZE="400M";
> I was under regular memory pressure with the default 4 GB heap, maybe
> because of bloom filters. RF = 3, CL = QUORUM for both reads and writes.
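>
> For reference, a minimal sketch of that change as it appears in
> cassandra-env.sh (the exact path on the Datastax AMI may differ):
>
>   MAX_HEAP_SIZE="8G"
>   HEAP_NEWSIZE="400M"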
>
> The load is high, ranging from 4 to 15 with an average of 8, mainly
> because of iowait, which can reach 40-60%.
>
> Extract from "iostat -mx 5 10":
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>           16.66    0.00    4.82   35.47    0.21   42.85
>
>
> I use compression and the Size Tiered Compaction Strategy for all of my CFs.
>
> A typical CF:
>
> create column family active_product
>   with column_type = 'Standard'
>   and comparator = 'UTF8Type'
>   and default_validation_class = 'UTF8Type'
>   and key_validation_class = 'UTF8Type'
>   and read_repair_chance = 0.1
>   and dclocal_read_repair_chance = 0.0
>   and gc_grace = 864000
>   and min_compaction_threshold = 4
>   and max_compaction_threshold = 12
>   and replicate_on_write = true
>   and compaction_strategy =
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>   and caching = 'KEYS_ONLY'
>   and bloom_filter_fp_chance = 0.01
>   and compression_options = {'sstable_compression' :
> 'org.apache.cassandra.io.compress.SnappyCompressor'};
>
> And a typical counter CF:
>
> create column family algo_product_view
>   with column_type = 'Standard'
>   and comparator = 'UTF8Type'
>   and default_validation_class = 'CounterColumnType'
>   and key_validation_class = 'UTF8Type'
>   and read_repair_chance = 0.1
>   and dclocal_read_repair_chance = 0.0
>   and gc_grace = 864000
>   and min_compaction_threshold = 4
>   and max_compaction_threshold = 12
>   and replicate_on_write = true
>   and compaction_strategy =
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>   and caching = 'KEYS_ONLY'
>   and bloom_filter_fp_chance = 0.01
>   and compression_options = {'sstable_compression' :
> 'org.apache.cassandra.io.compress.SnappyCompressor'};
>
> I attach my cfhistograms, proxyhistograms, cfstats and tpstats, hoping
> a clue is somewhere in there, even though I was unable to spot anything
> in them myself.
>
> cfstats: http://pastebin.com/z3sAshjP
> tpstats: http://pastebin.com/LETPqfLV
> proxyhistograms: http://pastebin.com/FqwMFrxG
> cfhistograms (from the 2 most-read / highest-latency CFs):
> http://pastebin.com/BCsdc50z & http://pastebin.com/CGZZpydL
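>
> For completeness, these were gathered with the standard nodetool
> commands (keyspace/CF names are placeholders):
>
>   nodetool cfstats
>   nodetool tpstats
>   nodetool proxyhistograms
>   nodetool cfhistograms <keyspace> <column_family>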
>
> These latencies are quite annoying; I hope you can help me figure out
> what I am doing wrong or how I can tune Cassandra better.
>
> Alain
