Pardon the long delay - went on holiday and got sidetracked before I
could return to this project.
@Joaquin - The DataStax AMI uses a RAID0 configuration on an instance
store's ephemeral drives.
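(For reference, a minimal sketch of such a RAID0 across the four
m1.xlarge ephemeral volumes - the device names and filesystem here are
assumptions and the AMI's actual setup may differ; /raid0 matches the
data partition mentioned below:)

mdadm --create /dev/md0 --level=0 --raid-devices=4 \
      /dev/xvdb /dev/xvdc /dev/xvdd /dev/xvde   # stripe the ephemeral drives
mkfs.xfs /dev/md0                               # format the array
mkdir -p /raid0 && mount /dev/md0 /raid0        # mount it for the data directory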
@Jonathan - you were correct about the client node being the
bottleneck. I set up 3 XL client instances to run contrib/stress
against the 4-node XL Cassandra cluster and incrementally raised the
number of threads on the clients until I started seeing timeouts.
I set the following memory settings for the client JVMs: -Xms2G -Xmx10G
On the Cassandra nodes I raised the AMI's default MAX_HEAP setting to
12GB (~80% of available memory) and otherwise used the default AMI
cassandra.yaml settings until timeouts started appearing (at 200
threads per client; 600 total threads), then raised concurrent_writes
to 300 based on a (perhaps arbitrary?) recommendation in 'Cassandra:
The Definitive Guide' to scale that setting with the number of client
threads. The client nodes were in the same AZ as the Cassandra nodes,
and I set the --keep-going option on the clients for every other run
>= 200 threads.
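In config form, those settings work out to roughly the following
(a sketch; file locations assume the DataStax AMI layout):

# flags for the client JVMs running contrib/stress:
#   -Xms2G -Xmx10G

# conf/cassandra-env.sh on the Cassandra nodes:
MAX_HEAP_SIZE="12G"

# conf/cassandra.yaml on the Cassandra nodes (only once timeouts appeared):
concurrent_writes: 300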
Results
+--------+--------+---------+----------+---------+---------+----------+
| Server | Client | --keep- | Columns  | Client  | Total   | Combined |
| Nodes  | Nodes  | going   |          | Threads | Threads | Rate     |
+========+========+=========+==========+=========+=========+==========+
|      4 |      3 | N       | 10000000 |      25 |      75 |    13771 |
+--------+--------+---------+----------+---------+---------+----------+
|      4 |      3 | N       | 10000000 |      50 |     150 |    16853 |
+--------+--------+---------+----------+---------+---------+----------+
|      4 |      3 | N       | 10000000 |      75 |     225 |    18511 |
+--------+--------+---------+----------+---------+---------+----------+
|      4 |      3 | N       | 10000000 |     150 |     450 |    20013 |
+--------+--------+---------+----------+---------+---------+----------+
|      4 |      3 | N       |  7574241 |     200 |     600 |    22935 |
+--------+--------+---------+----------+---------+---------+----------+
|      4 |      3 | Y       | 10000000 |     200 |     600 |    19737 |
+--------+--------+---------+----------+---------+---------+----------+
|      4 |      3 | N       |  9843677 |     250 |     750 |    20869 |
+--------+--------+---------+----------+---------+---------+----------+
|      4 |      3 | Y       | 10000000 |     250 |     750 |    21217 |
+--------+--------+---------+----------+---------+---------+----------+
|      4 |      3 | N       |  5015711 |     300 |     900 |    24177 |
+--------+--------+---------+----------+---------+---------+----------+
|      4 |      3 | Y       | 10000000 |     300 |     900 |   206134 |
+--------+--------+---------+----------+---------+---------+----------+

(Combined Rate is the aggregate writes/s across the three clients;
runs without --keep-going stopped short of the 10,000,000 requested
inserts once timeouts appeared.)
Other Observations
* `vmstat` showed no swapping during runs
* `iostat -x` always showed zeros for avgqu-sz, await, and %util on
the /raid0 (data) partition; 0-150, 0-334ms, and 0-60% respectively
for the / (commitlog) partition
* %steal from iostat ranged from 8-26% every run (one node had an almost
constant 26% while the others averaged closer to 10%)
* `nodetool tpstats` never showed more than tens of Pending ops in
RequestResponseStage and no more than 1-2K Pending ops in
MutationStage. Usually a single node would register ops while the
others showed zeros
* After all test runs, Memtable Switch Count was 1385 for
Keyspace1.Standard1
* Load average on the Cassandra nodes was very high the entire time,
especially for tests where each client ran > 100 threads. Here's one
sample @ 200 threads each (600 total):
[i-94e8d2fb] alex@cassandra-qa-1:~$ uptime
17:18:26 up 1 day, 19:04, 2 users, load average: 20.18, 15.20, 12.87
[i-a0e5dfcf] alex@cassandra-qa-2:~$ uptime
17:18:26 up 1 day, 18:52, 2 users, load average: 22.65, 25.60, 21.71
[i-92dde7fd] alex@cassandra-qa-3:~$ uptime
17:18:26 up 1 day, 18:44, 2 users, load average: 24.19, 28.29, 20.17
[i-08caf067] alex@cassandra-qa-4:~$ uptime
17:18:26 up 1 day, 18:37, 2 users, load average: 31.74, 20.99, 13.97
* Average resource utilization on the client nodes was 10-80% CPU and
5-25% memory, depending on the number of threads. Load average was
always negligible (presumably because there was no I/O)
* After a few runs and truncate operations on Keyspace1.Standard1, the
ring became unbalanced before runs:
[i-94e8d2fb] alex@cassandra-qa-1:~$ nodetool -h localhost ring
Address         Status State   Load       Owns    Token
                                                  127605887595351923798765477786913079296
10.240.114.143  Up     Normal  2.1 GB     25.00%  0
10.210.154.63   Up     Normal  330.19 MB  25.00%  42535295865117307932921825928971026432
10.110.63.247   Up     Normal  361.38 MB  25.00%  85070591730234615865843651857942052864
10.46.143.223   Up     Normal  1.6 GB     25.00%  127605887595351923798765477786913079296
and after runs:
[i-94e8d2fb] alex@cassandra-qa-1:~$ nodetool -h localhost ring
Address         Status State   Load       Owns    Token
                                                  127605887595351923798765477786913079296
10.240.114.143  Up     Normal  3.9 GB     25.00%  0
10.210.154.63   Up     Normal  2.05 GB    25.00%  42535295865117307932921825928971026432
10.110.63.247   Up     Normal  2.07 GB    25.00%  85070591730234615865843651857942052864
10.46.143.223   Up     Normal  3.33 GB    25.00%  127605887595351923798765477786913079296
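(Roughly the commands behind the observations above; Memtable Switch
Count comes from `nodetool cfstats`, and the 5-second sampling
intervals are arbitrary:)

vmstat 5                         # si/so columns show swapping
iostat -x 5                      # avgqu-sz, await, %util, %steal per device
uptime                           # load averages
nodetool -h localhost tpstats    # Pending ops per stage (MutationStage, etc.)
nodetool -h localhost cfstats    # per-CF stats, incl. Memtable Switch Count
nodetool -h localhost ring       # per-node Load and token ownership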
Based on the above, would I be correct in assuming that frequent
memtable flushes and/or commitlog I/O are the likely bottlenecks?
Could %steal be partially contributing to the low throughput numbers
as well? If a single XL node can do ~12k writes/s, would it be
reasonable to expect ~40k writes/s with the above workload and number
of nodes?
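(The ~40k figure is just naive linear scaling, assuming replication
factor and consistency level match the single-node case:)

echo $((12000 * 4))   # 48000 writes/s ceiling; ~40k leaves headroom for overhead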
Thanks for your help, Alex.
On 4/25/11 11:23 AM, Joaquin Casares wrote:
Did the images have EBS storage or Instance Store storage?
Typically EBS volumes aren't the best to be benchmarking against:
http://www.mail-archive.com/user@cassandra.apache.org/msg11022.html
Joaquin Casares
DataStax
Software Engineer/Support
On Wed, Apr 20, 2011 at 5:12 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
A few months ago I was seeing 12k writes/s on a single EC2 XL. So
something is wrong.
My first suspicion is that your client node may be the bottleneck.
On Wed, Apr 20, 2011 at 2:56 PM, Alex Araujo
<cassandra-us...@alex.otherinbox.com> wrote:
> Does anyone have any Ec2 benchmarks/experiences they can share?
I am trying
> to get a sense for what to expect from a production cluster on
Ec2 so that I
> can compare my application's performance against a sane
baseline. What I
> have done so far is:
>
> 1. Launched a 4 node cluster of m1.xlarge instances in the same
availability
> zone using PyStratus
(https://github.com/digitalreasoning/PyStratus). Each
> node has the following specs (according to Amazon):
> 15 GB memory
> 8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each)
> 1,690 GB instance storage
> 64-bit platform
>
> 2. Changed the default PyStratus directories in order to have
commit logs on
> the root partition and data files on ephemeral storage:
> commitlog_directory: /var/cassandra-logs
> data_file_directories: [/mnt/cassandra-data]
>
> 3. Gave each node 10GB of MAX_HEAP; 1GB HEAP_NEWSIZE in
> conf/cassandra-env.sh
>
> 4. Ran `contrib/stress/bin/stress -d node1,..,node4 -n 10000000
-t 100` on a
> separate m1.large instance:
> total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
> ...
> 9832712,7120,7120,0.004948514851485148,842
> 9907616,7490,7490,0.0043189949802413755,852
> 9978357,7074,7074,0.004560353967289125,863
> 10000000,2164,2164,0.004065933558194335,867
>
> 5. Truncated Keyspace1.Standard1:
> # /usr/local/apache-cassandra/bin/cassandra-cli -host localhost
-port 9160
> Connected to: "Test Cluster" on x.x.x.x/9160
> Welcome to cassandra CLI.
>
> Type 'help;' or '?' for help. Type 'quit;' or 'exit;' to quit.
> [default@unknown] use Keyspace1;
> Authenticated to keyspace: Keyspace1
> [default@Keyspace1] truncate Standard1;
> null
>
> 6. Expanded the cluster to 8 nodes using PyStratus and sanity
checked using
> nodetool:
> # /usr/local/apache-cassandra/bin/nodetool -h localhost ring
> Address Status State Load Owns
> Token
> x.x.x.x Up Normal 1.3 GB 12.50%
> 21267647932558653966460912964485513216
> x.x.x.x Up Normal 3.06 GB 12.50%
> 42535295865117307932921825928971026432
> x.x.x.x Up Normal 1.16 GB 12.50%
> 63802943797675961899382738893456539648
> x.x.x.x Up Normal 2.43 GB 12.50%
> 85070591730234615865843651857942052864
> x.x.x.x Up Normal 1.22 GB 12.50%
> 106338239662793269832304564822427566080
> x.x.x.x Up Normal 2.74 GB 12.50%
> 127605887595351923798765477786913079296
> x.x.x.x Up Normal 1.22 GB 12.50%
> 148873535527910577765226390751398592512
> x.x.x.x Up Normal 2.57 GB 12.50%
> 170141183460469231731687303715884105728
>
> 7. Ran `contrib/stress/bin/stress -d node1,..,node8 -n 10000000
-t 100` on a
> separate m1.large instance again:
> total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
> ...
> 9880360,9649,9649,0.003210443956226165,720
> 9942718,6235,6235,0.003206934154398794,731
> 9997035,5431,5431,0.0032615939761032457,741
> 10000000,296,296,0.002660033726812816,742
>
> In a nutshell, 4 nodes inserted at 11,534 writes/sec and 8 nodes
inserted at
> 13,477 writes/sec.
>
> Those numbers seem a little low to me, but I don't have anything
to compare
> to. I'd like to hear others' opinions before I spin my wheels
with the
> number of nodes, threads, memtable, memory, and/or GC
settings. Cheers,
> Alex.
>
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com