Yes, RF=1, as in
https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-stress-yaml,
where you can also find my stress schema. In any case, with such a low
throughput I think I should move to another type of AWS instance before
repeating the test and investigating tuning further.
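To make the arithmetic explicit, here is a rough back-of-the-envelope sketch
in Python; the write-amplification factor is only a guess, while the other
numbers come from my stress run and the m4.xlarge specs you quoted:

# Rough back-of-the-envelope: client write rate vs. the m4.xlarge EBS budget.
# The amplification factor is a pure guess; the rest comes from the thread.

ops_per_sec = 783          # cassandra-stress result at threadCount=4
chunk_bytes = 100_000      # ~100K written per insert
rf = 1                     # replication factor used in stress.yaml

client_mb_s = ops_per_sec * chunk_bytes / 1e6
print(f"client write rate: {client_mb_s:.1f} MB/s")        # ~78 MB/s

ebs_budget_mb_s = 750 / 8  # m4.xlarge "Dedicated EBS Throughput" is 750 Mbps
print(f"EBS budget per node: {ebs_budget_mb_s:.2f} MB/s")  # 93.75 MB/s

# Every write hits the commitlog, is flushed to an sstable and is rewritten by
# compaction, and with RF > 1 it would also land on RF replicas, so the bytes
# actually reaching EBS are higher than the client rate.
assumed_amplification = 1.2  # guess, just to show how little headroom is left
print(f"estimated disk writes: {client_mb_s * rf * assumed_amplification:.1f} MB/s")

With those (guessed) factors the disk write rate lands right at the 93.75
MB/s budget, which matches the ~89 MB/s max write I see in dstat.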
Thanks,
Giampaolo

2016-03-25 17:11 GMT+01:00 Jack Krupansky <jack.krupan...@gmail.com>:

> Your IOPS and throughput seem to be below the AWS limits, but... I wonder
> if replication is doubling those numbers and then a little write
> amplification may then bump you into the AWS limits.
>
> See:
> http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-io-characteristics.html
>
> What RF are you using? RF=2?
>
> What does your schema look like... are you splitting your multi-GB blobs
> into individual 100K rows?
>
> Oh, wait... the AWS instance type page says that m4.xlarge has a Dedicated
> EBS Throughput (Mbps) rate of only 750. I presume "b" means bit, not byte,
> so that's really 750 / 8 = 93.75 MB/sec, which is fairly close to your
> numbers, so just a little write amplification or spiking or fuzzy math on
> the AWS end might trigger some AWS throttling.
>
> -- Jack Krupansky
>
> On Fri, Mar 25, 2016 at 11:42 AM, Giampaolo Trapasso <
> giampaolo.trapa...@radicalbit.io> wrote:
>
>> Hi all,
>>
>> I want to understand Cassandra 2.2.5 tuning better for my app (and C*
>> tuning in general). In the app I'm developing, the typical scenario is
>> the upload of a large binary file (on the order of GBs) to the cluster.
>> Long story short, after many failed attempts to reach a specific upload
>> throughput with my custom code, I decided to start from the beginning,
>> using the stress tool on an almost default-configured cluster.
>>
>> The cluster is, of course, a development one and is made up of two EC2
>> nodes that are m4.xlarge (EBS optimized). The point of this "exercise"
>> is to understand whether and by how much these two instances are
>> underused, what the most important tunings are, and, of course, to gain
>> enough experience to proceed toward production. I do not want to
>> increase the cluster size unless there is no room for improvement with
>> the current setup.
>>
>> Here is the stress.yaml
>> <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-stress-yaml>
>> and the configuration of one of the two nodes I'm using (cassandra-env.sh
>> <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-cassandra-env-sh>,
>> cassandra.yaml
>> <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-cassandra-yaml>).
>> The binarydata table mimics the way I intend to upload data, that is, as
>> chunks of ~100K bytes at a time. This chunk size is a setting of the
>> application I want to develop, but for the moment I want to keep it
>> fixed.
>>
>> I've read the most comprehensive tuning guide for 2.2, which is
>> https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html.
>> It's quite a big topic, and surely I've only scratched the surface.
>>
>> I ran the stress tool; here are my basic findings.
>>
>> Looking at the stress results
>> <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-cassandra-stress-output>:
>> 1. Since I write 100K per insert, running with threadCount=4 at 783
>> ops/s tells me that my write rate is ~78 MB/s.
>> 2. After 32826 inserts, I begin to get write timeouts (I set 5 seconds
>> in the configuration).
>>
>> Looking at dstat
>> <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-dstat-output>:
>> 3. I have some blocked processes. How can I spot the Java thread that is
>> currently blocked?
>> The nodetool tpstats output
>> <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-tpstats-output>
>> at the end of the stress test shows the blocked column at 0 for all
>> thread pools; is there some tool, besides OpsCenter, to check the
>> situation while or after the test is running?
>> 4. From the --dsk/total-- column I see <16 MB in reads (I suppose the
>> compactor running) and 89 MB of max writes. I do not understand whether
>> this is the limit of the node itself (it cannot push more data to the
>> disk) or of the disk itself. According to OpsCenter, this seems to be
>> the limit of the machine itself.
>> 5. There is a lot of iowait in ----total-cpu-usage----. Al Tobey says:
>> "Any blocked processes is considered bad and you should immediately look
>> at the iowait %. 1-2% iowait isn't necessarily a problem, but it usually
>> points at storage as a bottleneck."
>>
>> I also took a snapshot of OpsCenter right after the stress test; you can
>> find it here: http://i.imgur.com/3dSlUCE.png and
>> http://i.imgur.com/QWBpxTn.png
>>
>> I'm asking what settings I could try to change to improve write
>> throughput and remove or limit blocking/iowait. My first guess is to
>> increase compaction_throughput_mb_per_sec; do you think I am on the
>> right track? Do you have any other suggestions?
>>
>> Thank you in advance for any kind of feedback,
>> Giampaolo
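PS: for completeness, a minimal sketch of the kind of chunked upload the
binarydata table is meant to support, using the DataStax Python driver. The
keyspace, contact point and column names below are placeholders I made up
for illustration; the real schema is in the stress.yaml linked above.

# Minimal sketch of a chunked blob upload against a table like the one in
# stress.yaml. Keyspace, contact point and column names are placeholders.
from uuid import uuid4
from cassandra.cluster import Cluster

CHUNK_SIZE = 100_000  # ~100K per insert, matching the stress profile

cluster = Cluster(["10.0.0.1"])          # placeholder contact point
session = cluster.connect("stresscql")   # placeholder keyspace

insert = session.prepare(
    "INSERT INTO binarydata (file_id, chunk_id, data) VALUES (?, ?, ?)"
)

def upload(path):
    """Split a large file into ~100K chunks and insert one row per chunk."""
    file_id = uuid4()
    with open(path, "rb") as f:
        chunk_id = 0
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            session.execute(insert, (file_id, chunk_id, chunk))
            chunk_id += 1
    return file_id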