Your IOPS and throughput seem to be below the AWS limits, but I wonder
whether replication is doubling those numbers, so that a little write
amplification on top of that bumps you into the AWS limits.

See:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-io-characteristics.html

What RF are you using? RF=2?

What does your schema look like... are you splitting your multi-GB blobs
into individual 100K rows?
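
To make the schema question concrete, here is a minimal sketch of the kind
of chunking I have in mind. It is only an illustration: split_blob and
CHUNK_SIZE are hypothetical names, not taken from your code, and I am
assuming "100K" means 100,000 bytes. Each (chunk_index, data) pair would
become one row for the same file.

```python
from typing import Iterator, Tuple

# Assumption: "~100K" is taken to mean 100,000 bytes per row.
CHUNK_SIZE = 100_000


def split_blob(blob: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (chunk_index, chunk_bytes) pairs covering the whole blob.

    Every chunk is chunk_size bytes long except possibly the last one.
    """
    for offset in range(0, len(blob), chunk_size):
        yield offset // chunk_size, blob[offset:offset + chunk_size]


# Example: a 250,000-byte payload of zero bytes splits into three rows.
rows = list(split_blob(bytes(250_000)))
for index, data in rows:
    print(index, len(data))
```

If the rows are written this way, the row size rather than the total file
size is what each individual insert has to carry.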

Oh, wait... the AWS instance type page says that m4.xlarge has a Dedicated
EBS Throughput (Mbps) rate of only 750. I presume "b" means bit, not byte,
so that's really 750 / 8 = 93.75 MB/sec, which is fairly close to your
numbers (~78 MB/s from stress, 89 MB/s peak write in dstat), so just a
little write amplification, spiking, or fuzzy math on the AWS end might
trigger some AWS throttling.
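
Here is that arithmetic as a small sketch, in case you want to plug in
other instance types. The input numbers come from this thread (750 Mbps
from the instance type page, 783 ops/s and 89 MB/s from your stress and
dstat output); the variable names and the choice of 100,000 bytes for
"100K" are my assumptions.

```python
def mbps_to_mb_per_sec(megabits_per_sec: float) -> float:
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return megabits_per_sec / 8.0


EBS_LIMIT_MBPS = 750    # m4.xlarge dedicated EBS throughput, as quoted above
OPS_PER_SEC = 783       # insert rate reported by the stress run
ROW_BYTES = 100_000     # assumption: "100K" per insert means 100,000 bytes
DSTAT_PEAK_MB = 89      # peak disk write in MB/s reported by dstat

ebs_limit_mb = mbps_to_mb_per_sec(EBS_LIMIT_MBPS)
client_rate_mb = OPS_PER_SEC * ROW_BYTES / 1_000_000

print(f"EBS limit:          {ebs_limit_mb:.2f} MB/s")
print(f"client write rate:  {client_rate_mb:.1f} MB/s")
print(f"limit / client:     {ebs_limit_mb / client_rate_mb:.2f}x")
print(f"limit / dstat peak: {ebs_limit_mb / DSTAT_PEAK_MB:.2f}x")
```

The ratios show how little headroom is left: anything that multiplies the
bytes actually written to disk by more than that factor would reach the
limit.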



-- Jack Krupansky

On Fri, Mar 25, 2016 at 11:42 AM, Giampaolo Trapasso <
giampaolo.trapa...@radicalbit.io> wrote:

> Hi to all,
>
> I want to better understand Cassandra 2.2.5 tuning for my app (and C*
> tuning in general). In the app I'm developing, the typical scenario is the
> upload to the cluster of a large binary file (on the order of GBs). Long
> story short, after many failed attempts to reach a specific upload
> throughput with my custom code, I've decided to start from the beginning,
> using the stress tool on an almost default-configured cluster.
>
> The cluster is, of course, a development one and is made up of two EC2
> nodes that are m4.xlarge (EBS optimized). The point of this "exercise" is
> to understand whether, and by how much, these two instances are underused,
> which tunings are the most important, and of course to gain enough
> experience to proceed toward production. I do not want to increase cluster
> size unless there's no room for improvement with the current setup.
>
> Here is the stress.yaml
> <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-stress-yaml>
> and the configuration of one of the two nodes I'm using (cassandra-env.sh
> <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-cassandra-env-sh>,
> cassandra.yaml
> <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-cassandra-yaml>).
> The binarydata table mimics the way I intend to upload data, that is, as
> chunks of ~100K bytes at a time. This number is a setting of the
> application I want to develop, but for the moment I want to keep it fixed.
>
> I've read the most comprehensive guide on tuning for 2.2, which is
> https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html.
> It's quite a big topic and surely I have only scratched the surface.
>
>
> I ran the stress tool, and here is my basic understanding:
> Looking at stress results
> <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-cassandra-stress-output>
> 1. Since I write 100K at each insert, running with a threadCount of 4 at
> 783 ops/s tells me that my write rate is ~78MB/s
> 2. After 32826 inserts, I begin to have write timeouts (I set a 5-second
> timeout in the configuration).
> Looking at dstat
> <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-dstat-output>
> 3. I have some blocked processes. How can I spot the Java thread that is
> currently blocked? The nodetool tpstats output
> <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-tpstats-output>
> at the end of the stress test has the blocked column at 0 for all thread
> pools. Is there some tool to check the situation while/after the test is
> running, besides OpsCenter?
> 4. From the --dsk/total-- column I see <16MB in read (I suppose the
> compactor running) and 89MB of max write. I do not understand whether this
> is the limit of the node itself (it cannot push more data than that to
> disk) or of the disk itself. According to OpsCenter, this seems to be the
> limit of the machine itself.
> 5. There is a lot of iowait in ----total-cpu-usage----. Al Tobey says:
> "Any blocked processes is considered bad and you should immediately look at
> the iowait %. 1-2% iowait isn't necessarily a problem, but it usually
> points at storage as a bottleneck."
>
> I also took a snapshot of OpsCenter right after the stress test; you
> can find it here: http://i.imgur.com/3dSlUCE.png and
> http://i.imgur.com/QWBpxTn.png
>
> I'm asking which settings I could try changing to improve write
> throughput and to remove or limit blocking/iowait. My first guess is to
> increase compaction_throughput_mb_per_sec; do you think I am on the right
> track? Do you have any other suggestions?
>
> Thank you in advance for any kind of feedback
> Giampaolo
>
