Yes, RF=1, as in
https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-stress-yaml,
where you can also find my stress schema. In any case, with such a low
throughput I think I should move to another type of AWS instance before
repeating the test and investigating tuning further.
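To make the arithmetic explicit, here is a rough back-of-the-envelope sketch
in Python; the write-amplification factor is only a guess, while the other
numbers come from my stress run and the m4.xlarge specs you quoted:

# Rough back-of-the-envelope: client write rate vs. the m4.xlarge EBS budget.
# The amplification factor is a pure guess; the rest comes from the thread.

ops_per_sec = 783          # cassandra-stress result at threadCount=4
chunk_bytes = 100_000      # ~100K written per insert
rf = 1                     # replication factor used in stress.yaml

client_mb_s = ops_per_sec * chunk_bytes / 1e6
print(f"client write rate: {client_mb_s:.1f} MB/s")        # ~78 MB/s

ebs_budget_mb_s = 750 / 8  # m4.xlarge "Dedicated EBS Throughput" is 750 Mbps
print(f"EBS budget per node: {ebs_budget_mb_s:.2f} MB/s")  # 93.75 MB/s

# Every write hits the commitlog, is flushed to an sstable and is rewritten by
# compaction, and with RF > 1 it would also land on RF replicas, so the bytes
# actually reaching EBS are higher than the client rate.
assumed_amplification = 1.2  # guess, just to show how little headroom is left
print(f"estimated disk writes: {client_mb_s * rf * assumed_amplification:.1f} MB/s")

With those (guessed) factors the disk write rate lands right at the 93.75
MB/s budget, which matches the ~89 MB/s max write I see in dstat.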
Thanks,
Giampaolo

2016-03-25 17:11 GMT+01:00 Jack Krupansky <jack.krupan...@gmail.com>:

> Your IOPS and throughput seem to be below the AWS limits, but... I wonder
> if replication is doubling those numbers and then a little write
> amplification may then bump you into the AWS limits.
>
> See:
> http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-io-characteristics.html
>
> What RF are you using? RF=2?
>
> What does your schema look like... are you splitting your multi-GB blobs
> into individual 100K rows?
>
> Oh, wait... the AWS instance type page says that m4.xlarge has a Dedicated
> EBS Throughput (Mbps) rate of only 750. I presume "b" means bit, not byte,
> so that's really 750 / 8 = 93.75 MB/sec, which is fairly close to your
> numbers, so just a little write amplification or spiking or fuzzy math on
> the AWS end might trigger some AWS throttling.
>
> -- Jack Krupansky
>
> On Fri, Mar 25, 2016 at 11:42 AM, Giampaolo Trapasso <
> giampaolo.trapa...@radicalbit.io> wrote:
>
>> Hi all,
>>
>> I want to understand Cassandra 2.2.5 tuning better for my app (and C*
>> tuning in general). In the app I'm developing, the typical scenario is
>> the upload of a large binary file (on the order of GBs) to the cluster.
>> Long story short, after many failed attempts to reach a specific upload
>> throughput with my custom code, I decided to start from the beginning,
>> using the stress tool on an almost default-configured cluster.
>>
>> The cluster is, of course, a development one and is made up of two EC2
>> nodes that are m4.xlarge (EBS optimized). The point of this "exercise"
>> is to understand whether and by how much these two instances are
>> underused, what the most important tunings are, and, of course, to gain
>> enough experience to proceed toward production. I do not want to
>> increase the cluster size unless there is no room for improvement with
>> the current setup.
>>
>> Here is the stress.yaml
>> <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-stress-yaml>
>> and the configuration of one of the two nodes I'm using (cassandra-env.sh
>> <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-cassandra-env-sh>,
>> cassandra.yaml
>> <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-cassandra-yaml>).
>> The binarydata table mimics the way I intend to upload data, that is, as
>> chunks of ~100K bytes at a time. This chunk size is a setting of the
>> application I want to develop, but for the moment I want to keep it
>> fixed.
>>
>> I've read the most comprehensive tuning guide for 2.2, which is
>> https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html.
>> It's quite a big topic, and surely I've only scratched the surface.
>>
>> I ran the stress tool; here are my basic findings.
>>
>> Looking at the stress results
>> <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-cassandra-stress-output>:
>> 1. Since I write 100K per insert, running with threadCount=4 at 783
>> ops/s tells me that my write rate is ~78 MB/s.
>> 2. After 32826 inserts, I begin to get write timeouts (I set 5 seconds
>> in the configuration).
>>
>> Looking at dstat
>> <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-dstat-output>:
>> 3. I have some blocked processes. How can I spot the Java thread that is
>> currently blocked?
>> The nodetool tpstats output
>> <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-tpstats-output>
>> at the end of the stress test shows the blocked column at 0 for all
>> thread pools; is there some tool, besides OpsCenter, to check the
>> situation while or after the test is running?
>> 4. From the --dsk/total-- column I see <16 MB in reads (I suppose the
>> compactor running) and 89 MB of max writes. I do not understand whether
>> this is the limit of the node itself (it cannot push more data to the
>> disk) or of the disk itself. According to OpsCenter, this seems to be
>> the limit of the machine itself.
>> 5. There is a lot of iowait in ----total-cpu-usage----. Al Tobey says:
>> "Any blocked processes is considered bad and you should immediately look
>> at the iowait %. 1-2% iowait isn't necessarily a problem, but it usually
>> points at storage as a bottleneck."
>>
>> I also took a snapshot of OpsCenter right after the stress test; you can
>> find it here: http://i.imgur.com/3dSlUCE.png and
>> http://i.imgur.com/QWBpxTn.png
>>
>> I'm asking what settings I could try to change to improve write
>> throughput and remove or limit blocking/iowait. My first guess is to
>> increase compaction_throughput_mb_per_sec; do you think I am on the
>> right track? Do you have any other suggestions?
>>
>> Thank you in advance for any kind of feedback,
>> Giampaolo
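PS: for completeness, a minimal sketch of the kind of chunked upload the
binarydata table is meant to support, using the DataStax Python driver. The
keyspace, contact point and column names below are placeholders I made up
for illustration; the real schema is in the stress.yaml linked above.

# Minimal sketch of a chunked blob upload against a table like the one in
# stress.yaml. Keyspace, contact point and column names are placeholders.
from uuid import uuid4
from cassandra.cluster import Cluster

CHUNK_SIZE = 100_000  # ~100K per insert, matching the stress profile

cluster = Cluster(["10.0.0.1"])          # placeholder contact point
session = cluster.connect("stresscql")   # placeholder keyspace

insert = session.prepare(
    "INSERT INTO binarydata (file_id, chunk_id, data) VALUES (?, ?, ?)"
)

def upload(path):
    """Split a large file into ~100K chunks and insert one row per chunk."""
    file_id = uuid4()
    with open(path, "rb") as f:
        chunk_id = 0
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            session.execute(insert, (file_id, chunk_id, chunk))
            chunk_id += 1
    return file_id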