Hi all, I want to better understand Cassandra 2.2.5 tuning for my app (and C* tuning in general). In the app I'm developing, the typical scenario is the upload to the cluster of a large binary file (on the order of GBs). Long story short, after many failed attempts to reach a specific upload throughput with my custom code, I've decided to start from the beginning using the stress tool on an almost default-configured cluster.
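For reference, the kind of invocation I'm running looks roughly like this (the node IP and the operation count below are placeholders; the real profile is the stress.yaml linked below):

    cassandra-stress user profile=stress.yaml "ops(insert=1)" n=50000 -node 10.0.0.1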
The cluster is, of course, a development one and is made up of two EC2 nodes, both m4.xlarge (EBS optimized). The point of this "exercise" is to understand whether and how much these two instances are underused, what the most important tunings are and, of course, to gain enough experience to proceed toward production. I do not want to increase the cluster size unless there's no room for improvement with the current setup.

Here are the stress.yaml <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-stress-yaml> and the configuration of one of the two nodes I'm using (cassandra-env.sh <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-cassandra-env-sh>, cassandra.yaml <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-cassandra-yaml>). The binarydata table mimics the way I intend to upload data, that is, as chunks of ~100 KB at a time. This chunk size is a setting of the application I want to develop, but for the moment I want to keep it fixed.

I've read the most comprehensive guide on tuning for this generation of Cassandra that I've found, https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html. It's quite a big topic and I have surely only scratched the surface. I ran the stress tool and here are my basic observations.

Looking at the stress results <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-cassandra-stress-output>:

1. Since I write 100 KB with each insert, running with a threadCount of 4 at 783 ops/s tells me that my write rate is ~78 MB/s (783 x ~100 KB).
2. After 32826 inserts, I begin to get write timeouts (I set 5 seconds in the configuration).

Looking at dstat <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-dstat-output>:

3. I have some blocked processes. How can I spot the Java thread that is currently blocked? The nodetool tpstats output <https://gist.github.com/giampaolotrapasso/9f0242fc60144ada458c#file-tpstats-output> at the end of the stress test has the blocked column at 0 for all thread pools; is there some tool to check the situation while/after the test is running, besides OpsCenter? (See the sketch at the end of this mail for what I have in mind.)
4. From the --dsk/total-- column I see <16 MB in reads (I suppose that's the compactor running) and a max of 89 MB written. I do not understand whether this is the limit of the node itself (it cannot feed more data than that to disk) or of the disk itself. According to OpsCenter, this seems to be the limit of the machine itself.
5. There is a lot of iowait in ----total-cpu-usage----. Al Tobey says: "Any blocked processes is considered bad and you should immediately look at the iowait %. 1-2% iowait isn't necessarily a problem, but it usually points at storage as a bottleneck."

I've also taken a snapshot of OpsCenter right after the stress test; you can find it here: http://i.imgur.com/3dSlUCE.png and http://i.imgur.com/QWBpxTn.png

I'm asking which settings I could try to change to improve write throughput and remove or limit blocking/iowait. My first guess is to increase compaction_throughput_mb_per_sec (see the example at the end of this mail); do you think I am on the right track? Do you have any other suggestions?

Thank you in advance for any kind of feedback,
Giampaolo
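P.S. The check I have in mind for point 3 is roughly the following; the pgrep pattern and the intervals are just guesses on my side:

    # find the Cassandra JVM pid (the pattern is approximate)
    pid=$(pgrep -f CassandraDaemon)

    # dump the thread stacks and keep the BLOCKED ones with some context
    jstack "$pid" | grep -A 5 "java.lang.Thread.State: BLOCKED"

    # keep an eye on the thread pools while the test runs
    watch -n 2 nodetool tpstats

    # per-device utilization, to see whether the EBS volume is saturated
    iostat -x 1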
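For compaction throughput, what I mean is simply raising it from the stock default of 16 MB/s, either live with nodetool or in cassandra.yaml (the value 64 is just a first guess):

    # change it on a running node
    nodetool setcompactionthroughput 64

    # or make it permanent in cassandra.yaml
    compaction_throughput_mb_per_sec: 64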