Hi Alain,

Thank you for your reply. I have run the same test but writing into a non-counter table equivalent to the counter one. The rate in this case is around 160k writes/second. I am not sure if I should be expecting much more from a 3-node cluster. In terms of I/O there is not much going on, and the system is not reaching its limit in any area, but I still need to look closer.
I will monitor the CPU, disk and network usage when I come back to the system to try to find the weakest link in the chain.

Regarding the configuration of the nodes with respect to counters, here are a few settings in case anyone can see something wrong:
+ counter_cache_size_in_mb: [LEFT EMPTY] => it will use 50MB in my case
+ counter_cache_save_period: 7200
+ concurrent_reads: 112
+ concurrent_writes: 128
+ concurrent_counter_writes: 112

I found the following using nodetool info. It looks like the counter cache is full, so I will try increasing it.

Gossip active : true
Thrift active : false
Native Transport active: true
Load : 69.02 GiB
Generation No : 1519044361
Uptime (seconds) : 987326
Heap Memory (MB) : 4653.43 / 8032.00
Off Heap Memory (MB) : 361.52
Data Center : dc1
Rack : rack1
Exceptions : 634
Key Cache : entries 1297379, size 100 MiB, capacity 100 MiB, 3123293769 hits, 5261224916 requests, 0.594 recent hit rate, 14400 save period in seconds
Row Cache : entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
Counter Cache : entries 436906, size 50 MiB, capacity 50 MiB, 478644319 hits, 7183133047 requests, 0.067 recent hit rate, 7200 save period in seconds
Chunk Cache : entries 7680, size 480 MiB, capacity 480 MiB, 4635329524 misses, 6673522516 requests, 0.305 recent hit rate, NaN microseconds miss latency
Percent Repaired : 0.0%

Regards,
Javier

F Javier Pareja

On Fri, Mar 2, 2018 at 7:01 PM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:

> Hi Javier,
>
>> The only bottleneck in the writes as far as I understand it is the
>> commit log.
>
> Sadly this is somewhat wrong, especially in your case. CPU and network
> limits can be reached, and other issues can happen. Plus in your case,
> using counters, there are way more things involved.
>
> The 2 main things that I saw from your comment are:
>
> - You are using counters. When writing a counter, Apache Cassandra
> performs a read-before-write.
> In 3.11.1 there should be a counter cache that you can play with to
> alleviate the impact of this, but what you generally want to do with
> counters is to put a buffer somewhere that aggregates the increments and
> sends one request of '+5000' instead of 5,000 requests of '+1'. The
> difference should be substantial.
> - Given this first consideration, and in general, using HDDs is not the
> best way to get good throughput, and it makes it almost impossible to
> reach something close to millisecond latency. Having 9 disks will not
> make each of them faster, but it allows more concurrency, so latency
> will never be 'the best'.
>
> Before making changes to the hardware it is important to understand where
> the bottleneck comes from.
> Ideally, I often recommend dashboards; they make it easy to spot this
> kind of thing. If no dashboards are available, maybe the logs (especially
> warn / error / gc) could help, or commands such as 'nodetool cfstats' or
> 'nodetool tpstats' to build a better understanding.
>
> If the machines and disks can handle it and you want to try it as it is,
> maybe try to increase 'concurrent_counter_writes' in cassandra.yaml or
> increase the cache size, but I am really guessing here.
>
>> - I have configured all 3 nodes to act as seeds but I don't think this
>> affects write performance.
>
> No problem.
>
>> - The hints_directory and the saved_caches_directory use the same drive
>> as the commitlog_directory. The data is in the other 7 drives as I
>> explained earlier.
>
> It should be fine, but did you check the disks' performance / usage?
>
>> Could the saved_caches, especially because of the counters, have a
>> meaningful impact on the write performance?
>
> It depends on how frequently the cache is written to disk, and on its
> size.
>
> C*heers,
> -----------------------
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France / Spain
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2018-03-02 17:36 GMT+00:00 Javier Pareja <pareja.jav...@gmail.com>:
>
>> Hi again,
>>
>> Two more thoughts with respect to my question:
>> - I have configured all 3 nodes to act as seeds but I don't think this
>> affects write performance.
>> - The hints_directory and the saved_caches_directory use the same drive
>> as the commitlog_directory. The data is in the other 7 drives as I
>> explained earlier. Could the saved_caches, especially because of the
>> counters, have a meaningful impact on the write performance?
>> - If more nodes are needed for whatever reason, would a layer of
>> virtualization on top of each machine help? Each virtual machine would
>> have dedicated drives assigned (there are plenty of them) and only share
>> the CPU and RAM.
>>
>> The only bottleneck in the writes, as far as I understand it, is the
>> commit log. Shall I create a RAID0 (for speed) or install an SSD just
>> for the commit log?
>>
>> Thanks,
>> Javier
>>
>> F Javier Pareja
>>
>> On Fri, Mar 2, 2018 at 12:21 PM, Javier Pareja <pareja.jav...@gmail.com>
>> wrote:
>>
>>> Hello everyone,
>>>
>>> I have configured a Cassandra cluster with 3 nodes, but I am not
>>> getting the write speed that I was expecting. I have tested against a
>>> counter table because it is the bottleneck of the system.
>>> So with the system idle I ran the attached sample code (very simple
>>> async writes with a throttle) against a schema with RF=2 and a table
>>> using SizeTieredCompactionStrategy.
>>>
>>> The speeds that I get are around 65k updates-writes/second and I was
>>> hoping for at least 150k updates-writes/second. Even if I run the test
>>> on 2 machines in parallel, the rate is 35k updates-writes/second on
>>> each.
>>> I have executed the test on the nodes themselves (nodes 1 and 2 of
>>> the 3).
>>>
>>> The nodes are fairly powerful. Each has the following configuration,
>>> running Cassandra 3.11.1:
>>> - RAM: 256GB
>>> - HDD disks: 9 (7 configured for Cassandra data, 1 for the OS and 1
>>> configured for the Cassandra commit log)
>>> - CPU: 8 cores with hyperthreading => 16 logical processors
>>>
>>> The RAM, CPU and HDDs are far from being maxed out when running the
>>> tests.
>>>
>>> The test command line class takes two parameters: max executions and
>>> parallelism. Parallelism is the max number of async executions running
>>> in parallel; any further execution has to wait for an available slot.
>>> I tried increasing the parallelism (64, 128, 256...) but the results
>>> are the same; 128 seems enough.
>>>
>>> Table definition:
>>>
>>> CREATE TABLE counttest (
>>>     key_column bigint,
>>>     cluster_column int,
>>>     count1_column counter,
>>>     count2_column counter,
>>>     count3_column counter,
>>>     count4_column counter,
>>>     count5_column counter,
>>>     PRIMARY KEY ((key_column), cluster_column)
>>> );
>>>
>>> Write test data generation (from the attached class). Each insert is
>>> prepared with uniform random values from below:
>>>
>>> long key_column = getRandom(0, 5000000);
>>> int cluster_column = getRandom(0, 4096);
>>> long count1_column = getRandom(0, 10);
>>> long count2_column = getRandom(0, 10);
>>> long count3_column = getRandom(0, 10);
>>> long count4_column = getRandom(0, 10);
>>> long count5_column = getRandom(0, 10);
>>>
>>> *I suspect that we took the wrong approach when designing the
>>> hardware: should we have used more nodes and fewer drives per node? If
>>> this is the case, I am trying to understand why, and whether there is
>>> any change we could make to the configuration (other than getting more
>>> nodes) to improve it.*
>>>
>>> Will an SSD dedicated to the commit log improve things dramatically?
>>>
>>> Best Regards,
>>> Javier
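Alain's suggestion above (send one request of '+5000' instead of 5,000 requests of '+1') can be sketched as a small client-side buffer that coalesces increments per counter cell before flushing. This is a hypothetical helper, not part of the attached test class; the key format and flush contract are assumptions:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical client-side buffer coalescing counter increments per
// (key_column, cluster_column) cell. Instead of one UPDATE per +1, the
// caller periodically flushes and issues one UPDATE per accumulated
// delta, e.g. "UPDATE counttest SET count1_column = count1_column + ?".
public class CounterBuffer {
    private final ConcurrentHashMap<String, LongAdder> deltas = new ConcurrentHashMap<>();

    public void add(long keyColumn, int clusterColumn, long delta) {
        // LongAdder keeps contention low when many writer threads hit the same cell
        deltas.computeIfAbsent(keyColumn + ":" + clusterColumn, k -> new LongAdder())
              .add(delta);
    }

    // Drain the accumulated deltas; the caller sends one statement per entry.
    // (A single flusher thread is assumed; a production version would have to
    // deal with increments racing against the reset.)
    public Map<String, Long> flush() {
        Map<String, Long> out = new HashMap<>();
        deltas.forEach((key, adder) -> {
            long sum = adder.sumThenReset();
            if (sum != 0) out.put(key, sum);
        });
        return out;
    }
}
```

With 5,000 increments of +1 against the same cell, flush() returns a single entry of 5000, i.e. one counter write (and one server-side read-before-write) instead of 5,000.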
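The throttle described in the original post (at most `parallelism` async executions in flight, further submissions waiting for a free slot) is commonly implemented with a Semaphore released from the future's completion callback. A minimal sketch, with the Cassandra session.executeAsync call replaced by a generic task since the attached class itself is not shown here:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.Semaphore;

// Sketch of a throttle for async writes: at most `parallelism` operations
// in flight at once; submit() blocks the caller until a slot frees up.
// A real harness would run the driver's async execute here instead of `op`.
public class AsyncThrottle {
    private final Semaphore slots;

    public AsyncThrottle(int parallelism) {
        this.slots = new Semaphore(parallelism);
    }

    public CompletableFuture<Void> submit(Runnable op, Executor pool) {
        slots.acquireUninterruptibly(); // wait for a free slot before issuing the write
        return CompletableFuture.runAsync(op, pool)
                                .whenComplete((result, error) -> slots.release());
    }
}
```

The slot is released in whenComplete rather than after submission, so the limit bounds concurrent in-flight operations, not the submission rate.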