Thank you, Jürgen. The default consistency in the library is already ONE. I
tried setting it explicitly anyway, but it made no difference. Hopefully it is
a configuration issue; that would be very good news!! Do you have any
past/present experience with large counter tables?
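For reference, this is roughly how I set it (a sketch, not the exact test
code; it assumes the DataStax Java driver 3.x API, and the contact point and
keyspace name are placeholders):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ConsistencyLevel;
    import com.datastax.driver.core.QueryOptions;
    import com.datastax.driver.core.Session;

    // Set ONE as the cluster-wide default consistency level
    Cluster cluster = Cluster.builder()
            .addContactPoint("10.0.0.1")          // placeholder address
            .withQueryOptions(new QueryOptions()
                    .setConsistencyLevel(ConsistencyLevel.ONE))
            .build();
    Session session = cluster.connect("test_ks"); // keyspace name assumed

    // It can also be overridden per statement:
    // statement.setConsistencyLevel(ConsistencyLevel.ONE);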
F Javier Pareja

On Fri, Mar 2, 2018 at 6:37 PM, Jürgen Albersdorfer <jalbersdor...@gmail.com>
wrote:

> As far as I have seen, you have not configured the write consistency, which
> defaults to LOCAL_QUORUM. Try with ONE. If that does not help, there might
> still be a configuration issue: concurrent compactors, maybe, or resource
> contention on the CPU with the test code.
>
> Sent from my iPhone
>
> On 02.03.2018 at 18:36, Javier Pareja <pareja.jav...@gmail.com> wrote:
>
> Hi again,
>
> Two more thoughts with respect to my question:
> - I have configured all 3 nodes to act as seeds, but I don't think this
> affects write performance.
> - The hints_directory and the saved_caches_directory use the same drive as
> the commitlog_directory. The data is on the other 7 drives, as I explained
> earlier. Could the saved_caches, especially because of the counters, have a
> meaningful impact on write performance?
> - If more nodes are needed for whatever reason, would a layer of
> virtualization on top of each machine help? Each virtual machine would have
> dedicated drives assigned (there are plenty of them) and would share only
> the CPU and RAM.
>
> The only bottleneck for writes, as far as I understand it, is the commit
> log. Shall I create a RAID0 array (for speed) or install an SSD just for
> the commit log?
>
> Thanks,
> Javier
>
>
> F Javier Pareja
>
> On Fri, Mar 2, 2018 at 12:21 PM, Javier Pareja <pareja.jav...@gmail.com>
> wrote:
>
>> Hello everyone,
>>
>> I have configured a Cassandra cluster with 3 nodes, but I am not getting
>> the write speed that I was expecting. I have tested against a counter
>> table because it is the bottleneck of the system.
>> So with the system idle I ran the attached sample code (very simple async
>> writes with a throttle) against a schema with RF=2 and a table using
>> SizeTieredCompactionStrategy.
>>
>> The speeds I get are around 65k updates/second, and I was hoping for at
>> least 150k updates/second. Even if I run the test on 2 machines in
>> parallel, each achieves only 35k updates/second. I have executed the test
>> on the nodes themselves (nodes 1 and 2 of the 3).
>>
>> The nodes are fairly powerful. Each runs Cassandra 3.11.1 with the
>> following configuration:
>> - RAM: 256GB
>> - HDD disks: 9 (7 configured for Cassandra data, 1 for the OS and 1
>> configured for the Cassandra commit log)
>> - CPU: 8 cores with hyperthreading => 16 logical processors
>>
>> The RAM, CPU and HDDs are far from maxed out when running the tests.
>>
>> The test command-line class takes two parameters: max executions and
>> parallelism. Parallelism is the maximum number of async executions running
>> in parallel; any further execution has to wait for an available slot.
>> I tried increasing the parallelism (64, 128, 256...), but the results are
>> the same; 128 seems enough.
>>
>> Table definition:
>>
>> CREATE TABLE counttest (
>>     key_column bigint,
>>     cluster_column int,
>>     count1_column counter,
>>     count2_column counter,
>>     count3_column counter,
>>     count4_column counter,
>>     count5_column counter,
>>     PRIMARY KEY ((key_column), cluster_column)
>> );
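>>
>> Each counter write the test issues is a prepared UPDATE along these lines
>> (a sketch reconstructed from the schema above, not the attached code
>> verbatim):
>>
>> PreparedStatement ps = session.prepare(
>>     "UPDATE counttest SET"
>>     + " count1_column = count1_column + ?,"
>>     + " count2_column = count2_column + ?,"
>>     + " count3_column = count3_column + ?,"
>>     + " count4_column = count4_column + ?,"
>>     + " count5_column = count5_column + ?"
>>     + " WHERE key_column = ? AND cluster_column = ?");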
>>
>> Write test data generation (from the attached class): each insert is
>> prepared with uniform random values drawn as below:
>>
>> long key_column = getRandom(0, 5000000);
>> int cluster_column = getRandom(0, 4096);
>> long count1_column = getRandom(0, 10);
>> long count2_column = getRandom(0, 10);
>> long count3_column = getRandom(0, 10);
>> long count4_column = getRandom(0, 10);
>> long count5_column = getRandom(0, 10);
>>
>> *I suspect that we took the wrong approach when designing the hardware:
>> should we have used more nodes and fewer drives per node? If this is the
>> case, I am trying to understand why, and whether there is any change we
>> could make to the configuration (other than getting more nodes) to
>> improve this.*
>>
>> Will an SSD dedicated to the commit log improve things dramatically?
>>
>>
>> Best regards,
>> Javier
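PS: for completeness, the core of the test class I attached earlier works
roughly like this (a sketch under the same assumptions as above, i.e. the
DataStax 3.x driver and the Guava futures it bundles; the Semaphore-based
throttle and all names here are illustrative, not the attached code
verbatim):

    import java.util.concurrent.Semaphore;
    import java.util.concurrent.ThreadLocalRandom;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.ResultSetFuture;
    import com.datastax.driver.core.Session;
    import com.google.common.util.concurrent.FutureCallback;
    import com.google.common.util.concurrent.Futures;
    import com.google.common.util.concurrent.MoreExecutors;

    static long getRandom(long min, long max) {
        return ThreadLocalRandom.current().nextLong(min, max);
    }

    static void runTest(Session session, PreparedStatement ps,
                        long maxExecutions, int parallelism)
            throws InterruptedException {
        // At most `parallelism` async executions in flight at once;
        // any further execution waits here for a free slot.
        final Semaphore slots = new Semaphore(parallelism);
        for (long i = 0; i < maxExecutions; i++) {
            slots.acquire(); // the throttle
            ResultSetFuture f = session.executeAsync(ps.bind(
                    getRandom(0, 10), getRandom(0, 10), getRandom(0, 10),
                    getRandom(0, 10), getRandom(0, 10), // count1..5 deltas
                    getRandom(0, 5000000),              // key_column
                    (int) getRandom(0, 4096)));         // cluster_column
            Futures.addCallback(f, new FutureCallback<ResultSet>() {
                @Override public void onSuccess(ResultSet rs) { slots.release(); }
                @Override public void onFailure(Throwable t) { slots.release(); }
            }, MoreExecutors.directExecutor());
        }
        slots.acquire(parallelism); // drain: wait for in-flight writes
    }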