Cassandra internal bottleneck
Hi, I'm trying to evaluate the performance of Apache Cassandra 4.0.1 for write-only workloads on on-premise physical servers. On a single-node cluster, after some optimizations I was able to push the node's CPU above 90%; throughput is high enough and CPU is the bottleneck, as I expected. Running the same benchmark on a two-node cluster with RF=2 and CL=ALL, throughput dropped by 20% compared with the single-node scenario, but CPU usage on both nodes is only about 70% (oscillating between 50% and 90% every 5-6 seconds). I wonder how I can keep CPU usage steadily above 90% in this scenario and reach the maximum throughput of my hardware (resources other than CPU are used at less than 10% of their capacity).

From jvisualvm: there are only 90 Native-Transport threads, mostly waiting, and 31 MutationStage threads, also mostly waiting. The only threads that are always running are Messaging-EventLoop (6 threads) and epollEventLoop (40 threads).

Where is the bottleneck of the cluster now? How can I increase its resources to again reach 90% CPU and maximum write throughput? How can I debug the stages of Cassandra's SEDA architecture to find such bottlenecks?

Best Regards
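(A minimal sketch of how the per-stage thread pool counters could be polled while the benchmark runs, assuming JMX is reachable on the default port 7199 without authentication; these are the same numbers nodetool tpstats prints, and the MBean names follow the standard org.apache.cassandra.metrics pattern. A stage whose pending count keeps growing is usually the one that is backed up.)

import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Samples the PendingTasks gauge of every Cassandra thread pool once a second,
// so you can see which SEDA stage accumulates work during the write benchmark.
// Host name and port are placeholders for one of the cluster nodes.
public class StageWatcher {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://node1:7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName pattern = new ObjectName(
                    "org.apache.cassandra.metrics:type=ThreadPools,name=PendingTasks,*");
            for (int i = 0; i < 60; i++) {                    // one minute of samples
                for (ObjectName pool : (Set<ObjectName>) mbs.queryNames(pattern, null)) {
                    Object pending = mbs.getAttribute(pool, "Value");
                    System.out.printf("%-30s pending=%s%n",
                            pool.getKeyProperty("scope"), pending);
                }
                Thread.sleep(1000);
            }
        }
    }
}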
Re: Cassandra internal bottleneck
How many clients do you have sending write requests? In several cases I've worked on, the bottleneck is on the client side. Try increasing the number of app instances and you might find that the combined throughput increases significantly. Cheers!
Fwd: Re: Cassandra internal bottleneck
Thanks. I've got only one client, with 10 threads and 1K in-flight async writes. This single client was able to send 110K inserts/second to the single-node cluster, but it's only sending 90K inserts/second to the two-node cluster (client CPU and network usage are below 20%).
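(For reference, a minimal sketch of the kind of write loop being discussed, assuming the DataStax Java driver 4.x; the contact point, datacenter name, and the ks.events schema are placeholders. Raising MAX_IN_FLIGHT, or running several copies of this process as suggested above, is the quickest way to test whether the single client is the limit.)

import java.net.InetSocketAddress;
import java.util.concurrent.Semaphore;
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;

// One client process issuing bounded asynchronous inserts:
// the semaphore keeps at most MAX_IN_FLIGHT writes outstanding at any time.
public class WriteLoad {
    private static final int MAX_IN_FLIGHT = 1000;   // the "1K async writes" cap

    public static void main(String[] args) throws Exception {
        Semaphore inFlight = new Semaphore(MAX_IN_FLIGHT);
        try (CqlSession session = CqlSession.builder()
                .addContactPoint(new InetSocketAddress("node1", 9042))
                .withLocalDatacenter("dc1")                  // adjust to the cluster's DC name
                .build()) {
            PreparedStatement insert = session.prepare(
                    "INSERT INTO ks.events (id, payload) VALUES (?, ?)"); // placeholder schema
            for (long i = 0; i < 10_000_000L; i++) {
                inFlight.acquire();                          // block once the cap is reached
                session.executeAsync(insert.bind(i, "x"))
                       .whenComplete((rs, err) -> inFlight.release());
            }
            inFlight.acquire(MAX_IN_FLIGHT);                 // drain the remaining in-flight writes
        }
    }
}

If a single process still tops out, the driver's connection pool size and max-requests-per-connection settings are also worth checking before adding more app instances.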
TLS/SSL overhead
Hi, has anyone measured the impact of wire encryption with TLS (client_encryption/server_encryption) on cluster latency/throughput? It may depend on the hardware or even the data model, but I already did some measurements and saw roughly a 2% penalty for client encryption and 3-5% for client + server encryption, and I wanted to validate that with the community.

Best Regards
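(In case it helps anyone reproducing this kind of measurement: a minimal sketch of enabling TLS on the driver side with the DataStax Java driver 4.x, which is the client half of benchmarking client_encryption; the server side still needs client_encryption_options enabled in cassandra.yaml, and internode encryption is controlled separately by server_encryption_options. The truststore path, password, and host names are placeholders.)

import java.io.FileInputStream;
import java.net.InetSocketAddress;
import java.security.KeyStore;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;
import com.datastax.oss.driver.api.core.CqlSession;

// Builds an SSLContext that trusts the cluster's certificate and hands it to the driver,
// so the same write benchmark can be run with and without client-to-node encryption.
public class TlsSession {
    public static void main(String[] args) throws Exception {
        KeyStore trustStore = KeyStore.getInstance("JKS");
        try (FileInputStream in = new FileInputStream("/path/to/client-truststore.jks")) {
            trustStore.load(in, "changeit".toCharArray());   // placeholder password
        }
        TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);

        SSLContext sslContext = SSLContext.getInstance("TLS");
        sslContext.init(null, tmf.getTrustManagers(), null);

        try (CqlSession session = CqlSession.builder()
                .addContactPoint(new InetSocketAddress("node1", 9042))
                .withLocalDatacenter("dc1")
                .withSslContext(sslContext)                  // encrypt client-to-node traffic
                .build()) {
            System.out.println(session.execute("SELECT release_version FROM system.local")
                                      .one().getString("release_version"));
        }
    }
}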
Re: TLS/SSL overhead
The 3-5% penalty range is consistent with what other users have reported over the years, but I'm sorry that I can't seem to find the threads/references, so my response is unfortunately anecdotal. More importantly, would you be interested in sharing your data? It would be great to feature it as a blog post, and I'm sure a lot of users would be very interested. It doesn't have to be a polished write-up, and we've got other contributors who'd be happy to help with the draft if that's a concern. Cheers!