Jaydeep, thanks for reaching out and for sharing this proposal. I like the direction. Please also take a look at
https://issues.apache.org/jira/browse/CASSANDRA-16663, which adds coordinator-level rate limiting on request rate.
This ticket introduced a lockless rate limiter patterned on an approach in Guava, adapted for Cassandra. This will
likely be the rate limiter you'd like to actuate via changes in the database metrics your proposal describes. One
challenge we've had with the rate limiter introduced in C-16663 is that it's difficult to determine what value to
set it to.

Another aspect of C-16663 that I find valuable is that the rate limiter can dispatch notifications to clients that
rate limiting is taking effect, implemented via client warnings. Users need to be able to know that their request
rate is being artificially reduced by server-side throttling.
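
As a quick illustration (the query and log line are mine, not something C-16663 prescribes), a client using the
Java driver can surface those warnings from each response's execution info:

```java
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.ResultSet;

public class WarningAwareClient {
    public static void main(String[] args) {
        // Connects to a local node with default settings.
        try (CqlSession session = CqlSession.builder().build()) {
            ResultSet rs = session.execute("SELECT release_version FROM system.local");
            // Any warnings the coordinator attached to this response, e.g. a note
            // that the request is being throttled by the server-side rate limiter.
            for (String warning : rs.getExecutionInfo().getWarnings()) {
                System.out.println("Server warning: " + warning);
            }
        }
    }
}
```

Drivers and tools that log or expose these warnings give users immediate visibility that throttling is in effect.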

I appreciate your thinking along the lines of different metrics that operators may want to trigger on, as this will
vary per workload. There are probably extremely read-heavy workloads that push CPU hard but don't incur compaction
backlog or other balance-of-humors issues and are comfortable running at high CPU, as an example.

I like the idea of adaptive request volumes in clients as well and don't see this and Jon's proposal as mutually
exclusive. There are many different Cassandra clients, including some outside the project written in Go and Rust
that are beyond the scope of our direct control. Any server should have mechanisms to defend itself against any
client, official or not. Jon's also right that an implementation in the Spark Cassandra Connector may help jobs
make progress at the maximum safe pace.

Thanks for sharing this thoughtful document.

– Scott

On Jan 16, 2024, at 10:46 AM, Jon Haddad <j...@jonhaddad.com> wrote:

Server-side rate limiting can be useful, but IMO if we were to focus effort into a single place, time would be much
better spent adding adaptive rate limiting to the drivers.

Rate limiting at the driver level can be done based on two simple feedback mechanisms: error rate and latency. When
a node is throwing errors (or exceeds the defined latency SLO), requests to that node can be rate limited. This does
a better job of preventing issues than server-side rate limiting, as we don't get the DDoS effect in addition to
whatever issue the node is dealing with at the time. Netflix has a good post on it here [1], and I've incorporated
the latency version into my fork of tlp-stress [2] for benchmarking.
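
To make that concrete, here is roughly the shape of the latency/error feedback loop I have in mind. This is a
simplified AIMD sketch with made-up names and thresholds, not the actual Netflix or tlp-stress code:

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Minimal AIMD (additive-increase / multiplicative-decrease) per-node limiter
 * driven by error rate and latency. Real implementations smooth the latency
 * samples and handle bursts; this is just the core idea.
 */
public class AimdLimiter {
    private final long latencySloNanos;
    private final int maxLimit;
    private final AtomicInteger limit;                      // current concurrency limit
    private final AtomicInteger inFlight = new AtomicInteger();

    public AimdLimiter(long latencySloNanos, int initialLimit, int maxLimit) {
        this.latencySloNanos = latencySloNanos;
        this.maxLimit = maxLimit;
        this.limit = new AtomicInteger(initialLimit);
    }

    /** Call before sending a request to this node; false means shed or redirect it. */
    public boolean tryAcquire() {
        while (true) {
            int current = inFlight.get();
            if (current >= limit.get()) {
                return false;
            }
            if (inFlight.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    /** Call when the response (or error) for an acquired request arrives. */
    public void onResponse(long latencyNanos, boolean error) {
        inFlight.decrementAndGet();
        if (error || latencyNanos > latencySloNanos) {
            // Multiplicative decrease: back off hard when the node misbehaves.
            limit.updateAndGet(l -> Math.max(1, l / 2));
        } else {
            // Additive increase: slowly probe for more throughput while healthy.
            limit.updateAndGet(l -> Math.min(maxLimit, l + 1));
        }
    }
}
```

Requests that fail tryAcquire() get queued or routed to another replica instead of piling onto the struggling node,
which is where the DDoS-avoidance comes from.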

Adding this to the driver means the DS Spark Connector can also take advantage of it, which is nice because tuning
it to get the optimal throughput is a bit of a headache.

Regarding the server side, I think the proposal to use various system metrics is overly complicated. The only
metrics that matter are latency and error rate. As long as you're within acceptable thresholds, you don't need to
rate limit.

Jon

[1] https://netflixtechblog.medium.com/performance-under-load-3e6fa9a60581
[2] https://rustyrazorblade.com/post/2023/2023-10-31-tlp-stress-adaptive-scheduler/

On Tue, Jan 16, 2024 at 10:02 AM Štefan Miklošovič <stefan.mikloso...@gmail.com> wrote:

Hi Jaydeep,

That seems quite interesting. A couple of points, though:

1) It would be nice if there were a way to "subscribe" to the decisions your detection framework comes up with.
Integration with e.g. the diagnostics subsystem would be beneficial. This should be pluggable - just coding up an
interface to dump or react to the decisions however I want (see the sketch after this list). This might also act as
a notifier to other systems: e-mail, Slack channels ...

2) Have you tried to incorporate this with the Guardrails framework? I think that if something is detected that
should be throttled or rejected (e.g. writing to a table), there might be a guardrail which could be triggered
dynamically at runtime. Guardrails are useful as such, but here we might reuse them so we do not need to code it
twice.

3) I am curious how complex this detection framework would be; it could get complicated pretty fast, I guess. What
would be desirable is to act on it in such a way that you do not put the node under even more pressure. In other
words, your detection system should work in such a way that there is no "doom loop" whereby, merely by throttling
various parts of Cassandra, you make things even worse for other nodes in the cluster. For example, if a particular
node starts to be overwhelmed, you detect this, and requests start to be rejected, is it not possible that the Java
driver would start to see this node as "erroneous", with delayed response times etc., and would start to prefer
other nodes in the cluster when deciding which node to contact for query coordination? So you would put more load
on other nodes, making them more susceptible to being throttled as well ...
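
For point 1, something as small as the following would be enough. The names are hypothetical and only meant to show
the shape of the pluggable hook I have in mind:

```java
/** Hypothetical listener for point 1; the names are illustrative only. */
public interface OverloadDecisionListener {
    enum Action { THROTTLE, REJECT }

    /**
     * Called whenever the detection framework decides to throttle or reject
     * traffic, so an implementation can log it, emit a diagnostic event, or
     * notify external systems (e-mail, Slack channels, ...).
     */
    void onDecision(Action action, String reason);
}
```

An implementation could simply forward these into the diagnostics subsystem, or raise a guardrail (point 2), so the
plumbing does not have to be coded twice.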

Regards

Stefan Miklosovic

On Tue, Jan 16, 2024 at 6:41 PM Jaydeep Chovatia <chovatia.jayd...@gmail.com> wrote:

Hi,

Happy New Year! I would like to discuss the following idea:

Open-source Cassandra (CASSANDRA-15013) has an elementary built-in memory rate limiter based on the incoming payload
from user requests. This rate limiter activates if any incoming user request's payload exceeds certain thresholds.
However, the existing rate limiter only solves limited-scope issues. Cassandra's server-side meltdown due to
overload is a known problem. Often we see that a couple of busy nodes take down the entire Cassandra ring due to the
ripple effect.

The following document proposes a general-purpose, comprehensive rate limiter that works by considering system
signals, such as CPU, and internal signals, such as thread pools. The rate limiter will have knobs to filter out
internal traffic, system traffic, and replication traffic, and furthermore to filter based on the type of query.
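
As a rough illustration only (the signal names and thresholds below are placeholders rather than the actual design;
see the doc for the real details), the admission decision could look something like this:

```java
import java.util.function.DoubleSupplier;
import java.util.function.IntSupplier;

/**
 * Illustrative admission gate: combines a system signal (CPU) and an internal
 * signal (thread-pool backlog) to decide whether to shed incoming user requests.
 * Placeholder names and thresholds; not the proposed implementation.
 */
public class SignalBasedRateLimiter {
    public enum TrafficType { USER, INTERNAL, REPLICATION }

    private final DoubleSupplier cpuUtilization;       // 0.0 - 1.0
    private final IntSupplier pendingRequests;         // e.g. queued native-transport tasks
    private final double cpuThreshold;
    private final int pendingThreshold;

    public SignalBasedRateLimiter(DoubleSupplier cpuUtilization,
                                  IntSupplier pendingRequests,
                                  double cpuThreshold,
                                  int pendingThreshold) {
        this.cpuUtilization = cpuUtilization;
        this.pendingRequests = pendingRequests;
        this.cpuThreshold = cpuThreshold;
        this.pendingThreshold = pendingThreshold;
    }

    /** Returns true if the request should be admitted, false if it should be shed. */
    public boolean admit(TrafficType type) {
        // Knob: in this sketch, internal and replication traffic are never shed.
        if (type != TrafficType.USER) {
            return true;
        }
        boolean cpuOverloaded = cpuUtilization.getAsDouble() > cpuThreshold;
        boolean backlogged = pendingRequests.getAsInt() > pendingThreshold;
        return !(cpuOverloaded || backlogged);
    }
}
```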

More design details are in this doc: [OSS] Cassandra Generic Purpose Rate Limiter - Google Docs

Please let me know your thoughts.

Jaydeep