> Do folks think we should file an official CEP and take it there?
+1 here.

Synthesizing your gdoc, Caleb's work, and the feedback from this thread into a 
draft seems like a solid next step.

On Wed, Feb 7, 2024, at 12:31 PM, Jaydeep Chovatia wrote:
> I see a lot of great ideas that have been discussed or proposed in the past 
> to cover the most common candidate use cases for a rate limiter. Do folks 
> think we should file an official CEP and take it there?
> 
> Jaydeep
> 
> On Fri, Feb 2, 2024 at 8:30 AM Caleb Rackliffe <calebrackli...@gmail.com> 
> wrote:
>> I just remembered the other day that I had done a quick writeup on the state 
>> of compaction stress-related throttling in the project:
>> 
>> https://docs.google.com/document/d/1dfTEcKVidRKC1EWu3SO1kE1iVLMdaJ9uY1WMpS3P_hs/edit?usp=sharing
>> 
>> I'm sure most of it is old news to the people on this thread, but I figured 
>> I'd post it just in case :)
>> 
>> On Tue, Jan 30, 2024 at 11:58 AM Josh McKenzie <jmcken...@apache.org> wrote:
>>>> 2.) We should make sure the links between the "known" root causes of 
>>>> cascading failures and the mechanisms we introduce to avoid them remain 
>>>> very strong.
>>> Seems to me that our historical strategy was to address individual known 
>>> cases one-by-one rather than looking for a more holistic load-balancing and 
>>> load-shedding solution. While the engineer in me likes the elegance of a 
>>> broad, more-inclusive *actual SEDA-like* approach, the pragmatist in me 
>>> wonders how far we think we are today from a stable set-point.
>>> 
>>> i.e. are we facing a handful of cases, which we could address surgically, 
>>> where nodes can still get pushed over and then cascade, or are we facing a 
>>> broader lack of back-pressure that rears its head in different domains 
>>> (client -> coordinator, coordinator -> replica, internode with other 
>>> operations, etc.) at surprising times and should be considered more 
>>> holistically?
>>> 
>>> On Tue, Jan 30, 2024, at 12:31 AM, Caleb Rackliffe wrote:
>>>> I almost forgot CASSANDRA-15817, which introduced 
>>>> reject_repair_compaction_threshold, a mechanism to stop repairs while 
>>>> compaction is underwater.
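>>>> 
>>>> For anyone who hasn't used that setting, a rough sketch of the shape such a 
>>>> guard could take (the names and the pending-compaction semantics below are 
>>>> illustrative placeholders, not the actual internals from CASSANDRA-15817):
>>>> 
>>>>     // Hypothetical guard, not the real CASSANDRA-15817 code: refuse to
>>>>     // start new repair work while the compaction backlog exceeds a
>>>>     // configured threshold.
>>>>     final class RepairBackpressure
>>>>     {
>>>>         static boolean shouldRejectRepair(int pendingCompactions, int rejectThreshold)
>>>>         {
>>>>             // A threshold of <= 0 disables the guard.
>>>>             if (rejectThreshold <= 0)
>>>>                 return false;
>>>>             return pendingCompactions > rejectThreshold;
>>>>         }
>>>>     }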
>>>> 
>>>>> On Jan 26, 2024, at 6:22 PM, Caleb Rackliffe <calebrackli...@gmail.com> 
>>>>> wrote:
>>>>> 
>>>>> Hey all,
>>>>> 
>>>>> I'm a bit late to the discussion. I see that we've already discussed 
>>>>> CASSANDRA-15013 <https://issues.apache.org/jira/browse/CASSANDRA-15013> 
>>>>> and CASSANDRA-16663 
>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-16663> at least in 
>>>>> passing. Having written the latter, I'd be the first to admit it's a 
>>>>> crude tool, although it's been useful here and there, and it provides a 
>>>>> couple of primitives that may be useful for future work. As Scott mentions, 
>>>>> while it is configurable at runtime, it is not adaptive, although we did 
>>>>> make configuration easier in CASSANDRA-17423 
>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-17423>. It is also 
>>>>> global to the node, although we've lightly discussed some ideas around 
>>>>> making it more granular. (For example, keyspace-based limiting, or 
>>>>> limiting "domains" tagged by the client in requests, could be 
>>>>> interesting.) It also does not deal with inter-node traffic, of course.
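>>>>> 
>>>>> To make the granularity idea concrete, here's a minimal sketch of what 
>>>>> per-keyspace limiting could look like, assuming one Guava RateLimiter per 
>>>>> keyspace (the class name and the single shared permits-per-second value are 
>>>>> placeholders, not anything that exists in the codebase today):
>>>>> 
>>>>>     import com.google.common.util.concurrent.RateLimiter;
>>>>>     import java.util.concurrent.ConcurrentHashMap;
>>>>> 
>>>>>     // Hypothetical per-keyspace limiter: lazily creates a limiter for each
>>>>>     // keyspace and sheds requests that can't immediately acquire a permit.
>>>>>     final class KeyspaceRateLimiter
>>>>>     {
>>>>>         private final ConcurrentHashMap<String, RateLimiter> limiters = new ConcurrentHashMap<>();
>>>>>         private final double permitsPerSecond;
>>>>> 
>>>>>         KeyspaceRateLimiter(double permitsPerSecond)
>>>>>         {
>>>>>             this.permitsPerSecond = permitsPerSecond;
>>>>>         }
>>>>> 
>>>>>         // Returns false if the request should be shed rather than processed.
>>>>>         boolean tryAcquire(String keyspace)
>>>>>         {
>>>>>             return limiters.computeIfAbsent(keyspace, k -> RateLimiter.create(permitsPerSecond))
>>>>>                            .tryAcquire();
>>>>>         }
>>>>>     }
>>>>> 
>>>>> The same shape would work for client-tagged "domains" by keying on the tag 
>>>>> instead of the keyspace name.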
>>>>> 
>>>>> Something we've not yet mentioned (that does address internode traffic) 
>>>>> is CASSANDRA-17324 
>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-17324>, which I proposed 
>>>>> shortly after working on the native request limiter (and have just not 
>>>>> had much time to return to). The basic idea is this:
>>>>> 
>>>>>> When a node is struggling under the weight of a compaction backlog and 
>>>>>> becomes a cause of increased read latency for clients, we have two 
>>>>>> safety valves:
>>>>>> 
>>>>>> 1.) Disabling the native protocol server, which stops the node from 
>>>>>> coordinating reads and writes.
>>>>>> 2.) Jacking up the severity on the node, which tells the dynamic snitch 
>>>>>> to avoid the node for reads from other coordinators.
>>>>>> 
>>>>>> These are useful, but we don’t appear to have any mechanism that would 
>>>>>> allow us to temporarily reject internode hint, batch, and mutation 
>>>>>> messages that could further delay resolution of the compaction backlog.
>>>>>> 
>>>>> 
>>>>> Whether it's done as part of a larger framework or on its own, it still 
>>>>> feels like a good idea.
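>>>>> 
>>>>> For illustration only, the rejection could be as simple as a verb filter 
>>>>> keyed off the compaction backlog. Everything below (the verb names, the 
>>>>> limit, the class itself) is a placeholder sketch, not the actual internode 
>>>>> messaging API:
>>>>> 
>>>>>     import java.util.EnumSet;
>>>>> 
>>>>>     // Hypothetical back-pressure check applied before processing an
>>>>>     // inbound internode message.
>>>>>     final class CompactionBackpressure
>>>>>     {
>>>>>         enum Verb { MUTATION, HINT, BATCH_STORE, READ }
>>>>> 
>>>>>         // Only background/write traffic is sheddable; reads stay untouched.
>>>>>         private static final EnumSet<Verb> SHEDDABLE =
>>>>>             EnumSet.of(Verb.MUTATION, Verb.HINT, Verb.BATCH_STORE);
>>>>> 
>>>>>         private final int pendingCompactionLimit;
>>>>> 
>>>>>         CompactionBackpressure(int pendingCompactionLimit)
>>>>>         {
>>>>>             this.pendingCompactionLimit = pendingCompactionLimit;
>>>>>         }
>>>>> 
>>>>>         // Reject (and let the sender hint/retry) while the backlog is high.
>>>>>         boolean shouldReject(Verb verb, int pendingCompactions)
>>>>>         {
>>>>>             return SHEDDABLE.contains(verb) && pendingCompactions > pendingCompactionLimit;
>>>>>         }
>>>>>     }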
>>>>> 
>>>>> Thinking in terms of opportunity costs here (i.e. where we spend our 
>>>>> finite engineering time to holistically improve the experience of 
>>>>> operating this database) is healthy, but we probably haven't reached the 
>>>>> point of diminishing returns on nodes being able to protect themselves 
>>>>> from clients and from other nodes. I would just keep in mind two things:
>>>>> 
>>>>> 1.) The effectiveness of rate-limiting in the system (which includes the 
>>>>> database and all clients) as a whole necessarily decreases as we move 
>>>>> from the application to the lowest-level database internals. Limiting 
>>>>> correctly at the client will save more resources than limiting at the 
>>>>> native protocol server, and limiting correctly at the native protocol 
>>>>> server will save more resources than limiting after we've dispatched 
>>>>> requests to some thread pool for processing.
>>>>> 2.) We should make sure the links between the "known" root causes of 
>>>>> cascading failures and the mechanisms we introduce to avoid them remain 
>>>>> very strong.
>>>>> 
>>>>> In any case, I'd be happy to help out in any way I can as this moves 
>>>>> forward (especially as it relates to our past/current attempts to address 
>>>>> this problem space).
>>> 
