On Dec 30, 2024, at 7:05 PM, James Hunter <james.hunter...@gmail.com> wrote:
> 
> On Sat, Dec 28, 2024 at 11:24 PM Jim Nasby <jna...@upgrade.com> wrote:
>> 
>> IMHO none of this will be very sane until we actually have cluster-level 
>> limits. One sudden burst in active connections and you still OOM the 
>> instance.
> 
> Fwiw, PG does support "max_connections" GUC, so a backend/connection -
> level limit, times "max_connections", yields a cluster-level limit.

max_connections is useless here, for two reasons:

1. Changing it requires a restart. That’s at *best* a real PITA in production. 
[1]
2. It still doesn’t solve the actual problem. Unless your workload *and* your 
data are extremely homogeneous, you can’t simply limit the number of connections 
and call it a day. A slight change in the incoming queries, OR in the data those 
queries are looking at, and you go from running fine to meltdown. You don’t even 
need a plan flip for this to happen, just the same plan run at the same rate but 
now accessing more data than before (rough numbers below).

Most of what I’ve seen on this thread is discussing ways to *optimize* how much 
memory the set of running backends can consume. Adjusting how you slice the 
memory pie across backends, or even within a single backend, is optimization. 
While that’s a great goal that I do support, it will never fully fix the 
problem. At some point you either have to throw your hands in the air and start 
tossing memory errors, since you have no control over how much work is being 
thrown at the engine, or the engine has to exert that control itself. The only 
way it can do that is to hold new transactions back from starting when the 
system is under duress (ie, workload management).

While workload managers can be quite sophisticated (read: complex), the nice 
thing about limiting the scope here to work_mem, and only as a means to prevent 
complete overload, is that the problem becomes a lot simpler: you’re only 
looking at one metric and not trying to support any kind of priority system. 
The only fanciness I think an MVP would need is a GUC to control how long a 
transaction can sit waiting before it throws an error. Frankly, that sounds a 
lot less complex, and much easier for DBAs to adjust, than trying to teach the 
planner how to apportion out per-node work_mem limits.
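
To make the shape of that MVP a bit more concrete, here’s a toy, standalone 
sketch (emphatically not Postgres code; every name and number in it is made 
up): transactions reserve from a single global work_mem budget before they 
start, wait if the budget is exhausted, and give up with an error once the 
configurable timeout (the GUC I’m describing) expires.

/* Toy admission gate, *not* Postgres code: a global "work_mem budget" that
 * transactions must reserve from before starting.  If the budget is
 * exhausted they wait, and bail out with an error once the (hypothetical)
 * admission timeout GUC expires. */

#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

static pthread_mutex_t gate_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  gate_cv   = PTHREAD_COND_INITIALIZER;

static long total_budget_kb     = 4L * 1024 * 1024; /* e.g. 4GB to hand out */
static long reserved_kb         = 0;
static int  admission_timeout_s = 30;               /* the hypothetical GUC */

/* Returns true once the reservation is granted, false on timeout (the
 * caller would then throw a clean error instead of letting the box OOM). */
static bool
reserve_work_mem(long kb)
{
    struct timespec deadline;
    bool        ok = true;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += admission_timeout_s;

    pthread_mutex_lock(&gate_lock);
    while (reserved_kb + kb > total_budget_kb)
    {
        if (pthread_cond_timedwait(&gate_cv, &gate_lock, &deadline) == ETIMEDOUT)
        {
            ok = false;
            break;
        }
    }
    if (ok)
        reserved_kb += kb;
    pthread_mutex_unlock(&gate_lock);
    return ok;
}

static void
release_work_mem(long kb)
{
    pthread_mutex_lock(&gate_lock);
    reserved_kb -= kb;
    pthread_cond_broadcast(&gate_cv);
    pthread_mutex_unlock(&gate_lock);
}

int
main(void)
{
    long        kb = 64 * 1024; /* this "transaction" wants 64MB of work_mem */

    if (reserve_work_mem(kb))
    {
        /* ... run the transaction ... */
        release_work_mem(kb);
        puts("admitted");
    }
    else
        puts("gave up waiting for work_mem budget");
    return 0;
}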

As I said, I’m not opposed to optimizations, I just think they’re very much 
cart-before-the-horse.

1: While it’d be a lot of work to make max_connections dynamic, one thing we 
could do fairly easily would be to introduce another GUC (max_backends?) that 
actually controls the total number of allowed backends for everything. The sum 
of max_connections + autovac workers + background workers + whatever else I’m 
forgetting would have to stay below that limit. The idea here is that you’d 
normally run with max_connections set significantly lower than max_backends. 
That means that if you need to adjust any of those GUCs (other than 
max_backends) you don’t need to restart - the new limits would just apply to 
new connection requests.
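
As a purely illustrative sketch of how that could look (max_backends doesn’t 
exist today, the numbers are arbitrary, and the exact set of GUCs that count 
against it would need working out):

    max_backends = 500            # hypothetical; fixed at startup, sized generously
    max_connections = 200         # under this scheme, changeable without a restart
    autovacuum_max_workers = 10
    max_worker_processes = 20
    max_wal_senders = 10
    # 200 + 10 + 20 + 10 = 240, comfortably below max_backends = 500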
