I think this discussion is getting away from a manageable scope of work...

On Thu, Jan 2, 2025 at 1:09 PM Jim Nasby <jna...@upgrade.com> wrote:
> That’s basically my argument for having workload management. If a system
> becomes loaded enough for the global limit to start kicking in it’s likely
> that query response time is increasing, which means you will soon have more
> and more active backends trying to run queries. That’s just going to make the
> situation even worse. You’d either have to start trying to “take memory away”
> from already running backends or backends that are just starting would have
> such a low limit as to cause them to spill very quickly, creating further
> load on the system.

I think this is backward. There is a fixed number of backends, limited by
"max_connections." A given backend becoming slow to respond doesn't increase
that limit, so the load on the system is bounded by "max_connections." That is
what keeps the system stable.

If, instead, the system starts cancelling queries, a naive client will
typically just retry the query. If the query was cancelled because the system
is overloaded, the retry is likely to be cancelled as well, leading to a retry
storm: even though the number of backends is still limited by
"max_connections," a given backend will be cancelled and retried over and over.

Long-term, the solution to "I don't have enough resources to run my workload
efficiently" is either:

* Buy more resources; or
* Reduce your workload.

We can't help with either of those solutions!

However, the (short-term) problem I'd like to solve is: how do we enable
efficient use of resources in those cases where we *do* have enough resources
to run a workload efficiently, but the existing "work_mem" and
"hash_mem_multiplier" GUCs are insufficient?

Thanks,
James
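
P.S. For anyone skimming the thread, the per-operation knobs in question are
ordinary GUCs; a minimal illustration of how they are set today (the values
here are made up, not recommendations):

    -- Caps memory per sort/hash operation, per backend; there is no
    -- corresponding global cap across all concurrent backends.
    SET work_mem = '64MB';
    -- Hash-based operations may use up to work_mem * hash_mem_multiplier.
    SET hash_mem_multiplier = 2.0;

A query with several sort and hash nodes, running in many backends at once,
can therefore use far more memory in total than either setting suggests,
which is the gap under discussion.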