On Wed, May 21, 2025, at 10:55 AM, Scott Mead wrote:
>
> On Wed, May 21, 2025, at 3:50 AM, Laurenz Albe wrote:
> > On Tue, 2025-05-20 at 16:58 -0400, Scott Mead wrote:
> > > On Wed, May 14, 2025, at 4:06 AM, Laurenz Albe wrote:
> > > > On Tue, 2025-05-13 at 17:53 -0400, Scott Mead wrote:
> > > > > On Tue, May 13, 2025, at 5:07 PM, Greg Sabino Mullane wrote:
> > > > > > On Tue, May 13, 2025 at 4:37 PM Scott Mead <sc...@meads.us> wrote:
> > > > > > > I'll open by proposing that we prevent the planner from
> > > > > > > automatically selecting parallel plans by default.
> > > > > > >
> > > > > > > What is the fallout?  When a high-volume, low-latency query flips
> > > > > > > to parallel execution on a busy system, we end up in a situation
> > > > > > > where the database is effectively DDoSing itself with a very high
> > > > > > > rate of connection establishment and tear-down requests.
> > > >
> > > > You are painting a bleak picture indeed.  I get to see PostgreSQL
> > > > databases in trouble regularly, but I have not seen anything like
> > > > what you describe.
> > > >
> > > > With an argument like that, you may as well disable nested loop joins.
> > > > I have seen enough cases where disabling nested loop joins, without
> > > > any deeper analysis, made very slow queries reasonably fast.
> > >
> > > My argument is that parallel query should not be allowed to be invoked
> > > without user intervention.  Yes, nested loops can have a similar impact,
> > > but let's take a look at the breakdown at scale of PQ:
> > >
> > > [pgbench run that shows that parallel query is bad for throughput]
> >
> > I think that your experiment is somewhat misleading.  Sure, if you
> > overload the machine with parallel workers, that will eventually also
> > harm the query response time.  But many databases out there are not
> > overloaded, and the shorter response time that parallel query offers
> > makes many users happy.
>
> It's not intended to be misleading, sorry for that.  I agree that PQ can
> have a positive effect; the point is that our current defaults will take
> a basic workload on a modest (16-CPU) box and quickly swamp it at a
> concurrency of 5, which is counter-intuitive, hard to debug, and usually
> not desired (again, in the case of a plan that silently invokes
> parallelism).
>
> FWIW, setting max_parallel_workers_per_gather to 0 by default only
> disables automatic PQ selection behind a SIGHUP (or with a user context);
> users can easily re-enable it if they want without having to restart
> (similar to parallel_setup_cost, but without the uncertainty).
>
> During my testing, I actually found (again, at concurrency = 5) that the
> default max_parallel_workers and max_worker_processes of 8 is not high
> enough.  If the default max_parallel_workers_per_gather is 0, then we'd
> be able to crank those defaults up (especially max_worker_processes,
> which requires a restart).
>
> > It is well known that what is beneficial for response time is
> > detrimental for the overall throughput and vice versa.
>
> It is well-known.  What's not is that the Postgres defaults will quickly
> swamp a machine with parallelism.  That's a lesson that many only learn
> after it's happened to them.  ISTM that the better path is to let someone
> try to optimize with parallelism rather than have to fight with it during
> an emergent event.
>
> IOW: I'd rather know that I'm walking into a marsh with rattlesnakes than
> find out after I've been bitten.
>
> > Now parallel query clearly is a feature that is good for response time
> > and bad for throughput, but that is not necessarily wrong.
>
> Agreed, I do like and use parallel query.  I just don't think it's wise
> that we allow the planner to make that decision on a user's behalf when
> the overhead is this high and the concurrency behavior falls apart so
> spectacularly fast.
>
> > Essentially, you are arguing that the default configuration should
> > favor throughput over response time.
>
> That's one take on it.  I'm actually saying that the default
> configuration should protect medium-sized systems from unintended
> behavior that quickly degrades performance while being very hard to
> identify and quantify.
>
> > > Going back to the original commit which enabled PQ by default [1], it
> > > was done so that the feature would be tested during beta.  I think
> > > it's time that we limit the accidental impact this can have on users
> > > by disabling the feature by default.
> >
> > I disagree.
> > My experience is that parallel query often improves the user
> > experience.  Sure, there are cases where I recommend disabling it, but
> > I think that disabling it by default would be a move in the wrong
> > direction.
> >
> > On the other hand, I have also seen cases where bad estimates trigger
> > parallel query by mistake, making queries slower.  So I'd support an
> > effort to increase the default value for "parallel_setup_cost".
>
> I'm open to discussing a value for parallel_setup_cost that protects
> users from runaway behavior; I just haven't been able to find a value
> that protects users while simultaneously allowing those who want
> automatic parallel-plan selection to take advantage of it.
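To make the opt-in concrete, here is an illustrative sketch of what re-enabling would look like under the proposed default (the role name below is hypothetical; no server restart is needed because max_parallel_workers_per_gather is a user-settable GUC):

```sql
-- Proposed shipped default (postgresql.conf):
--   max_parallel_workers_per_gather = 0

-- A user can opt back in for the current session only:
SET max_parallel_workers_per_gather = 2;

-- Or persist the opt-in for a particular role
-- ("reporting" is a hypothetical role name):
ALTER ROLE reporting SET max_parallel_workers_per_gather = 2;
```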
I'd like to re-open the discussion for this commitfest item.  I still have
not been able to find a value for parallel_setup_cost that makes good
decisions about parallelism on a user's behalf.  I believe that setting
the SIGHUP-able max_parallel_workers_per_gather to 0 by default is still
the best way to prevent runaway parallel execution behavior.

> What I've found (and it sounds somewhat similar to what you are saying)
> is that if you use parallelism intentionally and design for it (hardware,
> concurrency model, etc.) it's very, very powerful.  In cases where it
> 'just kicks in', I haven't seen an example that makes users happy.
>
> > Yours,
> > Laurenz Albe
>
> --
> Scott Mead
> Amazon Web Services
> sc...@meads.us

--
Scott Mead
sc...@meads.us