On Thu, Sep 4, 2025 at 6:50 AM Jeff Davis <pg...@j-davis.com> wrote:
> I like the idea of some kind of fallback for multiple reasons. I
> noticed that if I set io_workers=1, and then I SIGSTOP that worker,
> then sequential scans make no progress at all until I send SIGCONT. A
> fallback to synchronous sounds more robust, and more similar to what we
> do with walwriter and bgwriter. (That may be 19 material, though.)

This seems like a non-problem.  Robustness against SIGSTOP of random
backends is not a project goal, or even a reasonable one, is it?  You
can SIGSTOP a backend doing I/O in any historical release, possibly
blocking other backends too because of the locks it holds, and so on.

That said, it is quite reasonable to ask why it doesn't just start a
new worker, and that's just code maturity:

* I have a patch basically ready to commit for v19 (CF #5913) that
would automatically add more workers if you did that.  But even then
you could be unlucky and SIGSTOP a worker while it holds the
submission queue lock.
* I also had experimental versions that use a lock-free queue, but that
didn't seem necessary given how hard it has been to measure meaningful
lock contention so far.  I guess contention must be easier to provoke
on a NUMA system, and one might wonder about per-NUMA-node queues, but
that also feels a bit questionable: on a very high-end system you'd
probably be looking into better tuning anyway, including
io_method=io_uring.
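
For anyone following along, the settings under discussion look roughly
like this in postgresql.conf.  This is only a sketch of the v18 GUCs as
I understand them.  The values are the ones from the experiment quoted
above, not recommendations, and the commented-out line assumes a Linux
build with liburing support.

```
# postgresql.conf (PostgreSQL 18), illustrative values only
io_method = worker       # the default; changing it requires a restart
io_workers = 1           # default is 3; one worker is what the quoted test used
#io_method = io_uring    # alternative on Linux builds with liburing (assumption about your build)
```

With io_method=worker and a single worker, every asynchronous read goes
through that one process, which is why stopping it stalls the scan.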

