On Mon, May 26, 2025, 8:01 AM Thomas Munro <thomas.mu...@gmail.com> wrote:

> But ... I'm not even sure if we can say that our
> I/O arrivals have a Poisson distribution, since they are not all
> independent.
>

Yeah, a good point -- one has to be careful with assumptions about the
distribution. From what I've read, many processes in computer systems are
better described by a Pareto distribution. But the beauty of queueing theory
is that many results are independent of the distribution (not sure about
dependencies, though).

In this version I went back to basics and built something that looks
> more like the controls of a classic process/thread pool (think Apache)
> or connection pool (think JDBC), with a couple of additions based on
> intuition: (1) a launch interval, which acts as a bit of damping
> against overshooting on brief bursts that are too far apart, and (2)
> the queue length > workers * k as a simple way to determine that
> latency is being introduced by not having enough workers.  Perhaps
> there is a good way to compute an adaptive value for k with some fancy
> theories, but k=1 seems to have *some* basis: that's the lowest number
> at which the pool is too small and *certainly* introducing latency, but
> any lower constant is harder to defend because we don't know how many
> workers are already awake and about to consume tasks.  Something from
> queuing theory might provide an adaptive value, but in the end, I
> figured we really just want to know if the queue is growing ie in
> danger of overflowing (note: the queue is small!  64, and not
> currently changeable, more on that later, and the overflow behaviour
> is synchronous I/O as back-pressure).  You seem to be suggesting that
> k=1 sounds too low, not too high, but there is that separate
> time-based defence against overshoot in response to rare bursts.
>
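Just to check I've understood the rule correctly, here is a toy sketch of the launch decision as I read it (hypothetical names, not the actual patch code): start another worker only when the queue has grown past workers * k *and* the launch interval has elapsed since the last launch.

```python
def should_launch(queue_len, workers, last_launch, now,
                  k=1, launch_interval=0.1, max_workers=32):
    """Return True if the pool should start another I/O worker.

    Toy model only: k, launch_interval and max_workers are made-up
    defaults, not values from the patch.
    """
    if workers >= max_workers:
        return False                       # pool is already at its cap
    if queue_len <= workers * k:
        return False                       # no evidence of queuing delay yet
    if now - last_launch < launch_interval:
        return False                       # damping against brief bursts
    return True
```

With k=1 this fires exactly when the queue is deeper than the worker count, i.e. when the pool is certainly adding latency, and the interval check keeps rare bursts from causing overshoot.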

I probably should have started by saying that I find the current approach
reasonable, and I'm only curious whether there is more to get out of it. I
haven't benchmarked the patch yet (I plan to get to it when I'm back), and I
can imagine practical considerations significantly impacting any potential
solution.

About control theory... yeah.  That's an interesting bag of tricks.
> FWIW Melanie and (more recently) I have looked into textbook control
> algorithms at a higher level of the I/O stack (and Melanie gave a talk
> about other applications in eg VACUUM at pgconf.dev).  In
> read_stream.c, where I/O demand is created, we've been trying to set
> the desired I/O concurrency level and thus lookahead distance with
> adaptive feedback.  We've tried a lot of stuff.  I hope we can share
> some concept patches some time soon, well, maybe in this cycle.  Some
> interesting recent experiments produced graphs that look a lot like
> the ones in the book "Feedback Control for Computer Systems" (an easy
> software-person book I found for people without an engineering/control
> theory background where the problems match our world more closely, cf
> typical texts that are about controlling motors and other mechanical
> stuff...).  Experimental goals are: find the smallest concurrent
> I/O request level (and thus lookahead distance and thus speculative
> work done and buffers pinned) that keeps the I/O stall probability
> near zero (and keep adapting, since other queries and applications are
> sharing system I/O queues), and if that's not even possible, find the
> highest concurrent I/O request level that doesn't cause extra latency
> due to queuing in lower levels (I/O workers, kernel, ...,  disks).
> That second part is quite hard.  In other words, if higher levels own
> that problem and bring the adaptivity, then perhaps io_method=worker
> can get away with being quite stupid.  Just a thought...
>

Looking forward to it. And thanks for the reminder about the talk -- I've
been meaning to watch it for a long time, but somehow haven't managed to yet.
