Hi,

On 2020-03-02 18:28:41 +1300, Thomas Munro wrote:
> I was reading through some old threads[1][2][3] while trying to figure
> out how to add a new GUC to control I/O prefetching for new kinds of
> things[4][5], and enjoyed Simon Riggs' reference to Jules Verne in the
> context of RAID spindles.
>
> On 2 Sep 2015 14:54, "Andres Freund" <andres(at)anarazel(dot)de> wrote:
>
> > On 2015-09-02 18:06:54 +0200, Tomas Vondra wrote:
> > > Maybe the best thing we can do is just completely abandon the "number of
> > > spindles" idea, and just say "number of I/O requests to prefetch".
> > > Possibly with an explanation of how to estimate it (devices * queue length).
> >
> > I think that'd be a lot better.
>
> +many, though I doubt I could describe how to estimate it myself,
> considering cloud storage, SANs, multi-lane NVMe etc.  You basically
> have to experiment, and like most of our resource consumption limits,
> it's a per-backend limit anyway, so it's pretty complicated, but I
> don't see how the harmonic series helps anyone.
>
> Should we rename it?  Here are my first suggestions:
Why rename? It's not like anybody knew how to infer a useful value for
effective_io_concurrency, given the math that computes the actually
effective prefetch distance (sketched at the end of this mail)... I feel
like we'll just unnecessarily cause people difficulty by doing so.

> random_page_prefetch_degree
> maintenance_random_page_prefetch_degree

I don't like these names.

> Rationale for this naming pattern:
> * "random_page" from "random_page_cost"

I don't think we want to corner ourselves into only ever using these for
random I/O.

> * leaves room for a different setting for sequential prefetching

I think if we want to split those at some point, we ought to do so
because we have a good reason, not before. It's not at all clear to me
why you'd want a substantially different queue depth for the two.

> * "degree" conveys the idea without using loaded words like "queue"
> that might imply we know something about the I/O subsystem or that
> it's system-wide like kernel and device queues

Why is that good? Queue depth is a pretty well established term. You can
search for benchmarks of devices with it, correlate it with OS config,
etc.

> * "maintenance_" prefix is like other GUCs that establish (presumably
> larger) limits for processes working on behalf of many user sessions

That part makes sense to me.

> Whatever we call it, I don't think it makes sense to try to model the
> details of any particular storage system. Let's use a simple counter
> of I/Os initiated but not yet known to have completed (for now: it has
> definitely completed when the associated pread() completes; perhaps
> something involving real async I/O completion notification in later
> releases).

+1

Greetings,

Andres Freund
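PS: For anyone who hasn't looked at it recently, the translation I'm
referring to above is roughly the harmonic-series computation in
bufmgr.c's ComputeIoConcurrency(). A minimal standalone sketch of that
formula (simplified names, not the actual backend code):

#include <stdio.h>

/*
 * Rough sketch: the "number of spindles" GUC value is translated into a
 * prefetch distance of n * H(n), where H(n) is the n-th harmonic number,
 * i.e. the expected number of in-flight requests needed to keep n drives
 * busy.
 */
static double
spindles_to_prefetch_distance(int spindles)
{
	double		target = 0.0;

	for (int i = 1; i <= spindles; i++)
		target += (double) spindles / (double) i;

	return target;
}

int
main(void)
{
	for (int n = 1; n <= 10; n++)
		printf("spindles = %2d -> prefetch distance ~ %.1f\n",
			   n, spindles_to_prefetch_distance(n));
	return 0;
}

So effective_io_concurrency = 10 ends up issuing ~29 concurrent
requests, which is part of why the value has been so hard to reason
about.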