On Mon, Feb 10, 2025 at 2:40 PM Thomas Munro <thomas.mu...@gmail.com> wrote:
> ...
> Problem statement: You want to be able to batch I/O submission, ie
> make a single call to ioring_enter() (and other mechanisms) to start
> several I/Os, but the code that submits is inside StartReadBuffers()
> and the code that knows how many I/Os it wants to start now is at a
> higher level, read_stream.c and in future elsewhere. So you invented
> this flag to tell StartReadBuffers() not to call
> pgaio_submit_staged(), because you promise to do it later, via this
> staging list. Additionally, there is a kind of programming rule here
> that you *must* submit I/Os that you stage, you aren't allowed to (for
> example) stage I/Os and then sleep, so it has to be a fairly tight
> piece of code.
>
> Would the API be better like this?: When you want to create a batch
> of I/Os submitted together, you wrap the work in pgaio_begin_batch()
> and pgaio_submit_batch(), eg the loop in read_stream_lookahead().
> Then bufmgr wouldn't need this flag: when it (or anything else) calls
> smgrstartreadv(), if there is not currently an explicit batch then it
> would be submitted immediately, and otherwise it would only be staged.
> This way, batch construction (or whatever word you prefer for batch)
> is in a clearly and explicitly demarcated stretch of code in one
> lexical scope (though its effect is dynamically scoped just like the
> staging list itself because we don't want to pass explicit I/O
> contexts through the layers), but code that doesn't call those and
> reaches AsyncReadBuffer() or whatever gets an implicit batch of size
> one and that's also OK. Not sure what semantics nesting would have
> but I doubt it matters much.
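
To check that I am reading the proposal correctly, here is a toy model of it in C. This is not PostgreSQL code: every toy_* function and every counter is made up for illustration. They only stand in for pgaio_begin_batch(), pgaio_submit_batch(), pgaio_submit_staged() and smgrstartreadv() as described above, and the real functions may end up looking quite different. Nesting is simply rejected here, since its semantics are left open.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state; hypothetical, not the real AIO data structures. */
int  staged_ios = 0;     /* I/Os staged but not yet submitted */
int  submitted_ios = 0;  /* I/Os handed to the kernel so far */
int  submit_calls = 0;   /* number of submission calls made */
bool in_batch = false;   /* is an explicit batch open? */

/* Stand-in for pgaio_submit_staged(): one call submits everything staged. */
void
toy_submit_staged(void)
{
    if (staged_ios == 0)
        return;
    submit_calls++;
    submitted_ios += staged_ios;
    staged_ios = 0;
}

/* Stand-in for the proposed pgaio_begin_batch(). */
void
toy_begin_batch(void)
{
    assert(!in_batch);      /* assumption: nesting is not allowed */
    in_batch = true;
}

/* Stand-in for the proposed pgaio_submit_batch(): close and submit. */
void
toy_submit_batch(void)
{
    assert(in_batch);
    in_batch = false;
    toy_submit_staged();
}

/*
 * Stand-in for smgrstartreadv(): always stage, and submit immediately
 * unless an explicit batch is open (an implicit batch of size one).
 */
void
toy_start_readv(void)
{
    staged_ios++;
    if (!in_batch)
        toy_submit_staged();
}
```

In this model, a lone toy_start_readv() costs one submission call, while four calls wrapped in toy_begin_batch() / toy_submit_batch() also cost just one. The lower layer needs no flag, and the "you must submit what you stage" rule is confined to the code between the two batch calls.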
I like this idea. If we want to submit a batch, then just submit a batch.

James