On Sun, Aug 25, 2024 at 5:17 AM Heikki Linnakangas <hlinn...@iki.fi> wrote:
> On 07/08/2024 17:59, Heikki Linnakangas wrote:
> > I'm also wondering about the relationship between interrupts and
> > latches. Currently, SendInterrupt sets a latch to wake up the target
> > process. I wonder if it should be the other way 'round? Move all the
> > wakeup code, with the signalfd, the self-pipe etc to interrupt.c, and in
> > SetLatch, call SendInterrupt to wake up the target process? Somehow that
> > feels more natural to me, I think.
>
> I explored that a little, see attached patch set. It's going towards the
> same end state as your patches, I think, but it starts from different
> angle. In a nutshell:
>
> Remove Latch as an abstraction, and replace all use of Latches with
> Interrupts. When I originally created the Latch abstraction, I imagined
> that we would have different latches for different purposes, but in
> reality, almost all code just used the general-purpose "process latch".
> this patch accepts that reality and replaces the Latch struct directly
> with the interrupt mask in PGPROC.

Some very initial reactions:

* I like it!

* This direction seems to fit quite nicely with future ideas about
asynchronous network I/O. That may sound unrelated, but imagine that a
future version of WaitEventSet is built on Linux io_uring (or Windows
iorings, or Windows IOCP, or kqueue), and waits for the kernel to tell
you that network data has been transferred directly into a user space
buffer. You could wait for the interrupt word to change at the same
time by treating it as a futex[1]. Then all that other stuff --
signalfd, is_set, maybe_sleeping -- just goes away, and all we have left
is one single word in memory. (That it is possible to do that is not
really a coincidence, as our own Mr Freund asked Mr Axboe to add it[2].
The existing latch implementation techniques could be used as fallbacks,
but when looked at from the right angle, once you squish all the wakeup
reasons into a single word, it's all just an implementation of a
multiplexable futex with extra steps.)

* Speaking of other problems in other threads that might be solved by
this redesign, I think I see the outline of some solutions to the
problem of different classes of wakeup which you can handle at different
times, using masks. There is a tension in a few places where we want to
handle some kind of interrupts but not others in localised wait points,
which we sort of try to address by holding interrupts or holding cancel
interrupts, but it's not satisfying and there are some places where it
doesn't work well. Needs a lot more thought, but a basic step would be:
after old_interrupt_vector = pg_atomic_fetch_or_u32(interrupt_vector,
new_bits), if (old_interrupt_vector & new_bits) == new_bits, then you
didn't actually change any bits, so you probably don't really need to
wake the other backend.
If someone is currently unable to handle that type of interrupt (has
ignored, i.e. not cleared, those bits) or is already in the process of
handling it (is currently being rescheduled but hasn't cleared those
bits yet), then you don't bother to wake it up. Concretely, it could
mean that we avoid some of the useless wakeup storm problems we see in
vacuum delays or while executing a query and not in a good place to
handle sinval wakeups, etc. These are just some raw thoughts, I am not
sure about the bigger picture of that topic yet.

* Archeological note on terminology: the reason almost every relational
database and all the literature uses the term "latch" for something like
our LWLocks seems to be that latches were/are one of the kinds of
system-provided mutex on IBM System/370 and its modern descendants, i.e.
z/OS. Oracle and other systems that started as knock-offs of IBM
System R (the original SQL system, of which DB2 is the modern heir)
continued that terminology, even though they ran on VMS or Unix or
whatever. I would not be sad if we removed our unusual use of the term
latch.

[1] https://man7.org/linux/man-pages/man3/io_uring_prep_futex_wait.3.html
[2] https://lore.kernel.org/lkml/20230720221858.135240-1-ax...@kernel.dk/
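
PS: To make the fetch-or idea above a bit more concrete, here is a
rough standalone sketch. It is not code from the patch set. It uses
C11 <stdatomic.h> in place of pg_atomic_fetch_or_u32() so that it
compiles outside the tree, and raise_interrupt(), INTR_SINVAL and
INTR_CANCEL are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interrupt bits, for illustration only. */
#define INTR_SINVAL (UINT32_C(1) << 0)
#define INTR_CANCEL (UINT32_C(1) << 1)

/*
 * Set new_bits in the target's interrupt word.
 *
 * Returns true if at least one of the bits was previously clear, meaning
 * the caller changed the word and should wake the target.  Returns false
 * if every requested bit was already set: the target either cannot handle
 * that interrupt right now or has already been woken for it, so another
 * wakeup would probably be wasted.
 */
static bool
raise_interrupt(_Atomic uint32_t *interrupt_vector, uint32_t new_bits)
{
    uint32_t old_interrupt_vector;

    assert(new_bits != 0);

    /* Atomically OR in the bits and fetch the previous value. */
    old_interrupt_vector = atomic_fetch_or(interrupt_vector, new_bits);

    return (old_interrupt_vector & new_bits) != new_bits;
}
```

A sender would then do the expensive part (whatever the wakeup
mechanism turns out to be) only when raise_interrupt() returns true.
Whether that is safe in every case depends on exactly when the receiver
clears its bits, which is the part that needs more thought.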