On Mon, Feb 17, 2025 at 7:02 AM Andres Freund <and...@anarazel.de> wrote:
> I don't really know enough about IPC::Run's internals to answer. My
> interpretation of how it might work, purely from observation, is that it opens
> one tcp connection for each "pipe" and that that's what's introducing the
> potential of reordering, as the different sockets can have different delivery
> timeframes. If that's it, it seems proxying all the pipes through one
> connection might be an option.
I had a couple of ideas about how to get rid of the intermediate
subprocess. Obviously it can't convert "two pipes are ready" into two
separate socket send() calls that preserve the original order, because
it has no way of knowing that order (unless perhaps it switches to
completion-based I/O). But really, the whole design is ugly and slow.
If we have some capacity to improve IPC::Run, I think we should try to
get rid of the pipe/socket bridge and plug either a pipe or a socket
directly into the target subprocess. But which one?

1. Pipes only: IPC::Run could use IOCP or WaitForMultipleObjects()
instead of select()/poll(). (Rough sketch at the end of this mail.)

2. Sockets only: apparently you can give sockets directly to
subprocesses as stdin/stdout/stderr (also sketched below):
https://stackoverflow.com/questions/4993119/redirect-io-of-process-to-windows-socket

The IPC::Run comments explain that the extra process was needed to be
able to forward all data even if the target subprocess exits without
closing the socket (the linger stuff we have met before in PostgreSQL
itself). I suspect that if we went that way, asynchronous I/O might
fix that too (see my other thread with guesses and demos on that
topic), but it might not be race-free. I don't know. I'd like to know
for PostgreSQL's own sake, but for IPC::Run I think I'd prefer option
1 anyway: if you have to write new native Windows API interactions
either way, you might as well go with the normal native way for
Windows processes to connect standard I/O streams.
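To make option 1 a bit more concrete, here's a rough, untested sketch
of the sort of thing I mean: one overlapped read per pipe, each with
its own event, multiplexed with WaitForMultipleObjects() instead of
select()/poll(). It assumes the pipe handles were opened with
FILE_FLAG_OVERLAPPED (anonymous pipes from CreatePipe() can't do
overlapped I/O, so this would mean switching to named pipes). All the
names and buffer sizes are made up; this only shows the shape, not
real error or per-pipe EOF handling.

    #include <windows.h>
    #include <stdio.h>

    #define NPIPES 2

    /*
     * Drain data from NPIPES overlapped pipe handles without a bridge
     * process, using one event per in-flight read.
     */
    static void
    drain_pipes(HANDLE pipes[NPIPES])
    {
        OVERLAPPED ov[NPIPES] = {0};
        HANDLE     events[NPIPES];
        char       bufs[NPIPES][4096];

        /* Start one overlapped read per pipe, each with its own event. */
        for (int i = 0; i < NPIPES; i++)
        {
            events[i] = CreateEvent(NULL, TRUE, FALSE, NULL);
            ov[i].hEvent = events[i];
            if (!ReadFile(pipes[i], bufs[i], sizeof(bufs[i]), NULL, &ov[i]) &&
                GetLastError() != ERROR_IO_PENDING)
                fprintf(stderr, "ReadFile failed: %lu\n", GetLastError());
        }

        /* Wait for whichever read completes first. */
        for (;;)
        {
            DWORD which = WaitForMultipleObjects(NPIPES, events, FALSE,
                                                 INFINITE);
            DWORD nread;
            int   i = which - WAIT_OBJECT_0;

            if (which == WAIT_FAILED)
                break;
            if (!GetOverlappedResult(pipes[i], &ov[i], &nread, FALSE))
                break;          /* EOF or error on this pipe (simplified) */

            /* ... hand nread bytes from bufs[i] to the caller here ... */

            /* Re-arm; ReadFile resets the event when the I/O starts. */
            if (!ReadFile(pipes[i], bufs[i], sizeof(bufs[i]), NULL, &ov[i]) &&
                GetLastError() != ERROR_IO_PENDING)
                break;
        }

        for (int i = 0; i < NPIPES; i++)
            CloseHandle(events[i]);
    }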
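And an equally rough sketch of option 2: handing a connected socket
straight to the child as its stdout/stderr via STARTUPINFO, with no
bridge process. Here "child_end" is assumed to already be a connected
SOCKET; the Stack Overflow thread above suggests it may need to be
created without WSA_FLAG_OVERLAPPED for some programs to accept it as
a standard handle, so take the details with a grain of salt (link
against ws2_32, WSAStartup() assumed to have been done elsewhere).

    #include <winsock2.h>
    #include <windows.h>

    /*
     * Spawn a command with its stdout/stderr plugged directly into the
     * given connected socket.
     */
    static BOOL
    spawn_with_socket_stdio(SOCKET child_end, const char *cmdline_const)
    {
        STARTUPINFOA        si = { sizeof(si) };
        PROCESS_INFORMATION pi;
        char                cmdline[1024];
        BOOL                ok;

        /* CreateProcess may scribble on the command line, so copy it. */
        lstrcpynA(cmdline, cmdline_const, sizeof(cmdline));

        /* Make the socket handle inheritable by the child. */
        SetHandleInformation((HANDLE) child_end, HANDLE_FLAG_INHERIT,
                             HANDLE_FLAG_INHERIT);

        /* Plug the socket into the child's standard streams. */
        si.dwFlags = STARTF_USESTDHANDLES;
        si.hStdInput = GetStdHandle(STD_INPUT_HANDLE);
        si.hStdOutput = (HANDLE) child_end;
        si.hStdError = (HANDLE) child_end;

        ok = CreateProcessA(NULL, cmdline, NULL, NULL,
                            TRUE /* inherit handles */, 0, NULL, NULL,
                            &si, &pi);
        if (ok)
        {
            /* The parent doesn't need the child's end any more. */
            closesocket(child_end);
            CloseHandle(pi.hThread);
            CloseHandle(pi.hProcess);
        }
        return ok;
    }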