Philip Guenther <guent...@gmail.com> writes:
> On Fri, Apr 7, 2023 at 9:44 AM Dave Voutila <d...@sisu.io> wrote: > ... > > Touch longer, but won't generate ktrace noise by blind calls to close(2) > and also accounts for the other error conditions (EINTR, EIO). > > For EIO, not sure yet how we want to handle it other than log it. > > For EINTR, we want to account for that race and make sure we retry since > the vmm process is long-lived and could inadvertently keep things like > tty fds or disk image fds open after the guest vm terminates. > > So, this is an area where > * the current POSIX standard leaves the behavior unspecified > * everyone (except HP-UX) made close() *always* invalidate the fd, even if > it fails with EINTR > * the next version of POSIX requires close() to *not* invalidate the fd if > it fails with EINTR: > * in those cases where something was interrupted but close() invalidates > the fd and previously return EINTR > they are supposed to now return EINPROGRESS > * also, a new posix_close() interface and POSIX_CLOSE_RESTART #define, > yadda yadda, > * OpenBSD has not changed to that behavior or added those interfaces at this > time, AND > * I'm not sure vmd can even hit this issue due to what it'll be closing! > > Last things first: when does close() return EINTR? My understanding is that > the only case that happens in OpenBSD is > interrupt of a FINAL close (no other dup()s open, etc) of a file on NFS with > unflushed writes. I think it could happen > on other systems on interrupt of close of a rewinding-tape devices (e.g. > /dev/rst0) but I don't actually see how that > would happen in OpenBSD. Are any of the fds in this case subject to those > sorts of concerns? If they're all either > duplicates in a forked process, or file types that can't block in close like > that (say, sockets and /dev/vmd devices or > such), then this is all ignorable. Right, this *should not* be a final close, *but* there's a race here because this close(2) happens in the parent post-fork(2). If the child process dies and any fd cleanup occurs there, the close(2) in the parent *could* be the final, right? These are a mix of file types: tap(4) devices, a pty(4) device, and regular files...which may or may not be on NFS. So if I'm following correctly, there is the edge case where this could be a final close of a regular file on NFS, *but* at this point in vmd(8) there aren't any writes so I think this is ignorable after all. I'm inclined then to remove the loop. While I have an ok from mlarkin@ I haven't committed yet due to $dayjob. Loop or no loop? I'll defer to you guenther@. > > Meanwhile, the loop you're writing is correct for a future that OpenBSD has > not embraced. You could kind of future-proof > it by making the loop conditional on POSIX_CLOSE_RESTART being defined: if > it's defined then presumably close() has the > new POSIX behavior, if not (the current state!) then the loop is useless or > actively dangerous, depending on whether the > process is multi-threaded. In this case, the vmm process in vmd is single-threaded. > > I suspect at some point we'll: > * map EINTR to EINPROGRESS in sys_close() > * #define POSIX_CLOSE_RESTART 0 /* explicitly blessed by POSIX */ > * add to libc int posix_close(int fd, int flags) { int ret = close(fd); > return flags ? EINVAL : ret; } > > 'cause it'll mean less patching for ports, but someone will need to check for > anything in tree that tries to be clever > about close() returning EINTR. > > Overall , a great use of everyone's time. /s > > Philip