Re: cleanup vmm_start_vm, simplifying fd cleanup

Dave Voutila Tue, 11 Apr 2023 07:02:07 -0700


Philip Guenther <guent...@gmail.com> writes:


> On Fri, Apr 7, 2023 at 9:44 AM Dave Voutila <d...@sisu.io> wrote:
> ...
>
>  Touch longer, but won't generate ktrace noise by blind calls to close(2)
>  and also accounts for the other error conditions (EINTR, EIO).
>
>  For EIO, not sure yet how we want to handle it other than log it.
>
>  For EINTR, we want to account for that race and make sure we retry since
>  the vmm process is long-lived and could inadvertently keep things like
>  tty fds or disk image fds open after the guest vm terminates.
>
> So, this is an area where
>  * the current POSIX standard leaves the behavior unspecified
>  * everyone (except HP-UX) made close() *always* invalidate the fd, even if 
> it fails with EINTR
>  * the next version of POSIX requires close() to *not* invalidate the fd if 
> it fails with EINTR:
>     * in those cases where something was interrupted but close() invalidates 
> the fd and previously return EINTR
>       they are supposed to now return EINPROGRESS
>     * also, a new posix_close() interface and POSIX_CLOSE_RESTART #define, 
> yadda yadda,
>  * OpenBSD has not changed to that behavior or added those interfaces at this 
> time, AND
>  * I'm not sure vmd can even hit this issue due to what it'll be closing!
>
> Last things first: when does close() return EINTR?  My understanding is that 
> the only case that happens in OpenBSD is
> interrupt of a FINAL close (no other dup()s open, etc) of a file on NFS with 
> unflushed writes.  I think it could happen
> on other systems on interrupt of close of a rewinding-tape devices (e.g. 
> /dev/rst0) but I don't actually see how that
> would happen in OpenBSD.  Are any of the fds in this case subject to those 
> sorts of concerns?  If they're all either
> duplicates in a forked process, or file types that can't block in close like 
> that (say, sockets and /dev/vmd devices or
> such), then this is all ignorable.

Right, this *should not* be a final close, *but* there's a race here
because this close(2) happens in the parent post-fork(2). If the child
process dies and any fd cleanup occurs there, the close(2) in the parent
*could* be the final, right?

These are a mix of file types: tap(4) devices, a pty(4) device, and
regular files...which may or may not be on NFS.

So if I'm following correctly, there is the edge case where this could
be a final close of a regular file on NFS, *but* at this point in vmd(8)
there aren't any writes so I think this is ignorable after all.

I'm inclined then to remove the loop. While I have an ok from mlarkin@ I
haven't committed yet due to $dayjob. Loop or no loop? I'll defer to you
guenther@.

>
> Meanwhile, the loop you're writing is correct for a future that OpenBSD has 
> not embraced.  You could kind of future-proof
> it by making the loop conditional on POSIX_CLOSE_RESTART being defined: if 
> it's defined then presumably close() has the
> new POSIX behavior, if not (the current state!) then the loop is useless or 
> actively dangerous, depending on whether the
> process is multi-threaded.

In this case, the vmm process in vmd is single-threaded.

>
> I suspect at some point we'll:
>  * map EINTR to EINPROGRESS in sys_close()
>  * #define POSIX_CLOSE_RESTART 0 /* explicitly blessed by POSIX */
>  * add to libc  int posix_close(int fd, int flags) { int ret = close(fd); 
> return flags ? EINVAL : ret; }
>
> 'cause it'll mean less patching for ports, but someone will need to check for 
> anything in tree that tries to be clever
> about close() returning EINTR.
>
> Overall , a great use of everyone's time. /s
>
> Philip

Re: cleanup vmm_start_vm, simplifying fd cleanup

Reply via email to