On Thu, May 04, 2023 at 08:07:44PM -0700, Greg Steuck wrote:
> I'm debugging a non-trivial multithreaded unit test in the current
> version of lang/ghc. It runs into some kind of unexpected condition not
> handled well by GHC. I suspect we do something non-standard to cause
> this behavior. These two ktrace items illustrate the issue:
>
> 12550/209588 T21651 CALL kevent(217,0x211906e98,1,0,0,0x211906e78)
> 12550/209588 T21651 STRU struct kevent { ident=13, filter=EVFILT_WRITE,
> flags=0x11<EV_ADD|EV_ONESHOT>, fflags=0x2<NOTE_EOF>, data=0, udata=0x0 }
> 12550/209588 T21651 RET kevent -1 errno 32 Broken pipe
>
> 12550/209588 T21651 CALL kevent(217,0x211906ee8,1,0,0,0x211906ec8)
> 12550/209588 T21651 STRU struct kevent { ident=13, filter=EVFILT_WRITE,
> flags=0x2<EV_DELETE>, fflags=0x2<NOTE_EOF>, data=0, udata=0x0 }
> 12550/209588 T21651 RET kevent -1 errno 2 No such file or directory
>
> errno 2 is the reason GHC goes berserk, but it seems like the earlier
> return of errno 32 (EPIPE) is the first time things go wrong. I don't
> see EPIPE documented as a valid error in kevent(2). It's also nowhere
> to be found in sys/kern/kern_event.c. This errno value pops up from
> some other place that I can't quickly locate.
>
> So, is EPIPE a valid errno which we should document or a kernel bug?

The EPIPE error relates to the situation where a kevent(2) EVFILT_WRITE
registration on a pipe races with the closing of the pipe's other end.
If the close(2) happens before the kevent registration, kevent(2)
returns EPIPE. If the close(2) happens after the kevent(2) call, the
registered event will trigger.

The EPIPE error is a legacy feature of the kqueue implementation.
I think the system should work correctly without it. When the other end
of the pipe has already been closed, the EVFILT_WRITE event can still
be registered. It just triggers immediately.

As for the ENOENT error from kevent(2), I think the unit test behaves
incorrectly by trying to delete a non-existent event. The registration
failed, after all.

Below is a patch that removes the EPIPE special case. Could you try it?

Index: kern/sys_generic.c
===================================================================
RCS file: src/sys/kern/sys_generic.c,v
retrieving revision 1.155
diff -u -p -r1.155 sys_generic.c
--- kern/sys_generic.c	25 Feb 2023 09:55:46 -0000	1.155
+++ kern/sys_generic.c	6 May 2023 17:10:27 -0000
@@ -769,12 +769,6 @@ pselregister(struct proc *p, fd_set *pib
 				 * __EV_SELECT */
 				error = 0;
 				break;
-			case EPIPE:	/* Specific to pipes */
-				KASSERT(kev.filter == EVFILT_WRITE);
-				FD_SET(kev.ident, pobits[1]);
-				(*ncollected)++;
-				error = 0;
-				break;
 			case ENXIO:	/* Device has been detached */
 			default:
 				goto bad;
@@ -1073,10 +1067,6 @@ again:
 			goto again;
 		}
 		break;
-	case EPIPE:	/* Specific to pipes */
-		KASSERT(kevp->filter == EVFILT_WRITE);
-		pl->revents |= POLLHUP;
-		break;
 	default:
 		DPRINTFN(0, "poll err %lu fd %d revents %02x serial"
 		    " %lu filt %d ERROR=%d\n",
Index: kern/sys_pipe.c
===================================================================
RCS file: src/sys/kern/sys_pipe.c,v
retrieving revision 1.145
diff -u -p -r1.145 sys_pipe.c
--- kern/sys_pipe.c	12 Feb 2023 10:41:00 -0000	1.145
+++ kern/sys_pipe.c	6 May 2023 17:10:27 -0000
@@ -857,9 +857,13 @@ pipe_kqfilter(struct file *fp, struct kn
 		break;
 	case EVFILT_WRITE:
 		if (wpipe == NULL) {
-			/* other end of pipe has been closed */
-			error = EPIPE;
-			break;
+			/*
+			 * The other end of the pipe has been closed.
+			 * Since the filter now always indicates a pending
+			 * event, attach the knote to the read side to proceed
+			 * with the registration.
+			 */
+			wpipe = rpipe;
 		}
 		kn->kn_fop = &pipe_wfiltops;
 		kn->kn_hook = wpipe;
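
For a quick userland check of the poll(2) path touched in
sys_generic.c, something like the following sketch might be useful.
It polls the write end of a pipe whose read end has already been
closed and returns the reported revents bits. The helper name
poll_closed_pipe() is made up for this example and is not part of the
patch. The expectation, which is an assumption to verify rather than a
guarantee, is that poll(2) reports one ready descriptor instead of
failing; which of POLLOUT, POLLERR and POLLHUP is set can differ
between systems.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch: poll the write end of a
 * pipe after its read end has been closed.
 * Returns the revents bits, or -1 if pipe(2) fails or poll(2) does not
 * report exactly one ready descriptor.
 */
int
poll_closed_pipe(void)
{
	struct pollfd pfd;
	int fds[2], n;

	if (pipe(fds) == -1)
		return -1;
	close(fds[0]);		/* close the read end before polling */

	pfd.fd = fds[1];
	pfd.events = POLLOUT;
	pfd.revents = 0;
	n = poll(&pfd, 1, 0);	/* timeout 0: do not block */
	close(fds[1]);

	if (n != 1)
		return -1;
	return pfd.revents;
}
```

Calling poll_closed_pipe() before and after applying the patch and
comparing the returned bits should show whether the poll(2) result
changes once the EPIPE special case is gone.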