URL: <http://savannah.gnu.org/bugs/?22861>
Summary: bogus answer from pflocal to io_select SELECT_URG
Project: The GNU Hurd
Submitted by: None
Submitted on: Sunday 04/06/2008 at 23:52 UTC
Category: Hurd Servers
Severity: 3 - Normal
Priority: 5 - Normal
Item Group: None
Status: None
Privacy: Public
Assigned to: None
Originator Name:
Originator Email:
Open/Closed: Open
Discussion Lock: Any
Reproducibility: None
Size (loc): None
Planned Release: None
Effort: 0.00
Wiki-like text discussion box:

_______________________________________________________

Details:

If a program first calls pipe() and then calls select() in such a way that one of the pipe file descriptors is in exceptfds but not in readfds nor writefds, the select() call returns almost immediately and leaves the pipe in exceptfds, as if an exception had occurred there.

This bug completely wrecks the ELinks web browser, which uses a pipe for internal communication and ends up closing the pipe because it thinks an error has occurred.

rpctrace shows that _hurd_select in glibc sends an io_select_request (4) to pflocal and gets back io_select_reply (0 0), which it considers a "bogus answer": if the pipe is not ready for any I/O, then pflocal should not have responded yet. And so, glibc substitutes SELECT_ALL, which includes SELECT_URG.

Why did pflocal send the bogus answer, then? When *select_type is initially SELECT_URG without other flags, S_io_select in hurd/pflocal/io.c first resets *select_type to zero and then calls pipe_pair_select in hurd/libpipe/pipe.c. That function notices that *select_type is neither SELECT_READ nor SELECT_WRITE, and apparently assumes it is SELECT_READ | SELECT_WRITE and waits for either condition. One of the conditions is satisfied, but *select_type remains zero, and S_io_select then propagates the zero to io_select_response.

Possible fixes:

(a) Change _hurd_select in glibc to completely ignore responses that have select_type=0. I.e. change the protocol so that a server can send back io_select_reply (0 0) to mean that the requested events cannot ever occur. This would however make the timeout in _hurd_select more complex to implement. Currently, _hurd_select simply passes the timeout to Mach when waiting for the first response to the io_select_requests it has sent out; after getting the first response, it collects any further replies with a zero timeout, and then returns. If _hurd_select were changed to ignore the first response in some cases, it would have to keep track of how much time it has spent.

(b) Change pflocal to detect when io_select_request is asking solely for events that cannot ever occur, and sleep until the caller closes the reply port. Similar changes may be needed in other servers. The downside is that this would needlessly tie up a thread in the server.

(c) As in (b) but instead of sleeping, discard the right to the reply port without sending any response, and return the thread to more productive use. However, there seem to be two problems with this. Firstly, it may be difficult to make MIG-generated stubs let the server skip the response. The Mach 3 Server Writer's Guide mentions that the server routine can return MIG_NO_REPLY to do this, but I did not find that symbol in the GNU MiG 1.3 sources. Secondly, discarding the reply right might trigger a no-senders notification in the client. It appears though that _hurd_select does not currently request such a notification.

This bug was discussed in December 2007 on the debian-hurd and elinks-dev mailing lists, under the subject "The Links/Links2/ELinks browsers are unusable on Debian GNU/Hurd".
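
The reproduction described under Details (a pipe whose descriptor appears only in exceptfds) can be sketched in C roughly as follows. This is a minimal illustration, not code from ELinks, glibc or the Hurd: the helper name select_pipe_exceptfds_only and its timeout parameter are hypothetical, chosen only for this example.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper (not from the bug report): create a pipe, then call
   select() with the read end present only in exceptfds.
   Returns select()'s result: 0 means the timeout expired with no
   exceptional condition reported; -1 means pipe() or select() failed. */
int select_pipe_exceptfds_only(long timeout_sec)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    fd_set exceptfds;
    FD_ZERO(&exceptfds);
    FD_SET(fds[0], &exceptfds);

    struct timeval tv;
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    /* readfds and writefds are NULL, so only exceptional conditions
       are requested, matching the case described in the report. */
    int n = select(fds[0] + 1, NULL, NULL, &exceptfds, &tv);

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Since nothing is written to the pipe and both ends stay open during the call, a return value of 0 after the timeout would be the expected outcome. An immediate nonzero return would be consistent with the behavior this report describes.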