On 08/05/18 09:40, Daniel P. Berrangé wrote:
> Libvirt CI recently started running "make check" on our FreeBSD 10 & 11
> hosts, and we see frequent failure of the test-poll unit test in gnulib
> IIUC, gnulib is not actually building a replacement poll() function
> on FreeBSD, it is merely running the gnulib test suite against the
> FreeBSD system impl of poll() and hitting this behaviour.
>
>   $ ./gnulib/tests/test-poll
>   Unconnected socket test... passed
>   Connected sockets test... failed (expecting POLLHUP after shutdown)
>   General socket test with fork... failed (expecting POLLHUP after shutdown)
>   Pipe test... passed
>
> Looking at the first failure in test_socket_pair method.
>
> It sets up a listener socket, connects a client, accepts the client
> and then closes the remote end. It expects the server's client socket
> to thus show POLLHUP or POLLERR.
>
> When it fails, the poll() is in fact still showing POLLOUT. If you put
> a sleep between the close () and poll() calls, it will succeed.
>
> So, IIUC, the test is racing with the BSD kernel's handling of socket
> close - the test can't assume that just because the remote end of the
> client has been closed, that poll() will immediately show POLLHUP|ERR.
>
> Anyone have ideas on how to make this test more reliable and not depend
> on the kernel synchronizing the close() state with poll() results
> immediately ?
>
> Regards,
> Daniel
>
Yes, that test looks racy, as the network shutdown is async.
How about we s/nowait/wait/ and only check for input events?
The following works on Linux at least:

--- tests/test-poll.c	2018-05-14 23:46:09.595448490 -0700
+++ pb/gltests/test-poll.c	2018-05-14 23:45:46.827048159 -0700
@@ -334,8 +334,9 @@

   test_pair (c1, c2);
   close (c1);
-  ASSERT (write (c2, "foo", 3) == 3);
-  if ((poll1_nowait (c2, POLLIN | POLLOUT) & (POLLHUP | POLLERR)) == 0)
+
+  (void) write (c2, "foo", 3); // Initiate shutdown
+  if ((poll1_wait (c2, POLLIN) & (POLLHUP | POLLERR)) == 0)
     failed ("expecting POLLHUP after shutdown");

   close (c2);
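
For anyone who wants to try the idea outside the test suite, here is a rough standalone sketch of "block in poll() and ask only for input events". It is not the gnulib code: poll_input_wait and revents_after_peer_close are hypothetical names, socketpair() stands in for the listen/connect/accept setup to keep it short, and the 5000 ms timeout is an arbitrary bound so a broken kernel cannot hang the run. The exact bits reported (POLLIN, POLLHUP, or both) differ between kernels, so the sketch only checks that at least one of them shows up.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not gnulib's poll1_wait): block for up to
   TIMEOUT_MS milliseconds until FD reports an input-side event.
   Returns the revents bits, or 0 on timeout or poll() failure.  */
static int
poll_input_wait (int fd, int timeout_ms)
{
  struct pollfd pfd;

  pfd.fd = fd;
  pfd.events = POLLIN;          /* POLLOUT is left out on purpose */
  pfd.revents = 0;
  return poll (&pfd, 1, timeout_ms) > 0 ? pfd.revents : 0;
}

/* Close one end of a socket pair and report what the other end sees.
   socketpair() is an assumption made for brevity; the real test uses
   TCP sockets.  Returns the revents bits, or -1 if the pair cannot
   be created.  */
static int
revents_after_peer_close (void)
{
  int sv[2];
  char buf[1];
  int revents;
  ssize_t n;

  if (socketpair (AF_UNIX, SOCK_STREAM, 0, sv) < 0)
    return -1;

  close (sv[0]);
  revents = poll_input_wait (sv[1], 5000);

  /* Once the close has been noticed, read() reports end-of-file.  */
  if (revents & (POLLIN | POLLHUP))
    {
      n = read (sv[1], buf, sizeof buf);
      assert (n == 0);
    }

  close (sv[1]);
  return revents;
}
```

Because POLLOUT is not requested, a socket that is merely still writable cannot satisfy the poll() early; the call returns only when the close becomes visible or the timeout expires.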