Rainer Weikusat <r...@doppelsaurus.mobileactivedefense.com> writes:
> Jason Baron <jba...@akamai.com> writes:
>> From: Jason Baron <jba...@akamai.com>
>>
>> The unix_dgram_poll() routine calls sock_poll_wait() not only for the wait
>> queue associated with the socket s that we've called poll() on, but it also
>> calls sock_poll_wait() for a remote peer socket's wait queue, if it's
>> connected.
>> Thus, if we call poll()/select()/epoll() for the socket s, there are then
>> a couple of code paths in which the remote peer socket s2 and its associated
>> peer_wait queue can be freed before poll()/select()/epoll() have a chance
>> to remove themselves from this remote peer socket s2's wait queue.
>
> [...]
>
>> This works because we will continue to get POLLOUT wakeups from
>> unix_write_space(), which is called via sock_wfree().
>
> As pointed out in my original comment, this doesn't work (as far as I
> can/could tell) because it will only wake up sockets which had a chance
> to enqueue datagrams to the queue of the receiving socket as only
> skbuffs enqueued there will be consumed. A socket which is really
> waiting for space in the receiving queue won't ever be woken up in this
> way.

Program which shows that (on 3.2.54 + "local modification", with the 2nd
sock_poll_wait commented out):

---------------
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <sys/poll.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    struct sockaddr_un sun;
    struct pollfd pfd;
    int tg, sk0, sk1, rc;
    char buf[16];

    /* receiving socket, bound to /tmp/tg */
    sun.sun_family = AF_UNIX;
    tg = socket(AF_UNIX, SOCK_DGRAM, 0);
    strncpy(sun.sun_path, "/tmp/tg", sizeof(sun.sun_path));
    unlink(sun.sun_path);
    bind(tg, (struct sockaddr *)&sun, sizeof(sun));

    /* two senders connected to the same receiver */
    sk0 = socket(AF_UNIX, SOCK_DGRAM, 0);
    connect(sk0, (struct sockaddr *)&sun, sizeof(sun));

    sk1 = socket(AF_UNIX, SOCK_DGRAM, 0);
    connect(sk1, (struct sockaddr *)&sun, sizeof(sun));

    fcntl(sk0, F_SETFL, fcntl(sk0, F_GETFL) | O_NONBLOCK);
    fcntl(sk1, F_SETFL, fcntl(sk1, F_GETFL) | O_NONBLOCK);

    /* fill the receive queue of tg using sk0 only */
    while (write(sk0, "bla", 3) != -1);

    if (fork() == 0) {
        /* sk1 never enqueued anything but waits for space */
        pfd.fd = sk1;
        pfd.events = POLLOUT;
        rc = poll(&pfd, 1, -1);

        _exit(0);
    }

    sleep(3);
    /* consume one datagram; this should wake up the poller */
    read(tg, buf, sizeof(buf));
    wait(&rc);

    return 0;
}
------------

For me, this blocks forever while it should terminate as soon as the
datagram was read. Something else may have changed this behaviour in the
meantime, though.
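
Because the program above hangs when the wakeup is lost, a variant with a
bounded poll() timeout may be easier to run repeatedly. The following is only
a sketch under assumptions, not part of the tested setup: the names
dgram_client() and wakeup_after_read(), the per-process socket path and the
return-value convention are all made up for illustration. It returns 1 if the
poller on sk1 was woken after the read, 0 if poll() timed out (the behaviour
described above), 2 if sk1 was already writable before the read (test
inconclusive, e.g. because the sender ran out of send buffer before the
receive queue filled up), and -1 on error.

```c
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: non-blocking datagram socket connected to addr. */
static int dgram_client(const struct sockaddr_un *addr)
{
    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);

    if (fd == -1)
        return -1;
    if (connect(fd, (const struct sockaddr *)addr, sizeof(*addr)) == -1 ||
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Hypothetical test driver (return convention is an assumption):
 * 1 = woken with an event, 0 = poll timed out, 2 = inconclusive, -1 = error.
 */
int wakeup_after_read(int read_delay_ms, int poll_timeout_ms)
{
    struct sockaddr_un addr;
    struct pollfd pfd;
    char buf[16];
    int tg = -1, sk0 = -1, sk1 = -1, status, rc, result = -1;
    pid_t pid;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    /* per-process path (an assumption) to avoid clashing with other runs */
    snprintf(addr.sun_path, sizeof(addr.sun_path), "/tmp/tg.%ld", (long)getpid());
    unlink(addr.sun_path);

    tg = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (tg == -1 || bind(tg, (struct sockaddr *)&addr, sizeof(addr)) == -1)
        goto out;
    if ((sk0 = dgram_client(&addr)) == -1 || (sk1 = dgram_client(&addr)) == -1)
        goto out;

    /* fill the receiver using sk0 only, as in the original program */
    while (write(sk0, "bla", 3) != -1)
        ;

    pfd.fd = sk1;
    pfd.events = POLLOUT;
    pfd.revents = 0;
    if (poll(&pfd, 1, 0) > 0) {   /* sk1 not blocked: nothing to test */
        result = 2;
        goto out;
    }

    pid = fork();
    if (pid == -1)
        goto out;
    if (pid == 0) {
        /* child: wait for space, but give up after the timeout */
        rc = poll(&pfd, 1, poll_timeout_ms);
        _exit(rc > 0 ? 1 : rc == 0 ? 0 : 3);
    }

    poll(NULL, 0, read_delay_ms);   /* let the child block in poll() first */
    read(tg, buf, sizeof(buf));     /* frees one slot in the receive queue */
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status) &&
        WEXITSTATUS(status) <= 1)
        result = WEXITSTATUS(status);
out:
    if (sk1 != -1) close(sk1);
    if (sk0 != -1) close(sk0);
    if (tg != -1) close(tg);
    unlink(addr.sun_path);
    return result;
}
```

Calling, say, wakeup_after_read(1000, 3000) from main() and printing the
result would then distinguish a lost wakeup (0) from a working one (1)
without leaving a stuck process behind.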