The commit 7d4c04fc170087119727 ("net: add option to enable error queue packets waking select") (2013-03-28) introduced SO_SELECT_ERR_QUEUE, which masks POLLPRI with POLLERR event return in some socket poll callbacks.
POLLERR event issued with sock_queue_err_skb() did not wake up a poll when POLLERR is the only requested event because sk_data_ready() (sock_def_readable()) was used and it doesn't mask POLLERR in poll wake up: wake_up_interruptible_sync_poll(&wq->wait, EPOLLIN | EPOLLPRI | EPOLLRDNORM | EPOLLRDBAND); If POLLIN or POLLPRI are requested, for example, poll does wake up. POLLERR wakeup by requesting POLLPRI is possible without set SO_SELECT_ERR_QUEUE. All the option does is masking POLLPRI as a returned event before poll returns. poll would return anyway because of POLLERR. Also, the sentence "[...] enable software to wait on error queue packets without waking up for regular data on the socket." from the above commit is not true. A POLLIN event issued via sock_def_readable() wakes up threads waiting for POLLPRI, and vice versa. However, poll() does not return, sleeping again, as the requested events do not match events. The commit 6e5d58fdc9bedd0255a8 ("skbuff: Fix not waking applications when errors are enqueued") (2018-03-14) make POLLERR alone wake up poll. It replaces sk_data_ready() (sock_def_readable()) with sk_error_report() (sock_def_error_report()). This makes "POLLERR wake up by requesting POLLPRI" obsolete. Rationale: POLLIN-only and POLLERR-only wake up are useful when there is a receiving thread, a sending thread, and a thread that get transmit timestamps. The thread polling on POLLERR will not wake up when regular data arrives (POLLIN). The thread polling on POLLIN will not wake up when tx timestamps are ready (POLLERR). One solution is adding an option that disable POLLERR as requested event. This is in the Virtual File System subsystem, not in the network, though. This solves the problem of waking up other threads that not interested in error queue. Thus allowing a separate thread take care of error queue (useful for receiving transmit timestamps). However, this may not be a good solution as it depends on polling behavior. Thoughts? By the way, as the kernel development shouldn't break user space, SO_SELECT_ERR_QUEUE can become a compatibility option. P.S.: The kernel is slowly getting bigger (and perhaps messy) with compatibility code. One day, the compatibility code should be moved to a compatibility-only place, or completely dropped (unlikely). pasquali