On Thu, Nov 3, 2016 at 7:10 PM, Hannes Frederic Sowa <han...@stressinduktion.org> wrote: > [also cc'ed Andy, albeit this doesn't seem to solve his initial problem, > right? <http://www.spinics.net/lists/netdev/msg331753.html>]
Indeed, this does not help disambiguate the source of an error returned by a socketcall. It only reduces how often sk_err is leaked into the socket call datapath. Local errors and timestamps never set sk_err in the first place, so there can be no expectation that they set it on dequeue. I suspect that that line in sock_dequeue_err_skb originates from icmp errors, as those do set sk_err whenever queueing an error. So reenabling it when there are multiple errors on the queue makes sense, if all queued errors would be icmp errors. But they are not. And that path is racy, so there is no guarantee, either way. Processes that rely on syscall blocking to read the errqueue still block for icmp errors after this patch, due to the initial sk_err assignment on enqueue. Note that the comment in the patch that only TCP socket calls are blocked is probably incorrect. TCP has its own check in tcp_sendmsg and returns EPIPE when an sk_err is set, without clearing it. But other send, recv, connect, .. implementations include a path to sock_error that aborts the call (e.g., in __skb_try_recv_datagram), returns the value of sk_err, and clears it. This is an inconsistency between the protocols. Andy's original problem mentions recvmsg. We could probably make that unambiguous by returning a cmsg with the value of sk_err if RECVERR is set and sk_err is the source of the error. But this still does not solve the general issue, as other socketcalls, like send and connect, can also return sk_err. We can modify the number of places that check for it and currently abort system calls. Mainly sock_error(). One option is to add a socket option (SOL_SOCKET, as sk_err is not limited to when INET_RECVERR is set) to make that a noop. If the caller sets this option it instead has to wait with poll to receive POLLERR and call getsockopt SO_ERROR and recv MSG_ERRQUEUE to get the asynchronous error.