Hey net devs,

I would like some clarity on a problem I ran into last week. I was 
diagnosing a DNS issue and got very sidetracked by how netstat 
reported stats to me. My issue was that UDP packets were being 
dropped by all UDP sockets on the host, so when I ran `netstat -naus` 
and it informed me that UdpInErrors was my main problem, I spent a day 
trying to figure out what application/mechanism was dropping UDP 
packets on the host. My 
suspicion, based on the statistic I was seeing, was that it was going to 
be something like BPF or a security module. To be fair to me, these two 
mechanisms do indeed report their drops within this statistic. 
Imagine my surprise when I discovered that the error that was actually 
happening was that the global UDP socket min was being reached, and all 
the host UDP sockets were, indeed, experiencing buffer errors. The 
problem is that within the regular UDP socket datapath, 
`UDP_MIB_RCVBUFERRORS` only seems to be incremented when the error is 
"ENOMEM". However, when `__sk_mem_raise_allocated` fails, the error 
reported is "ENOBUFS", so the drop is counted only in UdpInErrors. The 
issue ended up being an application that was not processing its 
backlog, because it wasn't closing old UDP sockets. 
IMO, I would have gotten to this diagnosis quicker if, when I ran 
`netstat -naus`, I had gotten UdpRcvbufErrors (`UDP_MIB_RCVBUFERRORS`) 
instead of UdpInErrors. I realize that it is too late to change this error 
reporting now, because it would break user space, but I think a new 
error could be added to the kernel for UDP, such as 
UdpRcvBuffGlobalErrors, or something like that, which could be double 
reported. I think this would be a real time saver for folks, because I 
really think UdpInErrors is counter-intuitively incorrect.
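
For anyone chasing the same symptom, here is a minimal sketch of 
reading the UDP error counters side by side, straight from 
/proc/net/snmp rather than through netstat. It assumes the usual 
layout of that file (a header line of counter names followed by a 
line of values, both prefixed with the protocol name). The helper 
names `parse_snmp`, `udp_error_counters` and `read_udp_errors` are 
made up for this example.

```python
def parse_snmp(text):
    """Parse /proc/net/snmp-style text into {proto: {counter: value}}."""
    stats = {}
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Assumed layout: lines come in pairs, a header line of counter names
    # followed by a line of values, both starting with "Proto:".
    for header, values in zip(lines[0::2], lines[1::2]):
        proto, names = header.split(":", 1)
        value_proto, numbers = values.split(":", 1)
        if proto != value_proto:
            raise ValueError("header/value lines do not match: %r vs %r"
                             % (proto, value_proto))
        stats[proto] = dict(zip(names.split(),
                                (int(n) for n in numbers.split())))
    return stats


def udp_error_counters(stats):
    """Return only the Udp counters whose name ends in 'Errors'."""
    udp = stats.get("Udp", {})
    return {name: value for name, value in udp.items()
            if name.endswith("Errors")}


def read_udp_errors(path="/proc/net/snmp"):
    """Convenience wrapper (hypothetical): read and summarize the live file."""
    with open(path) as f:
        return udp_error_counters(parse_snmp(f.read()))
```

Calling `read_udp_errors()` on the affected host and comparing 
InErrors against RcvbufErrors shows the mismatch described above: 
InErrors keeps climbing while RcvbufErrors does not move.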


Nate Sweet
