On Tue, Sep 17, 2019 at 4:20 PM Josh Hunt <joh...@akamai.com> wrote:
>
> I was running some tests recently with the udpgso_bench_tx benchmark in
> selftests and noticed that in some configurations it reported sending
> more than line rate! Looking into it more I found that I was overflowing
> the qdisc queue and so it was sending back NET_XMIT_DROP however this
> error did not propagate back up to the application and so it assumed
> whatever it sent was done successfully. That's when I learned about
> IP_RECVERR and saw that the benchmark isn't using that socket option.
>
> That's all fairly straightforward, but what I was hoping to get
> clarification on is where is the line drawn on when or when not to send
> ENOBUFS back to the application if IP_RECVERR is *not* set? My guess
> based on going through the code is that as long as the packet leaves the
> stack (in this case sent to the qdisc) that's where we stop reporting
> ENOBUFS back to the application, but can someone confirm?

Once a packet is queued the system call may return, so any subsequent
drops after dequeue are not propagated back. The relevant rc is set in
__dev_xmit_skb on q->enqueue. On setups with multiple devices, such as
a tunnel or bonding path, the enqueue result on the lower device is
similarly not propagated.

> For example, we sanitize the error in udp_send_skb():
> send:
>         err = ip_send_skb(sock_net(sk), skb);
>         if (err) {
>                 if (err == -ENOBUFS && !inet->recverr) {
>                         UDP_INC_STATS(sock_net(sk),
>                                       UDP_MIB_SNDBUFERRORS, is_udplite);
>                         err = 0;
>                 }
>         } else
>
>
> but in udp_sendmsg() we don't:
>
>         if (err == -ENOBUFS || test_bit(SOCK_NOSPACE,
> &sk->sk_socket->flags)) {
>                 UDP_INC_STATS(sock_net(sk),
>                               UDP_MIB_SNDBUFERRORS, is_udplite);
>         }
>         return err;

That's interesting. My --incorrect-- understanding until now had been
that IP_RECVERR does nothing but enable optional extra detailed error
reporting on top of system call error codes.

But indeed it enables backpressure being reported as a system call
error that is suppressed otherwise. I don't know why. The behavior
precedes git history.

> In the case above it looks like we may only get ENOBUFS for allocation
> failures inside of the stack in udp_sendmsg() and so that's why we
> propagate the error back up to the application?

Both the UDP lockless fast path and the slow corked path go through
udp_send_skb, so the backpressure is suppressed consistently across
both cases.

Indeed the error handling in udp_sendmsg then is not related to
backpressure, but to other causes of ENOBUFS, i.e., allocation failure.
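
To make the udp_send_skb() behavior above concrete, here is a minimal
userspace sketch of that one check. It is a model under stated
assumptions, not kernel code: the function name and the plain counter
are hypothetical stand-ins, with 'err' playing the role of the
ip_send_skb() return value, 'recverr' the role of inet->recverr, and
the counter the role of UDP_MIB_SNDBUFERRORS.

```c
#include <assert.h>
#include <errno.h>

/*
 * Userspace model only (hypothetical name and signature). Mirrors the
 * check quoted from udp_send_skb():
 *   err             stands in for the return value of ip_send_skb()
 *   recverr         stands in for inet->recverr (IP_RECVERR set or not)
 *   snd_buf_errors  stands in for the UDP_MIB_SNDBUFERRORS counter
 * Returns the value the send path would hand back to the caller.
 */
static int model_udp_send_skb_err(int err, int recverr,
                                  unsigned long *snd_buf_errors)
{
        assert(snd_buf_errors);

        if (err == -ENOBUFS && !recverr) {
                /* Backpressure without IP_RECVERR: count it, then hide it. */
                (*snd_buf_errors)++;
                return 0;
        }

        /* Anything else, including ENOBUFS with IP_RECVERR, passes through. */
        return err;
}
```

With recverr == 0 a -ENOBUFS input comes back as 0 and only the counter
moves, which matches the benchmark believing every send succeeded. With
recverr != 0 the same input comes back unchanged as -ENOBUFS, and in
this model the counter is not touched.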