On 2021-01-08 2:21 p.m., Shannon Nelson wrote:
> On 1/8/21 10:26 AM, Jesse Brandeburg wrote:
>> Shannon Nelson wrote:
>>> On 1/6/21 1:55 PM, Jesse Brandeburg wrote:
>>>> When drivers call the various receive upcalls to receive an skb
>>>> to the stack, sometimes that stack can drop the packet. The good
>>>> news is that the return code is given to all the drivers of
>>>> NET_RX_DROP or GRO_DROP. The bad news is that no drivers except
>>>> the one "ice" driver that I changed, check the stat and increment
>>> If the stack is dropping the packet, isn't it up to the stack to track
>>> that, perhaps with something that shows up in netstat -s? We don't
>>> really want to make the driver responsible for any drops that happen
>>> above its head, do we?
>> I totally agree!
>> In patch 2/2 I revert the driver-specific changes I had made in an
>> earlier patch, and this patch *was* my effort to make the stack show the
>> drops.
>> Maybe I wasn't clear. I'm seeing packets disappear during TCP
>> workloads, and this GRO_DROP code was the source of the drops (I see it
>> returning infrequently but regularly)
>> The driver processes the packet but the stack never sees it, and there
>> were no drop counters anywhere tracking it.
> My point is that the patch increments a netdev counter, which to my mind
> immediately implicates the driver and hardware, rather than the stack.
> As a driver maintainer, I don't want to be chasing driver packet drop
> reports that are a stack problem. I'd rather see a new counter in
> netstat -s that reflects the stack decision and can better imply what
> went wrong. I don't have a good suggestion for a counter name at the
> moment.
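
To make the distinction under discussion concrete, here is a minimal
userspace sketch in C of a drop counter owned by the stack rather than by
the netdev. It is not kernel code: the enum values, struct names, the
gro_dropped counter, and mock_stack_receive() are all hypothetical
stand-ins invented for illustration, and the drop decision is passed in
as a flag instead of coming from real GRO logic.

```c
#include <stdint.h>

/* Local stand-ins for the kernel's result codes; values are illustrative. */
enum rx_result { RX_OK = 0, RX_DROP = 1 };

/* Hypothetical per-device counters, as a driver would expose them. */
struct dev_stats {
    uint64_t rx_packets;
    uint64_t rx_dropped;
};

/* Hypothetical stack-wide counters, the kind netstat -s could report. */
struct stack_stats {
    uint64_t gro_dropped;
};

static struct stack_stats stack_stats;

/*
 * Mock receive upcall. When the stack decides to drop, it accounts for
 * the drop itself, so the caller does not have to.
 */
static enum rx_result mock_stack_receive(int should_drop)
{
    if (should_drop) {
        stack_stats.gro_dropped++;
        return RX_DROP;
    }
    return RX_OK;
}

/*
 * Mock driver receive path. The driver counts what it handed up and
 * leaves rx_dropped alone, because the drop happened above it.
 */
static enum rx_result driver_rx(struct dev_stats *dev, int should_drop)
{
    dev->rx_packets++;
    return mock_stack_receive(should_drop);
}
```

With this split, a drop shows up in the stack's counter while the
device's rx_dropped stays at zero, so a reader of the statistics is not
pointed at the driver or hardware for a decision the stack made.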
I guess part of the issue is that this is right on the boundary of
driver-stack. But if we follow Eric's suggestions, maybe the problem
magically goes away :-) .
So: How does one know that the stack-upcall dropped a packet because
of GRO issues? Debugging with kprobe or traces doesn't count as an
answer.
cheers,
jamal