On Wed, 2018-11-21 at 22:29 +0100, Paweł Staszewski wrote:
> On 21.11.2018 at 22:14, Toke Høiland-Jørgensen wrote:
> > David Ahern <dsah...@gmail.com> writes:
> >
> > > Paweł ran some more XDP tests yesterday and from it found a couple
> > > of issues. One is a panic in the mlx5 driver unloading the bpf
> > > program (mlx5e_xdp_xmit); he will send a separate email for that
> > > problem.
> > Same as this one, I guess?
> >
> > https://marc.info/?l=linux-netdev&m=153855905619717&w=2
>
> Yes, same as this one.
>
> When there is no traffic (for example with the xdp_fwd program loaded),
> or there is not much traffic, like 1k frames per second for icmp, I can
> load/unload without crashing the kernel.
>
> But when I push tests with pktgen and use more than 50k pps for udp,
> then unbinding the xdp_fwd program makes the kernel panic :)
>

Yeah, this is not precisely an mlx5 issue. It is one of the issues we
discussed at LPC, and I think we all agreed that the XDP redirect
infrastructure must allow different drivers to sync when they are
changing configuration or disabling XDP TX for a moment, so the fix must
be in the XDP redirect infrastructure.

Here is the issue description and a temporary fix that I provided to
Toke:
https://marc.info/?l=linux-netdev&m=154023109526642&w=2

Patch:
https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=topic/xdp-redirect-fix&id=a3652d03cc35fd3ad62744986c8ccaca74c9f20c

> >
> > > The problem I wanted to discuss here is statistics for XDP context.
> > > The short of it is that we need consistency in the counters across
> > > NIC drivers and virtual devices. Right now stats are specific to a
> > > driver with no clear accounting for the packets and bytes handled
> > > in XDP.
> > >
> > > For example virtio has some stats as device private data extracted
> > > via ethtool:
> > > $ ethtool -S eth2 | grep xdp
> > > ...
> > >      rx_queue_3_xdp_packets: 5291
> > >      rx_queue_3_xdp_tx: 0
> > >      rx_queue_3_xdp_redirects: 5163
> > >      rx_queue_3_xdp_drops: 0
> > > ...
> > >      tx_queue_3_xdp_tx: 5163
> > >      tx_queue_3_xdp_tx_drops: 0
> > >
> > > And the standard counters appear to track bytes and packets for Rx,
> > > but not Tx if the packet is forwarded in XDP.
> > >
> > > Similarly, mlx5 has some counters (thanks to Jesper and Toke for
> > > helping out here):
> > >
> > > $ ethtool -S mlx5p1 | grep xdp
> > >      rx_xdp_drop: 86468350180
> > >      rx_xdp_redirect: 18860584
> > >      rx_xdp_tx_xmit: 0
> > >      rx_xdp_tx_full: 0
> > >      rx_xdp_tx_err: 0
> > >      rx_xdp_tx_cqe: 0
> > >      tx_xdp_xmit: 0
> > >      tx_xdp_full: 0
> > >      tx_xdp_err: 0
> > >      tx_xdp_cqes: 0
> > > ...
> > >      rx3_xdp_drop: 86468350180
> > >      rx3_xdp_redirect: 18860556
> > >      rx3_xdp_tx_xmit: 0
> > >      rx3_xdp_tx_full: 0
> > >      rx3_xdp_tx_err: 0
> > >      rx3_xdp_tx_cqes: 0
> > > ...
> > >      tx0_xdp_xmit: 0
> > >      tx0_xdp_full: 0
> > >      tx0_xdp_err: 0
> > >      tx0_xdp_cqes: 0
> > > ...
> > >
> > > And no accounting in standard stats for packets handled in XDP.
> > >
> > > And then if I understand Jesper's data correctly, the i40e driver
> > > does not have device specific data:
> > >
> > > $ ethtool -S i40e1 | grep xdp
> > > [NOTHING]
> > >
> > >
> > > But rather bumps the standard counters:
> > >
> > > sudo ./xdp_rxq_info --dev i40e1 --action XDP_DROP
> > >
> > > Running XDP on dev:i40e1 (ifindex:3) action:XDP_DROP options:no_touch
> > > XDP stats       CPU     pps         issue-pps
> > > XDP-RX CPU      1       36,156,872  0
> > > XDP-RX CPU      total   36,156,872
> > >
> > > RXQ stats       RXQ:CPU pps         issue-pps
> > > rx_queue_index    1:1   36,156,878  0
> > > rx_queue_index    1:sum 36,156,878
> > >
> > >
> > > $ ethtool_stats.pl --dev i40e1
> > >
> > > Show adapter(s) (i40e1) statistics (ONLY that changed!)
> > > Ethtool(i40e1 ) stat: 2711292859 ( 2,711,292,859) <= port.rx_bytes /sec
> > > Ethtool(i40e1 ) stat:    6274204 (     6,274,204) <= port.rx_dropped /sec
> > > Ethtool(i40e1 ) stat:   42363867 (    42,363,867) <= port.rx_size_64 /sec
> > > Ethtool(i40e1 ) stat:   42363950 (    42,363,950) <= port.rx_unicast /sec
> > > Ethtool(i40e1 ) stat: 2165051990 ( 2,165,051,990) <= rx-1.bytes /sec
> > > Ethtool(i40e1 ) stat:   36084200 (    36,084,200) <= rx-1.packets /sec
> > > Ethtool(i40e1 ) stat:       5385 (         5,385) <= rx_dropped /sec
> > > Ethtool(i40e1 ) stat:   36089727 (    36,089,727) <= rx_unicast /sec
> > >
> > >
> > > We really need consistency in the counters and at a minimum, users
> > > should be able to track packet and byte counters for both Rx and Tx
> > > including XDP.
> > >
> > > It seems to me the Rx and Tx packet, byte and dropped counters
> > > returned
> > > for the standard device stats (/proc/net/dev, ip -s li show, ...)
> > > should include all packets managed by the driver regardless of
> > > whether they are forwarded / dropped in XDP or go up the Linux
> > > stack. This also aligns with mlxsw and the stats it shows which are
> > > packets handled by the hardware.
> > >
> > > From there the private stats can include XDP specifics as desired
> > > -- like the drops and redirects, but those should be add-ons and
> > > even here some consistency makes life easier for users.
> > >
> > > The same standards should also be applied to virtual devices built
> > > on top of the ports -- e.g., vlans. I have an API now that allows
> > > bumping stats for vlan devices.
> > >
> > > Keeping the basic xdp packets in the standard counters allows
> > > Paweł, for example, to continue to monitor /proc/net/dev.
> > >
> > > Can we get agreement on this? And from there, get updates to the
> > > mlx5 and virtio drivers?
> > I'd say it sounds reasonable to include XDP in the normal traffic
> > counters, but having the detailed XDP-specific counters is quite
> > useful as well... So can't we do both (for all drivers)?
> >

What are you thinking? Reporting XDP_DROP in the interface dropped
counter, XDP_TX/REDIRECT in the TX counter, and XDP_ABORTED in the
err/drop counter?

How about having a special XDP command in .ndo_bpf that would query the
standardized XDP stats?

> > -Toke
> >
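
To make the mapping question above concrete, here is a rough userspace
sketch in Python (not kernel code) of how per-verdict XDP counts could
fold into the standard counters. The names XDP_TO_STANDARD and
fold_xdp_into_standard are hypothetical, made up for illustration. The
mapping is only the one floated in this thread, not what any driver does
today, and it assumes every XDP-handled packet also counts as rx_packets
and that XDP_REDIRECT is counted as TX on the same device for simplicity.

```python
from collections import Counter

# Hypothetical mapping from XDP verdicts to standard netdev counters.
# Assumption: every packet seen by XDP is also counted in rx_packets,
# following the proposal that standard stats cover all driver traffic.
XDP_TO_STANDARD = {
    "XDP_PASS": ["rx_packets"],
    "XDP_DROP": ["rx_packets", "rx_dropped"],
    "XDP_TX": ["rx_packets", "tx_packets"],
    # Simplification: a real redirect would show up as TX on the target device.
    "XDP_REDIRECT": ["rx_packets", "tx_packets"],
    "XDP_ABORTED": ["rx_packets", "rx_errors"],
}


def fold_xdp_into_standard(xdp_counts):
    """Return standard-counter totals implied by per-verdict XDP counts."""
    std = Counter()
    for verdict, count in xdp_counts.items():
        for name in XDP_TO_STANDARD[verdict]:
            std[name] += count
    return dict(std)


if __name__ == "__main__":
    # Made-up counts, only to show the shape of the result.
    print(fold_xdp_into_standard({"XDP_DROP": 10, "XDP_REDIRECT": 5}))
```

Whether XDP_ABORTED belongs in the error counter or the drop counter is
exactly the open question; changing one entry in the table shows the
effect of either choice.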