> On Feb 13, 2024, at 10:13 AM, Dariusz Sosnowski <dsosnow...@nvidia.com> wrote:
> 
>> -----Original Message-----
>> From: Stephen Hemminger <step...@networkplumber.org>
>> Sent: Saturday, February 10, 2024 02:33
>> To: Wathsala Vithanage <wathsala.vithan...@arm.com>
>> Cc: NBU-Contact-Thomas Monjalon (EXTERNAL) <tho...@monjalon.net>;
>> Dariusz Sosnowski <dsosnow...@nvidia.com>; Slava Ovsiienko
>> <viachesl...@nvidia.com>; Ori Kam <or...@nvidia.com>; Suanming Mou
>> <suanmi...@nvidia.com>; Matan Azrad <ma...@nvidia.com>;
>> dev@dpdk.org; n...@arm.com; Honnappa Nagarahalli
>> <honnappa.nagaraha...@arm.com>
>> Subject: Re: [PATCH] net/mlx5: enable PCI related counters
>> 
>> On Fri,  9 Feb 2024 20:41:42 +0000
>> Wathsala Vithanage <wathsala.vithan...@arm.com> wrote:
>> 
>>> Versions of Mellanox NICs starting from CX5 have device counters
>>> related to PCI. These counters are helpful in debugging IO
>>> bottlenecks. For instance, the outbound_pci_stalled_rd and
>>> outbound_pci_stalled_wr counters can help with identifying NIC stalls
>>> due to insufficient PCI credits, which otherwise would have required a
>>> PCI analyzer or a sophisticated PCI root port with a PMU.
>>> Currently none of these are available in the MLX5 PMD even though
>>> ethtool is capable of reading some of them.
>>> Since the PMD uses the same ioctl as ethtool (SIOCETHTOOL) and reads
>>> via the kernel driver, support is easy to add.
>>> There is one more PCI related counter and a device counter that aren't
>>> implemented in the Linux driver at the moment. These two are named
>>> outbound_pci_buffer_overflow and dev_out_of_buffer respectively. As
>>> per Nvidia's documentation, these two counters report the number of
>>> packets dropped due to PCI buffer overflow and the number of times the
>>> device-owned queue did not have enough buffers allocated.
>>> 
>>> Signed-off-by: Wathsala Vithanage <wathsala.vithan...@arm.com>
>>> Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
>> 
>> Would it be possible to do this at PCI bus layer so all PCI devices have that
>> feature?
> The PCIe performance counters mentioned here are exposed by the NIC itself,
> and the mlx5 kernel driver just passes them to userspace.
> If such a feature were added at the PCI bus layer, we would need to use (or
> add) some additional infrastructure.
> I'm not familiar with what the Linux kernel exposes in terms of PCI counters.
> It's worth looking into.
> I'd assume such data can probably be extracted through a PMU.
In our investigation, we did not find anything that Linux provides in terms of
PCIe PMUs on the PCIe root port. The best we found were these PCIe counters as
seen by the NIC.

It would be good to see other NICs provide similar, and possibly additional,
counters.



> 
> Best regards,
> Dariusz Sosnowski
