> On Feb 13, 2024, at 7:12 AM, Slava Ovsiienko <viachesl...@nvidia.com> wrote:
> 
> Hi,
> 
> Regarding "dev_out_of_buffer": it is a global counter that relates to the
> whole device port, including queues not managed by the DPDK application.
> Mellanox/Nvidia NICs operate in "bifurcated mode", so there might be queues
> managed by the kernel or by another DPDK application. Not sure it makes a
> lot of sense, but I have no strong objections.
These are still helpful for debugging in a lab environment, but it would be
good to document them.
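
For anyone documenting these, here is a minimal sketch in C of how a name
table like the one in the patch could be matched against the counter names
the kernel driver reports. This is not the mlx5 PMD code: struct
counter_ctrl, find_kernel_index() and map_counters() are hypothetical names,
and only the two fields visible in the patch (dpdk_name, ctr_name) are
modelled. A counter the kernel driver does not report gets index -1, which
is the case the commit message describes for outbound_pci_buffer_overflow
and dev_out_of_buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a PMD counter table entry.
 * Only the two fields visible in the patch are modelled. */
struct counter_ctrl {
	const char *dpdk_name; /* name exposed to the DPDK application */
	const char *ctr_name;  /* name reported by the kernel driver */
};

/* A subset of the counters added by the patch. */
static const struct counter_ctrl pci_counters[] = {
	{ "outbound_pci_stalled_rd", "outbound_pci_stalled_rd" },
	{ "outbound_pci_stalled_wr", "outbound_pci_stalled_wr" },
	{ "outbound_pci_buffer_overflow", "outbound_pci_buffer_overflow" },
	{ "dev_out_of_buffer", "dev_out_of_buffer" },
};

#define N_PCI_COUNTERS (sizeof(pci_counters) / sizeof(pci_counters[0]))

/* Return the position of ctr_name in the kernel's name list, or -1. */
int
find_kernel_index(const char *ctr_name, const char *const *kernel_names,
		  size_t n)
{
	assert(ctr_name != NULL);
	for (size_t i = 0; i < n; i++)
		if (strcmp(kernel_names[i], ctr_name) == 0)
			return (int)i;
	return -1;
}

/* Fill idx_out[i] with the kernel index of pci_counters[i] (or -1 when the
 * kernel does not report it). Returns how many counters were found.
 * idx_out must have room for N_PCI_COUNTERS entries. */
size_t
map_counters(const char *const *kernel_names, size_t n, int *idx_out)
{
	size_t found = 0;

	for (size_t i = 0; i < N_PCI_COUNTERS; i++) {
		idx_out[i] = find_kernel_index(pci_counters[i].ctr_name,
					       kernel_names, n);
		if (idx_out[i] >= 0)
			found++;
	}
	return found;
}
```

Passing map_counters() the names printed by "ethtool -S <ifname>" would leave
-1 for any counter the running kernel driver does not expose.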

> 
> The PCI related counters are also global and reflect statistics impacted by
> the PCI activity of the whole physical device, including all the network
> ports located on the same NIC board (and, sometimes, by internal activity
> in BlueField).
> 
> As I said, no objections from my side:
> 
> Acked-by: Viacheslav Ovsiienko <viachesl...@nvidia.com>
> 
> With best regards,
> Slava
> 
>> -----Original Message-----
>> From: Wathsala Vithanage <wathsala.vithan...@arm.com>
>> Sent: Friday, February 9, 2024 10:42 PM
>> To: NBU-Contact-Thomas Monjalon (EXTERNAL) <tho...@monjalon.net>;
>> Dariusz Sosnowski <dsosnow...@nvidia.com>; Slava Ovsiienko
>> <viachesl...@nvidia.com>; Ori Kam <or...@nvidia.com>; Suanming Mou
>> <suanmi...@nvidia.com>; Matan Azrad <ma...@nvidia.com>
>> Cc: dev@dpdk.org; n...@arm.com; Wathsala Vithanage
>> <wathsala.vithan...@arm.com>; Honnappa Nagarahalli
>> <honnappa.nagaraha...@arm.com>
>> Subject: [PATCH] net/mlx5: enable PCI related counters
>> 
>> Versions of Mellanox NICs starting from CX5 have device counters related
>> to PCI. These counters are helpful in debugging IO bottlenecks. For
>> instance, the outbound_pci_stalled_rd and outbound_pci_stalled_wr counters
>> can help with identifying NIC stalls due to insufficient PCI credits, which
>> otherwise would have required a PCI analyzer or a sophisticated PCI root
>> port with a PMU.
>> Currently none of these are available in the MLX5 PMD even though ethtool
>> is capable of reading some of them.
>> Since the PMD uses the same ioctl as ethtool (SIOCETHTOOL) and reads via
>> the kernel driver, it is possible to add support with ease.
>> There is one more PCI related counter and a device counter that aren't
>> implemented in the Linux driver at the moment. These two are named
>> outbound_pci_buffer_overflow and dev_out_of_buffer respectively. As per
>> Nvidia's documentation, these two counters can tell the number of packets
>> dropped due to PCI buffer overflow and the number of times the device-owned
>> queue did not have enough buffers allocated.
>> 
>> Signed-off-by: Wathsala Vithanage <wathsala.vithan...@arm.com>
>> Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
>> ---
>> .mailmap                                |  1 +
>> drivers/net/mlx5/linux/mlx5_ethdev_os.c | 33 +++++++++++++++++++++++++
>> 2 files changed, 34 insertions(+)
>> 
>> diff --git a/.mailmap b/.mailmap
>> index aa569ff456..f57415f7a1 100644
>> --- a/.mailmap
>> +++ b/.mailmap
>> @@ -1510,6 +1510,7 @@ Walter Heymans <walter.heym...@corigine.com>
>> Wang Sheng-Hui <shh...@gmail.com>
>> Wangyu (Eric) <seven.wan...@huawei.com>
>> Waterman Cao <waterman....@intel.com>
>> +Wathsala Vithanage <wathsala.vithan...@arm.com>
>> Weichun Chen <weichunx.c...@intel.com>
>> Wei Dai <wei....@intel.com>
>> Weifeng Li <liweifen...@126.com>
>> diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
>> index dd5a0c546d..8f1567f6a7 100644
>> --- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
>> +++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
>> @@ -1574,6 +1574,39 @@ static const struct mlx5_counter_ctrl mlx5_counters_init[] = {
>> 		.dpdk_name = "tx_vport_bytes",
>> 		.ctr_name = "vport_tx_bytes",
>> 	},
>> +	/* Device counters */
>> +	{
>> +		.dpdk_name = "rx_pci_signal_integrity",
>> +		.ctr_name = "rx_pci_signal_integrity",
>> +	},
>> +	{
>> +		.dpdk_name = "tx_pci_signal_integrity",
>> +		.ctr_name = "tx_pci_signal_integrity",
>> +	},
>> +	{
>> +		.dpdk_name = "outbound_pci_buffer_overflow",
>> +		.ctr_name = "outbound_pci_buffer_overflow",
>> +	},
>> +	{
>> +		.dpdk_name = "outbound_pci_stalled_rd",
>> +		.ctr_name = "outbound_pci_stalled_rd",
>> +	},
>> +	{
>> +		.dpdk_name = "outbound_pci_stalled_wr",
>> +		.ctr_name = "outbound_pci_stalled_wr",
>> +	},
>> +	{
>> +		.dpdk_name = "outbound_pci_stalled_rd_events",
>> +		.ctr_name = "outbound_pci_stalled_rd_events",
>> +	},
>> +	{
>> +		.dpdk_name = "outbound_pci_stalled_wr_events",
>> +		.ctr_name = "outbound_pci_stalled_wr_events",
>> +	},
>> +	{
>> +		.dpdk_name = "dev_out_of_buffer",
>> +		.ctr_name = "dev_out_of_buffer",
>> +	},
>> };
>> 
>> static const unsigned int xstats_n = RTE_DIM(mlx5_counters_init);
>> --
>> 2.25.1
> 
