> On Feb 13, 2024, at 7:12 AM, Slava Ovsiienko <viachesl...@nvidia.com> wrote:
>
> Hi,
>
> Regarding "dev_out_of_buffer" - it is global counter, relates to the whole device port,
> Including queues not managed by DPDK application - Mellanox/Nvidia NICs operate
> In "bifurcated mode" - there might be queues managed by kernel or another DPDK
> application. Not sure it makes a lot of sense, but I have no strong objections.

These are still helpful to debug in lab environment. But, it would be good to document these.
>
> The PCI related counters are also global ones and reflect statistics, impacted by
> PCI activity of the whole physical device, including all the network ports located
> on the same NIC board (and, sometimes, by internal activity in BlueField).
>
> As I said, no objections from my side:
>
> Acked-by: Viacheslav Ovsiienko <viachesl...@nvidia.com>
>
> With best regards,
> Slava
>
>> -----Original Message-----
>> From: Wathsala Vithanage <wathsala.vithan...@arm.com>
>> Sent: Friday, February 9, 2024 10:42 PM
>> To: NBU-Contact-Thomas Monjalon (EXTERNAL) <tho...@monjalon.net>;
>> Dariusz Sosnowski <dsosnow...@nvidia.com>; Slava Ovsiienko
>> <viachesl...@nvidia.com>; Ori Kam <or...@nvidia.com>; Suanming Mou
>> <suanmi...@nvidia.com>; Matan Azrad <ma...@nvidia.com>
>> Cc: dev@dpdk.org; n...@arm.com; Wathsala Vithanage
>> <wathsala.vithan...@arm.com>; Honnappa Nagarahalli
>> <honnappa.nagaraha...@arm.com>
>> Subject: [PATCH] net/mlx5: enable PCI related counters
>>
>> Versions of Mellanox NICs starting from CX5 have device counters related to PCI.
>> These counters are helpful in debugging IO bottlenecks. For instance, the
>> outbound_pci_stalled_rd and outbound_pci_stalled_wr counters can help with
>> identifying NIC stalls due to insufficient PCI credits, which otherwise would have
>> required a PCI analyzer or a sophisticated PCI root port with a PMU.
>> Currently none of these are available in the MLX5 PMD even though ethtool is
>> capable of reading some of them.
>> Since PMD uses the same ioctl used by ethtool (SIOCETHTOOL) and reads via the
>> kernel driver it is possible to add support with ease.
>> There is one more PCI related counter and a device counter that aren't
>> implemented in the Linux driver at the moment. These two are named
>> outbound_pci_buffer_overflow and dev_out_of_buffer respectively.
>> As per Nvidia's documentation these two counters can tell the number of packets
>> dropped due to pci buffer overflow and the number of times the device owned
>> queue had not enough buffers allocated.
>>
>> Signed-off-by: Wathsala Vithanage <wathsala.vithan...@arm.com>
>> Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
>> ---
>>  .mailmap                                |  1 +
>>  drivers/net/mlx5/linux/mlx5_ethdev_os.c | 33 +++++++++++++++++++++++++
>>  2 files changed, 34 insertions(+)
>>
>> diff --git a/.mailmap b/.mailmap
>> index aa569ff456..f57415f7a1 100644
>> --- a/.mailmap
>> +++ b/.mailmap
>> @@ -1510,6 +1510,7 @@ Walter Heymans <walter.heym...@corigine.com>
>>  Wang Sheng-Hui <shh...@gmail.com>
>>  Wangyu (Eric) <seven.wan...@huawei.com>
>>  Waterman Cao <waterman....@intel.com>
>> +Wathsala Vithanage <wathsala.vithan...@arm.com>
>>  Weichun Chen <weichunx.c...@intel.com>
>>  Wei Dai <wei....@intel.com>
>>  Weifeng Li <liweifen...@126.com>
>> diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
>> index dd5a0c546d..8f1567f6a7 100644
>> --- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
>> +++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
>> @@ -1574,6 +1574,39 @@ static const struct mlx5_counter_ctrl mlx5_counters_init[] = {
>>  		.dpdk_name = "tx_vport_bytes",
>>  		.ctr_name = "vport_tx_bytes",
>>  	},
>> +	/* Device counters */
>> +	{
>> +		.dpdk_name = "rx_pci_signal_integrity",
>> +		.ctr_name = "rx_pci_signal_integrity",
>> +	},
>> +	{
>> +		.dpdk_name = "tx_pci_signal_integrity",
>> +		.ctr_name = "tx_pci_signal_integrity",
>> +	},
>> +	{
>> +		.dpdk_name = "outbound_pci_buffer_overflow",
>> +		.ctr_name = "outbound_pci_buffer_overflow",
>> +	},
>> +	{
>> +		.dpdk_name = "outbound_pci_stalled_rd",
>> +		.ctr_name = "outbound_pci_stalled_rd",
>> +	},
>> +	{
>> +		.dpdk_name = "outbound_pci_stalled_wr",
>> +		.ctr_name = "outbound_pci_stalled_wr",
>> +	},
>> +	{
>> +		.dpdk_name = "outbound_pci_stalled_rd_events",
>> +		.ctr_name = "outbound_pci_stalled_rd_events",
>> +	},
>> +	{
>> +		.dpdk_name = "outbound_pci_stalled_wr_events",
>> +		.ctr_name = "outbound_pci_stalled_wr_events",
>> +	},
>> +	{
>> +		.dpdk_name = "dev_out_of_buffer",
>> +		.ctr_name = "dev_out_of_buffer",
>> +	},
>>  };
>>
>>  static const unsigned int xstats_n = RTE_DIM(mlx5_counters_init);
>> --
>> 2.25.1
>