> On Feb 9, 2024, at 2:41 PM, Wathsala Vithanage <wathsala.vithan...@arm.com>
> wrote:
>
> Versions of Mellanox NICs starting from CX5 have device counters
> related to PCI. These counters are helpful in debugging IO
> bottlenecks. For instance, the outbound_pci_stalled_rd and
> outbound_pci_stalled_wr counters can help with identifying NIC
> stalls due to insufficient PCI credits, which otherwise would
> have required a PCI analyzer or a sophisticated PCI root port
> with a PMU.
> Currently none of these are available in the MLX5 PMD even
> though ethtool is capable of reading some of them.
> Since PMD uses the same ioctl used by ethtool (SIOCETHTOOL) and
> reads via the kernel driver it is possible to add support with
> ease.
> There is one more PCI related counter and a device counter that
> aren't implemented in the Linux driver at the moment. These two
> are named outbound_pci_buffer_overflow and dev_out_of_buffer
> respectively. As per Nvidia's documentation these two counters
> can tell the number of packets dropped due to pci buffer
> overflow and the number of times the device owned queue had not
> enough buffers allocated.

It would be good to see more of the PCI counters added in other NIC drivers as well. It helps significantly with debugging.

>
> Signed-off-by: Wathsala Vithanage <wathsala.vithan...@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> ---
>  .mailmap                                |  1 +
>  drivers/net/mlx5/linux/mlx5_ethdev_os.c | 33 +++++++++++++++++++++++++
>  2 files changed, 34 insertions(+)
>
> diff --git a/.mailmap b/.mailmap
> index aa569ff456..f57415f7a1 100644
> --- a/.mailmap
> +++ b/.mailmap
> @@ -1510,6 +1510,7 @@ Walter Heymans <walter.heym...@corigine.com>
>  Wang Sheng-Hui <shh...@gmail.com>
>  Wangyu (Eric) <seven.wan...@huawei.com>
>  Waterman Cao <waterman....@intel.com>
> +Wathsala Vithanage <wathsala.vithan...@arm.com>
>  Weichun Chen <weichunx.c...@intel.com>
>  Wei Dai <wei....@intel.com>
>  Weifeng Li <liweifen...@126.com>
> diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
> index dd5a0c546d..8f1567f6a7 100644
> --- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
> +++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
> @@ -1574,6 +1574,39 @@ static const struct mlx5_counter_ctrl mlx5_counters_init[] = {
>  		.dpdk_name = "tx_vport_bytes",
>  		.ctr_name = "vport_tx_bytes",
>  	},
> +	/* Device counters */
> +	{
> +		.dpdk_name = "rx_pci_signal_integrity",
> +		.ctr_name = "rx_pci_signal_integrity",
> +	},
> +	{
> +		.dpdk_name = "tx_pci_signal_integrity",
> +		.ctr_name = "tx_pci_signal_integrity",
> +	},
> +	{
> +		.dpdk_name = "outbound_pci_buffer_overflow",
> +		.ctr_name = "outbound_pci_buffer_overflow",
> +	},
> +	{
> +		.dpdk_name = "outbound_pci_stalled_rd",
> +		.ctr_name = "outbound_pci_stalled_rd",
> +	},
> +	{
> +		.dpdk_name = "outbound_pci_stalled_wr",
> +		.ctr_name = "outbound_pci_stalled_wr",
> +	},
> +	{
> +		.dpdk_name = "outbound_pci_stalled_rd_events",
> +		.ctr_name = "outbound_pci_stalled_rd_events",
> +	},
> +	{
> +		.dpdk_name = "outbound_pci_stalled_wr_events",
> +		.ctr_name = "outbound_pci_stalled_wr_events",
> +	},
> +	{
> +		.dpdk_name = "dev_out_of_buffer",
> +		.ctr_name = "dev_out_of_buffer",
> +	},
>  };
>
>  static const unsigned int xstats_n = RTE_DIM(mlx5_counters_init);
> --
> 2.25.1
>
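
For anyone following along, here is a rough standalone sketch of how a name table shaped like the one in the patch can be matched against the counter strings a kernel driver reports. This is not the PMD code. struct counter_ctrl, ARRAY_DIM and counter_index_by_ctr_name are made-up names for illustration, and only a handful of entries are copied from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for struct mlx5_counter_ctrl. Only the two
 * fields visible in the patch are modeled here.
 */
struct counter_ctrl {
	const char *dpdk_name; /* name exposed to the application */
	const char *ctr_name;  /* name reported by the kernel driver */
};

/* Same idea as RTE_DIM(): element count of a statically sized array. */
#define ARRAY_DIM(a) (sizeof(a) / sizeof((a)[0]))

/* A few entries copied from the patch; the real table is much longer. */
static const struct counter_ctrl counters[] = {
	{ .dpdk_name = "tx_vport_bytes",
	  .ctr_name = "vport_tx_bytes" },
	{ .dpdk_name = "outbound_pci_stalled_rd",
	  .ctr_name = "outbound_pci_stalled_rd" },
	{ .dpdk_name = "outbound_pci_stalled_wr",
	  .ctr_name = "outbound_pci_stalled_wr" },
	{ .dpdk_name = "dev_out_of_buffer",
	  .ctr_name = "dev_out_of_buffer" },
};

static const unsigned int counters_n = ARRAY_DIM(counters);

/*
 * Hypothetical helper: return the table index whose ctr_name equals
 * the given driver string, or -1 when the string is not in the table.
 */
static int
counter_index_by_ctr_name(const char *ctr_name)
{
	unsigned int i;

	assert(ctr_name != NULL);
	for (i = 0; i < counters_n; i++) {
		if (strcmp(counters[i].ctr_name, ctr_name) == 0)
			return (int)i;
	}
	return -1;
}
```

The point of the sketch is only that adding a counter is a table entry plus a name match, so a string the driver does not report simply never matches.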