On Mon, Jan 28, 2019 at 3:45 PM Jakub Kicinski <jakub.kicin...@netronome.com> wrote:
>
> Hi!
>
> As I tried to explain in my slides at netconf 2018 we are lacking
> an expressive, standard API to report device statistics.
>
> Networking silicon generally maintains some IEEE 802.3 and/or RMON
> statistics. Today those all end up in ethtool -S. Here is a simple
> attempt (admittedly very imprecise) of counting how many names driver
> authors invented for IETF RFC2819 etherStatsPkts512to1023Octets
> statistics (RX and TX):
>
> $ git grep '".*512.*1023.*"' -- drivers/net/ | \
>     sed -e 's/.*"\(.*\)".*/\1/' | sort | uniq | wc -l
> 63
>
> Interestingly only two drivers in the tree use the name the standard
> gave us (etherStatsPkts512to1023, modulo case).
>
> I set out to working on this set in an attempt to give drivers a way
> to express clearly to user space standard-compliant counters.
>
> Second most common use for custom statistics is per-queue counters.
> This is where the "hierarchical" part of this set comes in, as
> groups can be nested, and user space tools can handle the aggregation
> inside the groups if needed.
>
> This set also tries to address the problem of users not knowing if
> a statistic is reported by hardware or the driver. Many modern drivers
> use some prefix in ethtool -S to indicate MAC/PHY stats. At a quick
> glance: Netronome uses "mac.", Intel "port." and Mellanox "_phy".
> In this set, netlink attributes describe whether a group of statistics
> is RX or TX, maintained by device or driver.
>
> The purpose of this patch set is _not_ to replace ethtool -S. It is
> an incredibly useful tool, and we will certainly continue using it.
> However, for standard-based and commonly maintained statistics a more
> structured API seems warranted.
>
> There are two things missing from these patches, which I initially
> planned to address as well: filtering, and refresh rate control.
>
> Filtering doesn't need much explanation, users should be able to request
> only a subset of statistics (like only SW stats or only given ID). The
> bitmap of statistics in each group is there for filtering later on.
>
> By refresh control I mean the ability for user space to indicate how
> "fresh" values it expects. Sometimes reading the HW counters requires
> slow register reads or FW communication, in such cases drivers may cache
> the result. (Privileged) user space should be able to add a "not older
> than" timestamp to indicate how fresh statistics it expects. And vice
> versa, drivers can then also put the timestamp of when the statistics
> were last refreshed in the dump for more precise bandwidth estimation.
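
To make the "not older than" idea quoted above concrete, here is a minimal
sketch in plain C. It is not taken from the patches: struct cached_stat,
stat_read() and fake_read() are hypothetical names, and the timestamps are
just nanosecond counters passed in by the caller. It only illustrates the
refresh rule as described: re-read the device when the cached value predates
the requested timestamp, and report when the returned value was read.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cached copy of a counter that is slow to read. */
struct cached_stat {
	uint64_t value;        /* last value read from the device */
	uint64_t refreshed_ns; /* when that read happened, in nanoseconds */
};

/* Stand-in type for a slow register read or firmware request. */
typedef uint64_t (*slow_read_fn)(void);

/* Demo-only fake device: every read advances the counter by 100. */
static uint64_t fake_hw_counter;

static uint64_t fake_read(void)
{
	fake_hw_counter += 100;
	return fake_hw_counter;
}

/*
 * Return the counter, re-reading the device only if the cached value is
 * older than the caller's "not older than" timestamp.  *refreshed_ns is
 * set to the time the returned value was actually read, which is what a
 * rate estimate in user space would divide by.
 */
static uint64_t stat_read(struct cached_stat *c, slow_read_fn read_hw,
			  uint64_t now_ns, uint64_t not_older_than_ns,
			  uint64_t *refreshed_ns)
{
	assert(c && read_hw && refreshed_ns);

	if (c->refreshed_ns < not_older_than_ns) {
		c->value = read_hw();
		c->refreshed_ns = now_ns;
	}
	*refreshed_ns = c->refreshed_ns;
	return c->value;
}
```

A caller that can tolerate stale data passes an old timestamp and gets the
cached value without touching the device; a caller that needs fresh data
passes a recent one and pays for the slow read.
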

Jakub,

Glad to see hw stats in the RTM_*STATS api. I do see you mention
'partial' support for ethtool stats. I understand the reason you say
it's partial. But while at it, why not also include the ability to have
driver extensible stats here? I.e., make it complete. We have talked
about making all hw stats available via the RTM_*STATS api in the past,
so I just want to make sure the new HSTATS infra you are adding to the
RTM_*STATS api covers, or at least makes it possible to include, driver
extensible stats in the future, where the driver gets to define the
stats id + value (this is very useful). It would be nice if you can
account for that in this new HSTATS API.

>
> Jakub Kicinski (14):
>   nfp: remove unused structure
>   nfp: constify parameter to nfp_port_from_netdev()
>   net: hstats: add basic/core functionality
>   net: hstats: allow hierarchies to be built
>   nfp: very basic hstat support
>   net: hstats: allow iterators
>   net: hstats: help in iteration over directions
>   nfp: hstats: make use of iteration for direction
>   nfp: hstats: add driver and device per queue statistics
>   net: hstats: add IEEE 802.3 and common IETF MIB/RMON stats
>   nfp: hstats: add IEEE/RMON ethernet port/MAC stats
>   net: hstats: add markers for partial groups
>   nfp: hstats: add a partial group of per-8021Q prio stats
>   Documentation: networking: describe new hstat API
>
>  Documentation/networking/hstats.rst           | 590 +++++++++++++++
>  .../networking/hstats_flow_example.dot        |  11 +
>  Documentation/networking/index.rst            |   1 +
>  drivers/net/ethernet/netronome/nfp/Makefile   |   1 +
>  .../net/ethernet/netronome/nfp/nfp_hstat.c    | 474 ++++++++++++
>  drivers/net/ethernet/netronome/nfp/nfp_main.c |   1 +
>  drivers/net/ethernet/netronome/nfp/nfp_main.h |   2 +
>  drivers/net/ethernet/netronome/nfp/nfp_net.h  |  10 +-
>  .../ethernet/netronome/nfp/nfp_net_common.c   |   1 +
>  .../net/ethernet/netronome/nfp/nfp_net_repr.h |   2 +-
>  drivers/net/ethernet/netronome/nfp/nfp_port.c |   2 +-
>  drivers/net/ethernet/netronome/nfp/nfp_port.h |   2 +-
>  include/linux/netdevice.h                     |   9 +
>  include/net/hstats.h                          | 176 +++++
>  include/uapi/linux/if_link.h                  | 107 +++
>  net/core/Makefile                             |   2 +-
>  net/core/hstats.c                             | 682 ++++++++++++++++++
>  net/core/rtnetlink.c                          |  21 +
>  18 files changed, 2084 insertions(+), 10 deletions(-)
>  create mode 100644 Documentation/networking/hstats.rst
>  create mode 100644 Documentation/networking/hstats_flow_example.dot
>  create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_hstat.c
>  create mode 100644 include/net/hstats.h
>  create mode 100644 net/core/hstats.c
>
> --
> 2.19.2
>
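
For reference, the user-space aggregation over nested groups that the cover
letter mentions could look roughly like the sketch below. This is an
assumption-laden illustration, not the hstats API: struct stat_group and
group_sum() are hypothetical, and the layout (counters indexed by statistic
ID, child groups such as one per queue) is only a guess at what a tool might
build after parsing a dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of one statistics group after parsing. */
struct stat_group {
	const uint64_t *values;            /* counters indexed by statistic ID */
	size_t n_values;                   /* number of entries in values */
	const struct stat_group *children; /* nested groups, e.g. one per queue */
	size_t n_children;                 /* number of entries in children */
};

/* Sum statistic `id` over a group and every group nested below it. */
static uint64_t group_sum(const struct stat_group *g, size_t id)
{
	uint64_t sum = 0;
	size_t i;

	assert(g);

	/* A group that does not carry this ID simply contributes nothing. */
	if (id < g->n_values)
		sum += g->values[id];

	for (i = 0; i < g->n_children; i++)
		sum += group_sum(&g->children[i], id);

	return sum;
}
```

With per-queue groups nested under an RX group, calling group_sum() on the
RX group gives the device-wide total for that ID without the driver having
to report a separate aggregate.
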