From: Ido Schimmel <ido...@nvidia.com>

Amit says:

An overheated transceiver can be the root cause of various network
problems such as link flapping. Counting the number of times a
transceiver's temperature was higher than its configured threshold can
therefore help in debugging such issues.

This patch set exposes a transceiver overheat counter via ethtool. This
is achieved by configuring the Spectrum ASIC to generate events whenever
a transceiver is overheated. The temperature thresholds are queried from
the transceiver (if available) and set to the default otherwise.

Example:

# ethtool -S swp1
...
     transceiver_overheat: 2

Patch set overview:

Patches #1-#3 add required device registers
Patches #4-#5 add required infrastructure in mlxsw to configure and
count overheat events
Patches #6-#9 gradually add support for the transceiver overheat counter
Patch #10 exposes the transceiver overheat counter via ethtool

Amit Cohen (10):
  mlxsw: reg: Add Management Temperature Warning Event Register
  mlxsw: reg: Add Port Module Plug/Unplug Event Register
  mlxsw: reg: Add Ports Module Administrative and Operational Status
    Register
  mlxsw: core_hwmon: Query MTMP before writing to set only relevant
    fields
  mlxsw: core: Add an infrastructure to track transceiver overheat
    counter
  mlxsw: Update transceiver_overheat counter according to MTWE
  mlxsw: Enable temperature event for all supported port module sensors
  mlxsw: spectrum: Initialize netdev's module overheat counter
  mlxsw: Update module's settings when module is plugged in
  mlxsw: spectrum_ethtool: Expose transceiver_overheat counter

 drivers/net/ethernet/mellanox/mlxsw/core.c    |  27 ++
 drivers/net/ethernet/mellanox/mlxsw/core.h    |   5 +
 .../net/ethernet/mellanox/mlxsw/core_env.c    | 368 ++++++++++++++++++
 .../net/ethernet/mellanox/mlxsw/core_env.h    |   6 +
 .../net/ethernet/mellanox/mlxsw/core_hwmon.c  |  21 +-
 drivers/net/ethernet/mellanox/mlxsw/reg.h     | 132 +++++++
 .../net/ethernet/mellanox/mlxsw/spectrum.c    |  44 +++
 .../net/ethernet/mellanox/mlxsw/spectrum.h    |   1 +
 .../mellanox/mlxsw/spectrum_ethtool.c         |  57 ++-
 drivers/net/ethernet/mellanox/mlxsw/trap.h    |   4 +
 10 files changed, 660 insertions(+), 5 deletions(-)

-- 
2.26.2
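
For debugging, the counter from the example output can also be read from a script. The following is only a minimal user-space sketch, not part of this patch set. It assumes the statistic is printed by `ethtool -S` as a `name: value` line called `transceiver_overheat`, as in the example. The helper names `parse_ethtool_stats` and `overheat_count` are hypothetical.

```python
import subprocess


def parse_ethtool_stats(text):
    """Parse `ethtool -S` style output into a dict of counter name -> int."""
    stats = {}
    for line in text.splitlines():
        # Split on the last colon so the value is everything after it.
        name, sep, value = line.rpartition(":")
        if not sep:
            continue  # no colon on this line, not a counter
        try:
            stats[name.strip()] = int(value.strip())
        except ValueError:
            # Header lines such as "NIC statistics:" carry no numeric value.
            continue
    return stats


def overheat_count(ifname):
    """Return the transceiver_overheat counter of ifname, or None if absent.

    Assumption: the ethtool binary is installed and ifname exists.
    """
    out = subprocess.run(
        ["ethtool", "-S", ifname],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return parse_ethtool_stats(out).get("transceiver_overheat")
```

Calling `overheat_count("swp1")` periodically and comparing successive values would show whether new overheat events coincide with, for example, link flaps on that port.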