On 24-Jun-20 22:07, Jacob Keller wrote: > > > On 6/24/2020 1:19 AM, Ido Schimmel wrote: >> From: Amit Cohen <am...@mellanox.com> >> >> Add link extended state attributes. >> >> Signed-off-by: Amit Cohen <am...@mellanox.com> >> Reviewed-by: Petr Machata <pe...@mellanox.com> >> Reviewed-by: Jiri Pirko <j...@mellanox.com> >> Signed-off-by: Ido Schimmel <ido...@mellanox.com> > > I really like this concept. > > Reviewed-by: Jacob Keller <jacob.e.kel...@intel.com> > >> --- >> Documentation/networking/ethtool-netlink.rst | 110 ++++++++++++++++++- >> 1 file changed, 106 insertions(+), 4 deletions(-) >> >> diff --git a/Documentation/networking/ethtool-netlink.rst >> b/Documentation/networking/ethtool-netlink.rst >> index 82470c36c27a..a7cc53f905f5 100644 >> --- a/Documentation/networking/ethtool-netlink.rst >> +++ b/Documentation/networking/ethtool-netlink.rst >> @@ -443,10 +443,11 @@ supports. >> LINKSTATE_GET >> ============= >> >> -Requests link state information. At the moment, only link up/down flag (as >> -provided by ``ETHTOOL_GLINK`` ioctl command) is provided but some future >> -extensions are planned (e.g. link down reason). This request does not have >> any >> -attributes. >> +Requests link state information. Link up/down flag (as provided by >> +``ETHTOOL_GLINK`` ioctl command) is provided. Optionally, extended state >> might >> +be provided as well. In general, extended state describes reasons for why a >> port >> +is down, or why it operates in some non-obvious mode. This request does not >> have >> +any attributes. >> > > Ok, so basically in addition to the standard ETHTOOL_GLINK data, we also > add additional optional extended attributes over the netlink interface. > Neat. > >> Request contents: >> >> @@ -461,16 +462,117 @@ Kernel response contents: >> ``ETHTOOL_A_LINKSTATE_LINK`` bool link state (up/down) >> ``ETHTOOL_A_LINKSTATE_SQI`` u32 Current Signal Quality Index >> ``ETHTOOL_A_LINKSTATE_SQI_MAX`` u32 Max support SQI value >> + ``ETHTOOL_A_LINKSTATE_EXT_STATE`` u8 link extended state >> + ``ETHTOOL_A_LINKSTATE_EXT_SUBSTATE`` u8 link extended substate >> ==================================== ====== ============================ > > Ok so we have categories (state) and each category can have sub-states > indicating the specific failure. Good. > >> >> For most NIC drivers, the value of ``ETHTOOL_A_LINKSTATE_LINK`` returns >> carrier flag provided by ``netif_carrier_ok()`` but there are drivers which >> define their own handler. >> >> +``ETHTOOL_A_LINKSTATE_EXT_STATE`` and ``ETHTOOL_A_LINKSTATE_EXT_SUBSTATE`` >> are >> +optional values. ethtool core can provide either both >> +``ETHTOOL_A_LINKSTATE_EXT_STATE`` and ``ETHTOOL_A_LINKSTATE_EXT_SUBSTATE``, >> +or only ``ETHTOOL_A_LINKSTATE_EXT_STATE``, or none of them. >> + >> ``LINKSTATE_GET`` allows dump requests (kernel returns reply messages for >> all >> devices supporting the request). >> > > Good to clarify that it is allowed to specify only the main state, if a > substate isn't known. >
I described above all the options: "ethtool core can provide either (1) both ``ETHTOOL_A_LINKSTATE_EXT_STATE`` and ``ETHTOOL_A_LINKSTATE_EXT_SUBSTATE``, or (2) only ``ETHTOOL_A_LINKSTATE_EXT_STATE``, or (3) none of them" I think that #2 is what you want, no? >> >> +Link extended states: >> + >> + ============================ >> ============================================= >> + ``Autoneg`` States relating to the autonegotiation or >> + issues therein >> + >> + ``Link training failure`` Failure during link training >> + >> + ``Link logical mismatch`` Logical mismatch in physical coding >> sublayer >> + or forward error correction sublayer >> + >> + ``Bad signal integrity`` Signal integrity issues >> + >> + ``No cable`` No cable connected >> + >> + ``Cable issue`` Failure is related to cable, >> + e.g., unsupported cable >> + >> + ``EEPROM issue`` Failure is related to EEPROM, e.g., >> failure >> + during reading or parsing the data >> + >> + ``Calibration failure`` Failure during calibration algorithm >> + >> + ``Power budget exceeded`` The hardware is not able to provide the >> + power required from cable or module >> + >> + ``Overheat`` The module is overheated >> + ============================ >> ============================================= > > A nice variety of states. I think it could be argued that "no cable" is > a sub-state of "cable issues" but personally I like that it's separate. > > I can't think of any other states offhand, but we have a u8, so we have > plenty of space to add more states if/when we think of them in the future. > >> + >> +Link extended substates: >> + >> + Autoneg substates: >> + >> + ============================================== >> ============================================= >> + ``No partner detected`` Peer side is down >> + >> + ``Ack not received`` Ack not received from >> peer side >> + >> + ``Next page exchange failed`` Next page exchange >> failed >> + >> + ``No partner detected during force mode`` Peer side is down >> during force mode or there >> + is no agreement of speed >> + > > This feels like it could be two separate states. It seems weird to > combine "peer side is down during force" with "no agreement of speed". > Am I missing something here? > >From high-level API point of view it makes sense, but we get this reason from >our FW for both cases. When link is configured to force mode, it doesn't know his partner's state, so partner is down and partner has different speed are same. >> + ``FEC mismatch during override`` Forward error >> correction modes in both sides >> + are mismatched >> + >> + ``No HCD`` No Highest Common >> Denominator >> + ============================================== >> ============================================= >> + >> + Link training substates: >> + >> + ============================================== >> ============================================= >> + ``KR frame lock not acquired`` Frames were not >> recognized, the lock failed >> + >> + ``KR link inhibit timeout`` The lock did not occur >> before timeout >> + >> + ``KR Link partner did not set receiver ready`` Peer side did not send >> ready signal after >> + training process >> + >> + ``Remote side is not ready yet`` Remote side is not >> ready yet >> + >> + ============================================== >> ============================================= >> + >> + Link logical mismatch substates: >> + >> + ============================================== >> ============================================= >> + ``PCS did not acquire block lock`` Physical coding >> sublayer was not locked in >> + first phase - block lock >> + >> + ``PCS did not acquire AM lock`` Physical coding >> sublayer was not locked in >> + second phase - >> alignment markers lock >> + >> + ``PCS did not get align_status`` Physical coding >> sublayer did not get align >> + status >> + >> + ``FC FEC is not locked`` FC forward error >> correction is not locked >> + >> + ``RS FEC is not locked`` RS forward error >> correction is not locked >> + ============================================== >> ============================================= >> + >> + Bad signal integrity substates: >> + >> + ============================================== >> ============================================= >> + ``Large number of physical errors`` Large number of >> physical errors >> + >> + ``Unsupported rate`` The system attempted to >> operate the cable at >> + a rate that is not >> formally supported, which >> + led to signal integrity >> issues >> + >> + ``Unsupported cable`` Unsupported cable >> + >> + ``Cable test failure`` Cable test failure >> + ============================================== >> ============================================= >> + > > Not every state has substates. Makes sense, since some of the main > states are pretty straight forward. > > This feels very promising, and enables providing some useful information > to users about why something isn't working as expected. I think it's a > significant improvement to the status quo, given that a device can > report this data. > > Thanks, > Jake > Thanks for the comments. >> DEBUG_GET >> ========= >> >>