On Wed, Aug 1, 2018 at 4:13 PM, Saeed Mahameed
<sae...@dev.mellanox.co.il> wrote:
On Wed, Aug 1, 2018 at 3:34 PM, Alexander Duyck
<alexander.du...@gmail.com> wrote:
On Wed, Aug 1, 2018 at 2:52 PM, Saeed Mahameed <sae...@mellanox.com> wrote:
Hi Dave,

This series provides devlink parameters updates to both devlink API and mlx5 driver, it is a 2nd iteration of the dropped patches sent in a previous
mlx5 submission "net/mlx5: Support PCIe buffer congestion handling via
Devlink" to address review comments [1].

Changes from the original series:
- According to the discussion outcome, we are keeping the congestion control
  setting as mlx5 device specific for the current HW generation.
- Changed the congestion_mode and congestion action param type to string
- Added patches to fix devlink handling of param type string
- Added a patch which adds extack messages support for param set.
- At the end of this series, I've added yet another mlx5 devlink related
 feature, firmware snapshot support.

For more information please see tag log below.

Please pull and let me know if there's any problem.

[1] https://patchwork.ozlabs.org/patch/945996/

Thanks,
Saeed.

---

The following changes since commit e6476c21447c4b17c47e476aade6facf050f31e8:

  net: remove bogus RCU annotations on socket.wq (2018-07-31 12:40:22 -0700)

are available in the Git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2018-08-01

for you to fetch changes up to 2ac6108c65ffcb1e5eab1fba1fd59272604d1c32:

  net/mlx5: Use devlink region_snapshot parameter (2018-08-01 14:49:09 -0700)

----------------------------------------------------------------
mlx5-updates-2018-08-01

This series provides devlink parameters updates to both devlink API and
mlx5 driver,

1) Devlink changes: (Moshe Shemesh)
The first two patches fix devlink param infrastructure for string type
params.
The third patch adds a devlink helper function to safely copy string from
driver to devlink.
The forth patch adds extack support for param set.

2) mlx5 specific congestion parameters: (Eran Ben Elisha)
Next three patches add new devlink driver specific params for controlling congestion action and mode, using string type params and extack messages support.

This congestion mode enables hw workaround in specific devices which is
controlled by devlink driver-specific params. The workaround is device
specific for this NIC generation, the next NIC will not need it.

Congestion parameters:
 - Congestion action
            HW W/A mechanism in the PCIe buffer which monitors the amount of             consumed PCIe buffer per host.  This mechanism supports the
            following actions in case of threshold overflow:
            - Disabled - NOP (Default)
            - Drop
            - Mark - Mark CE bit in the CQE of received packet
    - Congestion mode
            - Aggressive - Aggressive static trigger threshold (Default)
            - Dynamic - Dynamically change the trigger threshold

3) mlx5 firmware snapshot support via devlink: (Alex Vesker)
Last three patches, add the support for capturing region snapshot of the
firmware crspace during critical errors, using devlink region_snapshot
parameter.

-Saeed.

----------------------------------------------------------------
Alex Vesker (3):
      net/mlx5: Add Vendor Specific Capability access gateway
      net/mlx5: Add Crdump FW snapshot support
      net/mlx5: Use devlink region_snapshot parameter

Eran Ben Elisha (3):
      net/mlx5: Move all devlink related functions calls to devlink.c
      net/mlx5: Add MPEGC register configuration functionality
      net/mlx5: Enable PCIe buffer congestion handling workaround via devlink

Moshe Shemesh (4):
      devlink: Fix param set handling for string type
      devlink: Fix param cmode driverinit for string type
      devlink: Add helper function for safely copy string param
      devlink: Add extack messages support to param set

 drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c  |   3 +-
 drivers/net/ethernet/mellanox/mlx4/main.c          |   6 +-
 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |   3 +-
 drivers/net/ethernet/mellanox/mlx5/core/devlink.c  | 388 +++++++++++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/devlink.h  |  13 +
 .../net/ethernet/mellanox/mlx5/core/diag/crdump.c  | 223 ++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/health.c   |   3 +
 drivers/net/ethernet/mellanox/mlx5/core/lib/mlx5.h |   4 +
 .../net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c  | 320 +++++++++++++++++
 .../net/ethernet/mellanox/mlx5/core/lib/pci_vsc.h  |  56 +++
 drivers/net/ethernet/mellanox/mlx5/core/main.c     |  10 +-
 include/linux/mlx5/driver.h                        |   5 +
 include/net/devlink.h                              |  15 +-
 net/core/devlink.c                                 |  44 ++-
 14 files changed, 1076 insertions(+), 17 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/devlink.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/devlink.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c  create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c  create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.h

So after looking over the patch set the one thing I would ask for in
this is some sort of documentation at a minimum. As a user I don't see
how you can expect someone to be able to use this when the naming of
things are pretty cryptic and there is no real explanation anywhere if
you don't go through and read the patch description itself. When you
start adding driver specific interfaces, you should at least start
adding vendor specific documentation.


Sure, sounds like a great idea, something like:
Documentation/networking/mlx5.txt and have a devlink section ?
or have a generic devlink doc and a mlx5 section in it ?

Either would work for me.

For which patches are you missing documentation?

Also I don't see how using a vendor specific configuration space
section can be done without adding some tie-ins to the PCI core files
because it should be possible to race with someone poking at the
register space via something like setpci/lspci. Also one of the things
that came up was that drivers are not supposed to be banging on the
PCI configuration space at will, and it seems like this patch set is
doing exactly that through the VSC block.


this is a whole different feature than the device specific parameters.
The whole vendor specific configuration space access is needed only
for diagnostic/dump
purposes when something really bad happens and the command interface
with FW is down,
and when the FW is un-responsive, we want to dump the crspace into the
already existing devlink
crdump buffer, how do you expect us to read it if we are not allowed
to access it ?

What do you mean by tie-ins to the PCI core files ? can you please elaborate ?

You have added a vendor specific config section and you are using it
to access several of the pieces of metadata. The setup isn't too
different than the VPD setup and approach. However I don't see many of
the protections that exist for VPD in place for this vendor specific
configuration. As such I have concerns. For example what is to keep
requests to the various devlink interfaces from racing with each other
when they both end up operating through the VCS?

- Alex

Hi, I would like to resubmit the devlink region crdump support for mlx5,
which is part of this patch-set.

AlexD, regarding the protection, various devlink interfaces cannot race since devlink_mutex is used. The VSC access is also protected, using mlx5_vsc_gw_lock/unlock
only one can acquire the lock.
After explaining this I want to clarify something, the access to VSC is not user driven it happens automatically by the driver when an error is detected to collect a crdump.

Reply via email to