On Mon, Feb 22, 2016 at 07:31:55PM +0100, Jiri Pirko wrote:
> From: Jiri Pirko <j...@mellanox.com>
>
> There is a need for some userspace API that would allow to expose things
> that are not directly related to any device class like net_device or
> ib_device, but rather chip-wide/switch-ASIC-wide stuff.
>
> Use cases:
> 1) get/set of port type (Ethernet/InfiniBand)
> 2) setting up port splitters - split port into multiple ones and squash again,
>    enables usage of splitter cable
> 3) setting up shared buffers - shared among multiple ports within
>    one chip (work in progress)
> 4) configuration of switch wide properties - resources division etc - This
>    will allow to pass configuration that is unacceptable to be passed as
>    a module option.

I'm generally a fan of use cases #3 and #4 (as we have previously
discussed), but I'm not sure I agree with the implementation for #2
right now. I'm not sure I would like userspace to have control over
whether or not a port should be split when the hardware can be queried
to determine this.

> First patch of this set introduces a new generic Netlink based interface,
> called "devlink". It is similar to nl80211 model and it is heavily
> influenced by it, including the API definition. The devlink introduction patch
> implements use cases 1) and 2). Other 2 are in development atm and will
> be addressed by follow-ups.
>
> It is very convenient for drivers to use devlink, as you can see in other
> patches in this set.
>
> Counterpart for devlink is userspace tool for now called "dl". Command line
> interface and outputs are derived from "ip" tool so it should be easy
> for users to get used to it.
>
> It is available here as a standalone tool for now:
> https://github.com/jpirko/devlink
> After this is merged in kernel, I will include the "dl" or "devlink" tool
> into iproute2 toolset.
>
> Port type setting example:
> myhost:~$ dl help
> Usage: dl [ OPTIONS ] OBJECT { COMMAND | help }
> where  OBJECT := { dev | port | monitor }
>        OPTIONS := { -v/--verbose }
>
> myhost:~$ dl dev help
> Usage: dl dev show [DEV]
> Usage: dl dev set DEV [ name NEWNAME ]
>
> myhost:~$ dl dev show
> 0: devlink0: bus pci dev 0000:01:00.0
>
> myhost:~$ dl port help
> Usage: dl port show [DEV/PORT_INDEX]
> Usage: dl port set DEV/PORT_INDEX [ type { eth | ib | auto} ]
> Usage: dl port split DEV/PORT_INDEX count
> Usage: dl port unsplit DEV/PORT_INDEX
>
> myhost:~$ dl port show
> devlink0/1: type ib ibdev mlx4_0
> devlink0/2: type ib ibdev mlx4_0
>
> myhost:~$ sudo dl port set devlink0/1 type eth
>
> myhost:~$ dl port show
> devlink0/1: type eth netdev ens4
                       ^^^^^^^^^^^
> devlink0/2: type ib ibdev mlx4_0
                      ^^^^^^^^^^^^

I think my only other question about this implementation is whether or
not one would really want to have the true netdev/ibdev names mapped
here. Would it be as reasonable to simply specify the type (and there
may be more types within Ethernet that could be useful in multi-chip
configurations) and then let the normal infrastructure that exists
today figure out how to map the names for the netdevs to the devices?

> myhost:~$ sudo dl port set devlink0/2 type auto
>
> myhost:~$ dl port show
> devlink0/1: type eth netdev ens4
> devlink0/2: type ib(auto) ibdev mlx4_0
>
> Port splitting example:
> myswitch:~$ dl port
> devlink0/1: type eth netdev eth0
> devlink0/3: type eth netdev eth1
> devlink0/5: type eth netdev eth2
> ...
> devlink0/63: type eth netdev eth31
>
> myswitch:~$ sudo dl port split devlink0/1 2
>
> myswitch:~$ dl port
> devlink0/3: type eth netdev eth1
> devlink0/5: type eth netdev eth2
> ...
> devlink0/63: type eth netdev eth31
> devlink0/1: type eth netdev eth0 split_group 16
> devlink0/2: type eth netdev eth32 split_group 16
>
> myswitch:~$ sudo dl port unsplit devlink0/1
>
> myswitch:~$ dl port
> devlink0/3: type eth netdev eth1
> devlink0/5: type eth netdev eth2
> ...
> devlink0/63: type eth netdev eth31
> devlink0/1: type eth netdev eth0
>
> Ido Schimmel (4):
>   mlxsw: spectrum: Unmap local port from module during teardown
>   mlxsw: spectrum: Store local port to module mapping during init
>   mlxsw: spectrum: Mark unused ports using NULL
>   mlxsw: spectrum: Introduce port splitting
>
> Jiri Pirko (5):
>   Introduce devlink infrastructure
>   mlx4: Implement devlink interface
>   mlx4: Implement port type setting via devlink interface
>   mlxsw: Implement devlink interface
>   mlxsw: core: Add devlink port splitter callbacks
>
>  MAINTAINERS                                    |   8 +
>  drivers/infiniband/hw/mlx4/main.c              |   7 +
>  drivers/net/ethernet/mellanox/mlx4/en_netdev.c |   8 +-
>  drivers/net/ethernet/mellanox/mlx4/intf.c      |   9 +
>  drivers/net/ethernet/mellanox/mlx4/main.c      | 129 +++-
>  drivers/net/ethernet/mellanox/mlx4/mlx4.h      |   2 +
>  drivers/net/ethernet/mellanox/mlxsw/core.c     |  56 +-
>  drivers/net/ethernet/mellanox/mlxsw/core.h     |   2 +
>  drivers/net/ethernet/mellanox/mlxsw/port.h     |   2 +
>  drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 238 ++++++-
>  drivers/net/ethernet/mellanox/mlxsw/spectrum.h |   8 +-
>  drivers/net/ethernet/mellanox/mlxsw/switchx2.c |  20 +
>  include/linux/mlx4/driver.h                    |   3 +
>  include/net/devlink.h                          | 156 +++++
>  include/uapi/linux/devlink.h                   |  73 ++
>  net/Kconfig                                    |   7 +
>  net/core/Makefile                              |   1 +
>  net/core/devlink.c                             | 887 +++++++++++++++++++++++++
>  18 files changed, 1557 insertions(+), 59 deletions(-)
>  create mode 100644 include/net/devlink.h
>  create mode 100644 include/uapi/linux/devlink.h
>  create mode 100644 net/core/devlink.c
>
> --
> 2.5.0
>
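
To make the naming question concrete, here is a minimal sketch of how a
script might consume the port listings quoted in this mail. It assumes
only the line format shown in those examples (a DEV/PORT_INDEX handle,
a colon, then key/value pairs). The helper names parse_port_line and
split_siblings are hypothetical and are not part of "dl" or of the
patch set.

```python
def parse_port_line(line):
    """Parse one line such as 'devlink0/1: type eth netdev eth0 split_group 16'.

    Assumption: the handle is DEV/PORT_INDEX and everything after the
    first colon is a flat sequence of key/value pairs, as in the quoted
    examples.
    """
    handle, _, rest = line.partition(":")
    dev, _, index = handle.strip().partition("/")
    tokens = rest.split()
    # Pair up even-positioned keys with odd-positioned values.
    attrs = dict(zip(tokens[0::2], tokens[1::2]))
    return {"dev": dev, "index": int(index), **attrs}


def split_siblings(lines):
    """Group port handles by split_group; ports without one are skipped."""
    groups = {}
    for line in lines:
        port = parse_port_line(line)
        if "split_group" in port:
            handle = f"{port['dev']}/{port['index']}"
            groups.setdefault(port["split_group"], []).append(handle)
    return groups
```

With this shape, a consumer that only cares about the port type never
has to look at the netdev/ibdev keys at all, which is the case where
leaving the name mapping to the existing infrastructure would be enough.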