Hi Jake,

> From: netdev-ow...@vger.kernel.org <netdev-ow...@vger.kernel.org> On
> Behalf Of Jacob Keller
>
>
> On 5/17/2020 11:52 PM, Jiri Pirko wrote:
> > Fri, May 15, 2020 at 11:36:19PM CEST, jacob.e.kel...@intel.com wrote:
> >>
> >>
> >> On 5/15/2020 2:30 AM, Jiri Pirko wrote:
> >>> Fri, May 15, 2020 at 01:52:54AM CEST, jacob.e.kel...@intel.com wrote:
> >>>>> $ devlink port add pci/0000.06.00.0/100 flavour pcisf pfnum 1
> >>>>> sfnum 10
> >>>>>
> >>>>
> >>>> Can you clarify what sfnum means here? and why is it different from
> >>>> the index? I get that the index is a unique number that identifies
> >>>> the port regardless of type, so sfnum must be some sort of hardware
> >>>> internal identifier?
> >>>
> >>> Basically pfnum, sfnum and vfnum could overlap. Index is unique
> >>> within all groups together.
> >>>
> >>
> >> Right. Index is just an identifier for which port this is.
> >>
>
> Ok, so whether or not a driver uses this internally is an implementation
> detail that doesn't matter to the interface.
>
>
> >>>
> >>>>
> >>>> When looking at this with colleagues, there was a lot of confusion
> >>>> about the difference between the index and the sfnum.
> >>>
> >>> No confusion about index and pfnum/vfnum? They behave the same.
> >>> Index is just a port handle.
> >>>
> >>
> >> I'm less confused about the difference between index and these
> >> "nums", and more so questioning what pfnum/vfnum/sfnum represent? Are
> >> they similar to the vf ID that we have in the legacy SRIOV functions?
> >> I.e. a hardware index?
> >>
> >> I don't think in general users necessarily care which "index" they
> >> get upfront. They obviously very much care about the index once it's
> >> selected. I do believe the interfaces should start with the
> >> capability for the index to be selected automatically at creation
> >> (with the optional capability to select a specific index if desired, as
> >> shown here).
> >>
> >> I do not think most users want to care about what to pick for this
> >> number. (Just as they would not want to pick a number for the port
> >> index either).
> >
> > I see your point. However I don't think it is always the right
> > scenario. The "nums" are used for naming of the netdevices, both the
> > eswitch port representor and the actual SF (in case of SF).
> >
> > I think that in lot of usecases is more convenient for user to select
> > the "num" on the cmdline.
> >
>
> Agreed, based on the below statements. Basically "let users specify or get it
> automatically chosen", just like with the port identifier and with the region
> numbers now.
>
>
> Thanks for the explanations!
>
> >>
> >>>> Obviously this is a TODO, but how does this differ from the current
> >>>> port_split and port_unsplit?
> >>>
> >>> Does not have anything to do with port splitting. This is about
> >>> creating a "child PF" from the section above.
> >>>
> >>
> >> Hmm. Ok so this is about internal connections in the switch, then?
> >
> > Yes. Take the smartnic as an example. On the smartnic cpu, the eswitch
> > management is being done. There's devlink instance with all eswitch
> > port visible as devlink ports. One PF-type devlink port per host. That
> > are the "child PFs".
> >
> > Now from perspective of the host, there are 2 scenarios:
> > 1) have the "simple dumb" PF, which just exposes 1 netdev for host to
> >    run traffic over. smartnic cpu manages the VFs/SFs and sees the
> >    devlink ports for them. This is 1 level switch - merged switch
> >
> > 2) PF manages a sub-switch/nested-switch. The devlink/devlink ports are
> >    created on the host and the devlink ports for SFs/VFs are created
> >    there. This is multi-level eswitch. Each "child PF" on a parent
> >    manages a nested switch. And could in theory have other PF child with
> >    another nested switch.
> >
>
> Ok.
> So in the smart NIC CPU, we'd see the primary PF and some child PFs,
> and in the host system we'd see a "primary PF" that is the other end of the
> associated Child PF, and might be able to manage its own subswitch.
>
> Ok this is making more sense now.
>
> I think I had imagined that was what subfuntions were. But really
> subfunctions are a bit different, they're more similar to expanded VFs?
>

1. Subfunctions are more lightweight than VFs, because
2. they share the same PCI device (BAR, IRQs) as the PF/VF on which they
   are deployed.
3. Unlike VFs, which are enabled/disabled in bulk, subfunctions are created
   and deployed one at a time.

Since this RFC content is overwhelming, I expanded on the SF plumbing
details in [1], in the previous RFC version.
When reading [1], you can replace 'devlink slice' with 'devlink port func'.

[1] https://marc.info/?l=linux-netdev&m=158555928517777&w=2
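
In case it helps with the index vs. pfnum/vfnum/sfnum question discussed above, here is a toy Python sketch of the idea. It is only an illustration, not the devlink or driver implementation: the names `Port` and `PortTable` are hypothetical, and scoping the "nums" per (flavour, pfnum) is my assumption. The thread only states that the nums may overlap while the index is unique across all groups, and that either value could be given by the user or chosen automatically.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class Port:
    index: int                 # port handle, unique across all flavours
    flavour: str               # "pcipf", "pcivf" or "pcisf"
    pfnum: int
    num: Optional[int] = None  # vfnum/sfnum; None for a PF port


class PortTable:
    """Toy model (hypothetical), not the devlink implementation."""

    def __init__(self) -> None:
        self.ports: Dict[int, Port] = {}
        # Assumption: a num only has to be unique per (flavour, pfnum).
        self.used_nums: Set[Tuple[str, int, int]] = set()

    def add(self, flavour: str, pfnum: int,
            num: Optional[int] = None, index: Optional[int] = None) -> Port:
        # Index: use the requested one, or pick the lowest free one.
        if index is None:
            index = next(i for i in count() if i not in self.ports)
        elif index in self.ports:
            raise ValueError(f"port index {index} already in use")

        if flavour == "pcipf":
            num = None  # a PF port carries only pfnum in this model
        else:
            # Num: use the requested one, or pick the lowest free one.
            if num is None:
                num = next(n for n in count()
                           if (flavour, pfnum, n) not in self.used_nums)
            elif (flavour, pfnum, num) in self.used_nums:
                raise ValueError(f"{flavour} num {num} already in use")
            self.used_nums.add((flavour, pfnum, num))

        port = Port(index, flavour, pfnum, num)
        self.ports[index] = port
        return port


table = PortTable()
# Mirrors the quoted example: index 100, flavour pcisf, pfnum 1, sfnum 10.
sf = table.add("pcisf", pfnum=1, num=10, index=100)
# vfnum 10 may coexist with sfnum 10; the index is chosen automatically.
vf = table.add("pcivf", pfnum=1, num=10)
# Neither index nor sfnum given: both are chosen automatically.
auto_sf = table.add("pcisf", pfnum=1)
print(sf, vf, auto_sf, sep="\n")
```

The point of the sketch is that the index is only a handle used to look up the port, while the num is what would feed into netdevice naming, so a user may reasonably want to pick the latter and leave the former to be assigned.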