Thu, Feb 28, 2019 at 05:24:04PM CET, jakub.kicin...@netronome.com wrote:
>On Thu, 28 Feb 2019 09:56:24 +0100, Jiri Pirko wrote:
>> Wed, Feb 27, 2019 at 07:30:00PM CET, jakub.kicin...@netronome.com wrote:
>> >On Wed, 27 Feb 2019 13:37:53 +0100, Jiri Pirko wrote:
>> >> Tue, Feb 26, 2019 at 07:24:32PM CET, jakub.kicin...@netronome.com wrote:
>> >> >A PCI endpoint corresponds to a PCI device, but such a device
>> >> >can have one or more logical device ports associated with it.
>> >> >We need a way to distinguish those. Add a PCI subport in the
>> >> >dumps and print the info in phys_port_name appropriately.
>> >> >
>> >> >This is not equivalent to port splitting; there is no split
>> >> >group. It's just a way of representing multiple netdevs on
>> >> >a single PCI function.
>> >> >
>> >> >Note that the quality of being multiport pertains only to
>> >> >the PCI function itself. A PF having multiple netdevs does
>> >> >not mean that its VFs will also have multiple, or that VFs
>> >> >are associated with any particular port of a multiport PF.
>> >>
>> >> We've been discussing the problem of subports (we call them "subfunctions"
>> >> or "SFs") for some time internally. It turns out this is probably a harder
>> >> task to model. Please prove me wrong.
>> >>
>> >> The nature of a VF makes it a logically separate entity. It has a separate
>> >> PCI address and should therefore have a separate devlink instance.
>> >> You can pass it through to a VM; then the same devlink instance should be
>> >> created inside the VM and disappear from the host.
>> >
>> >Depends what a devlink instance represents :/ On one hand you may want
>> >to create an instance for a VF to allow it to spawn soft ports, on the
>> >other you may want to group multiple functions together.
>> >
>> >IOW if a devlink instance is for an ASIC, there should be one per device
>> >per host. So if we start connecting multiple functions (PFs and/or VFs)
>> >to one host we should probably introduce the notion of devlink aliases
>> >or some such (so that multiple bus addresses can target the same
>>
>> Hmm. Like a VF address -> PF address alias? It would be confusing to see
>> eswitch ports under a VF devlink instance... I probably did not get you
>> right.
>
>No eswitch ports under a VF, more in the case of multi-PF. Bus addresses of
>all PFs aliasing to the same devlink instance.
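Just to check I read the aliasing idea right, here is a rough sketch of what
it could look like from the driver side. The foo_* names are made up, the
devlink_* calls are the current API as I remember it (please double-check
the signatures), and the user-visible alias itself does not exist in devlink
today:

#include <linux/pci.h>
#include <net/devlink.h>

/* hypothetical per-ASIC state and helpers, assumed to live in the driver */
struct foo_asic {
	struct devlink *dl;	/* one instance shared by all PFs of the ASIC */
};

struct foo_asic *foo_asic_get(struct pci_dev *pdev);
unsigned int foo_port_index(struct pci_dev *pdev);

static const struct devlink_ops foo_dl_ops = { };

static int foo_pf_add_port(struct pci_dev *pdev, struct devlink_port *dl_port)
{
	struct foo_asic *asic = foo_asic_get(pdev);
	unsigned int idx = foo_port_index(pdev);
	int err;

	if (!asic->dl) {
		/* first PF of the ASIC to probe allocates the one shared
		 * devlink instance (locking omitted in this sketch) */
		asic->dl = devlink_alloc(&foo_dl_ops, 0);
		if (!asic->dl)
			return -ENOMEM;
		err = devlink_register(asic->dl, &pdev->dev);
		if (err) {
			devlink_free(asic->dl);
			asic->dl = NULL;
			return err;
		}
	}

	/* every PF registers its port(s) on the shared instance, so a
	 * single "devlink port show" lists all ports of the device */
	devlink_port_attrs_set(dl_port, DEVLINK_PORT_FLAVOUR_PHYSICAL,
			       idx, false, 0);
	return devlink_port_register(asic->dl, dl_port, idx);
}

The alias part would then only need to teach the devlink user API that the
bus addresses of the other PFs resolve to this one instance.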
The multi-PF aliasing makes sense to me.

>
>> >devlink instance). Those less pipelined NICs can forward between
>> >ports, but still want a function per port (otherwise user space
>> >sometimes gets confused). If we have multiple functions which are on
>> >the same "switchid", they should have a single devlink instance if you
>> >ask me. That instance will have all the ports of the device.
>>
>> Okay, that makes sense. But the question is, can the same devlink
>> instance contain ports that do not have a "switchid"?
>
>No strong preference if the switchid is different. To me devlink is an ASIC
>instance; if the multiport card is constructed by copy-pasting the same
>IP twice onto a die, and the ports really are completely separate, there
>is no reason to require a single devlink instance.

Okay.

>
>> I think it would be beneficial to have the switchid shown for devlink
>> ports too. Then it is clear that the devlink ports with the same
>> switchid belong to the same switch, and other ports under the same
>> devlink instance (like the PF itself) are separate, but still under the
>> same ASIC.
>
>Sure, you mean in terms of UI - user space can do a link dump or get
>that from sysfs, right? I'm thinking about moving it to devlink.

I'll work on it more today.

>
>> >You say disappear from the host - what do you mean? Are you referring
>> >to the VF port disappearing? But on the switch the port is still
>>
>> No, the VF itself. The eswitch port will still be there on the host.
>>
>>
>> >there, and you should show the subports on the PF side IMHO. Devlink
>> >ports should allow users to understand the topology of the switch.
>>
>> What do you mean by "topology"?
>
>Mostly which ports are part of the switch and what their "flavour" is.
>Also (less importantly) which host netdevs are "peers" of eswitch ports.

Makes sense.

>
>> >Is spawning VMDq sub-instances the only thing we can think of that VMs
>> >may want to do? Are there any other uses?
>> >
>> >> An SF (or subport) feels similar to that. Basically it is exactly the
>> >> same thing as a VF, only it resides under a PF PCI function.
>> >>
>> >> That is why I think, for the sake of consistency, it should have a
>> >> separate devlink entity as well. The problem is correct sysfs modelling
>> >> and the devlink handle derived from that. Parav is working on a simple
>> >> soft bus for this purpose called "subbus". There is an RFC floating
>> >> around on the Mellanox internal mailing list; looks like it is time to
>> >> send it upstream.
>> >>
>> >> Then each PF driver which has SFs would register subbus devices
>> >> according to its SFs/subports, and they would be properly handled by bus
>> >> probe, with devlink, devlink port and netdev instances created.
>> >>
>> >> Ccing Parav and Jason.
>> >
>> >You guys come from the RDMA side of the world, with which I'm less
>> >familiar, and the soft bus + spawning devices seems to be a popular
>> >design there. Could you describe the advantages of that model for
>> >the sake of the netdev-only folks? :)
>>
>> I'll try to draw some ascii art :)
>
>Yess :)
>
>> >Another term that gets thrown into the mix here is mediated devices,
>> >right? If you wanna pass the sub-spawn-soft-port to a VM. Or run
>> >DPDK on some queues.
>> >
>> >To state the obvious, AF_XDP and macvlan offload were our previous
>> >answers to some of those use cases. What is the forwarding model
>> >for those subports? Are we going to allow flower rules from VMs?
>> >Is it going to be dst MAC only? Or is the hypervisor going to forward
>> >as it sees appropriate (OvS + "repr"/port netdev)?
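Since the subbus RFC is not public yet, here is a very rough sketch of the
SF registration flow described above, just to make it concrete. Every
subbus_* name is invented and the real proposal may look nothing like this;
only the driver core calls (bus_register()/device_register()) are real:

#include <linux/device.h>
#include <linux/slab.h>
#include <linux/types.h>

/* one "subbus" device per SF/subport spawned by a PF driver */
struct subbus_device {
	struct device dev;
	u32 sf_index;
};

static struct bus_type subbus_type = {
	.name = "subbus",	/* bus_register(&subbus_type) once at module init */
};

static void subbus_dev_release(struct device *dev)
{
	kfree(container_of(dev, struct subbus_device, dev));
}

/*
 * Called by the PF driver for each SF it spawns. An SF driver bound to
 * subbus_type would then create the devlink instance, devlink port and
 * netdev for that SF from its probe callback.
 */
static int subbus_dev_add(struct device *pf_dev, u32 sf_index)
{
	struct subbus_device *sbdev;
	int err;

	sbdev = kzalloc(sizeof(*sbdev), GFP_KERNEL);
	if (!sbdev)
		return -ENOMEM;

	sbdev->sf_index = sf_index;
	sbdev->dev.bus = &subbus_type;
	sbdev->dev.parent = pf_dev;	/* the SF shows up under its PF in sysfs */
	sbdev->dev.release = subbus_dev_release;
	dev_set_name(&sbdev->dev, "%s.sf%u", dev_name(pf_dev), sf_index);

	err = device_register(&sbdev->dev);
	if (err)
		put_device(&sbdev->dev);	/* release() frees sbdev */
	return err;
}

The devlink handle for an SF could then be derived from the bus and device
name (subbus/<dev_name>) instead of a PCI address, which is the "devlink
handle derived from that" part above.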