Hi Jakub,

> -----Original Message-----
> From: netdev-ow...@vger.kernel.org <netdev-ow...@vger.kernel.org> On Behalf Of Jakub Kicinski
> Sent: Thursday, March 14, 2019 5:10 PM
> To: Jiri Pirko <j...@resnulli.us>
> Cc: da...@davemloft.net; netdev@vger.kernel.org; oss-driv...@netronome.com
> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI ports
>
> On Thu, 14 Mar 2019 08:38:40 +0100, Jiri Pirko wrote:
> > Wed, Mar 13, 2019 at 05:55:55PM CET, jakub.kicin...@netronome.com wrote:
> > >On Wed, 13 Mar 2019 17:22:43 +0100, Jiri Pirko wrote:
> > >> Wed, Mar 13, 2019 at 05:17:31PM CET, jakub.kicin...@netronome.com wrote:
> > >> >On Wed, 13 Mar 2019 07:07:01 +0100, Jiri Pirko wrote:
> > >> >> Tue, Mar 12, 2019 at 09:56:28PM CET, jakub.kicin...@netronome.com wrote:
> > >> >> >On Tue, 12 Mar 2019 15:02:39 +0100, Jiri Pirko wrote:
> > >> >> >> Tue, Mar 12, 2019 at 03:10:54AM CET, wrote:
> > >> >> >> >On Mon, 11 Mar 2019 09:52:04 +0100, Jiri Pirko wrote:
> > >> >> >> >> Fri, Mar 08, 2019 at 08:09:43PM CET, wrote:
> > >> >> >> >> >If the switchport is in the hypervisor then only the hypervisor can
> > >> >> >> >> >control switching/forwarding, correct?
> > >> >> >> >>
> > >> >> >> >> Correct.
> > >> >> >> >>
> > >> >> >> >> >The primary use case for partitioning within a VM (of a VF) would be
> > >> >> >> >> >containers (and DPDK)?
> > >> >> >> >>
> > >> >> >> >> Makes sense.
> > >> >> >> >>
> > >> >> >> >> >SR-IOV makes things harder. Splitting a PF is reasonably easy to grasp.
> > >> >> >> >> >I'm trying to get a sense of how we would control an SR-IOV
> > >> >> >> >> >environment as a whole.
> > >> >> >> >>
> > >> >> >> >> You mean orchestration?
> > >> >> >> >
> > >> >> >> >Right, orchestration.
> > >> >> >> >
> > >> >> >> >To be clear on where I'm going with this - if we want to allow VFs to
> > >> >> >> >partition themselves then they have to control what is effectively
> > >> >> >> >a "nested" switch. A per-VF set of rules which would then get
> > >> >> >>
> > >> >> >> Wait. If you allow making VF subports (I believe that is what you meant
> > >> >> >> by VFs partitioning themselves), that does not mean they will have a
> > >> >> >> separate nested switch. They would still belong under the same one.
> > >> >> >
> > >> >> >But that existing switch is administered by the hypervisor, so how would
> > >> >> >the VF owners install forwarding rules in a switch they don't control?
> > >> >>
> > >> >> They won't.
> > >> >
> > >> >Argh. So how is forwarding configured if there are no rules? Are you
> > >> >going to assume it's switching on MACs? We're supposed to offload
> > >> >software constructs. If it's a software port it needs to be explicitly
> > >> >switched. If it's not explicitly switched - we already have macvlan
> > >> >offload.
> > >>
> > >> Wait a second. You configure the switch. And for that, you have the
> > >> switchports (representors). What we are talking about are VF (VF
> > >> subport) host legs. Am I missing something?
> > >
> > >Hm :) So when a VM gets a new port, how is it connected? Are we
> > >assuming all ports of a VM are plugged into one big L2 switch?
> > >The use case for those subports is a little murky, sorry about the
> > >endless confusion :)
> >
> > Np.
> > When user John (on baremetal, or wherever the devlink instance
> > with the switch port is) creates a VF or VF subport by:
> > $ devlink dev port add pci/0000:05:00.0 flavour pci_vf pf 0
> > or by:
> > $ devlink dev port add pci/0000:05:00.0 flavour pci_vf pf 0 vf 0
> >
> > Then instances of flavour pci_vf are going to appear in the same
> > devlink instance. Those are the switch ports:
> > pci/0000:05:00.0/10002: type eth netdev enp5s0npf0pf0s0
> >                         flavour pci_vf pf 0 vf 0
> >                         switch_id 00154d130d2f peer pci/0000:05:10.1/0
> > pci/0000:05:00.0/10003: type eth netdev enp5s0npf0pf0s0
> >                         flavour pci_vf pf 0 vf 0 subport 1
> >                         switch_id 00154d130d2f peer pci/0000:05:10.1/1
> >
> > With that, peers are going to appear too, and those are the actual
> > VF/VF subport:
> > pci/0000:05:10.1/0: type eth netdev ??? flavour pci_vf_host
> >                     peer pci/0000:05:00.0/10002
> > pci/0000:05:10.1/1: type eth netdev ??? flavour pci_vf_host
> >                     peer pci/0000:05:00.0/10003
> >
> > Later you can push this VF, along with all its subports, to a VM. So in
> > the VM, you are going to see the VF like this:
> > $ devlink dev
> > pci/0000:00:08.0
> > $ devlink port
> > pci/0000:00:08.0/0: type eth netdev ??? flavour pci_vf_host
> > pci/0000:00:08.0/1: type eth netdev ??? flavour pci_vf_host
> >
> > And back to your question of how they are connected in the eswitch:
> > that is totally up to the original user John who did the creation.
> > He is in charge of the eswitch on baremetal; he would configure the
> > forwarding however he likes.
>
> Ack, so I think you're saying the VM has to communicate with the cloud
> environment to have this provisioned using some service API, not a kernel
> API. That's what I wanted to confirm.
>
> I don't see any benefit to having the "host ports" under devlink, as such I
> think it's a matter of preference.

We need 'host ports' to configure parameters of the host port that are not
exposed by the rep-netdev, such as the MAC address.
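For example, something along these lines on baremetal, before the VF is
pushed to the VM. This is purely illustrative: the hw_addr attribute below
is a hypothetical option, not something devlink accepts today; only the
"devlink port show" part exists as written.

$ devlink port show pci/0000:05:10.1/0
pci/0000:05:10.1/0: type eth netdev ??? flavour pci_vf_host peer pci/0000:05:00.0/10002

# hypothetical: set the MAC of the VF host leg via its devlink host port,
# since the rep-netdev on the eswitch side does not carry this attribute
$ devlink port set pci/0000:05:10.1/0 hw_addr 00:11:22:33:44:55

The exact syntax is of course up for discussion; the point is only that such
attributes belong to the host leg, so they need an object to hang off of.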
> I'll try to describe the two options to Netronome's FAEs and see which
> one they find more intuitive.
>
> Makes sense?