On Tue, Oct 29, at 4:31 pm, Jiang, Yunhong <yunhong.ji...@intel.com> wrote:
> Henry, why do you think the "service VM" needs the entire PF instead of a
> VF? I think the SR-IOV NIC should provide QoS and performance isolation.

I was speculating. I just thought it might be a good idea to leave open the
possibility of assigning a PF to a VM if the need arises. Neutron service VMs
are a new thing. I will be following the discussions and there is a summit
session for them. It remains to be seen whether there is any desire/need for
full PF ownership of NICs. But if a service VM owns the PF and has the right
NIC driver, it could make use of some advanced features.

> As for assigning an entire PCI device to a guest, that should be OK since
> usually the PF and VF have different device IDs. The tricky thing is, at
> least for some PCI devices, you can't configure some NICs to have SR-IOV
> enabled while others do not.

Thanks for the warning. :) Perhaps the cloud admin might plug an extra NIC
into just a few nodes (one or two per rack, maybe) for the purpose of running
service VMs there. Again, just speculating. I don't know how hard it is to
manage non-homogeneous nodes.

> Thanks
> --jyh
>
>> -----Original Message-----
>> From: Henry Gessau [mailto:ges...@cisco.com]
>> Sent: Tuesday, October 29, 2013 8:10 AM
>> To: OpenStack Development Mailing List (not for usage questions)
>> Subject: Re: [openstack-dev] [nova] [neutron] PCI pass-through network support
>>
>> Lots of great info and discussion going on here.
>>
>> One additional thing I would like to mention is regarding PF and VF usage.
>>
>> Normally VFs will be assigned to instances, and the PF will either not be
>> used at all, or maybe some agent on the host of the compute node might
>> have access to the PF for something (management?).
>>
>> There is a neutron design track around the development of "service VMs".
>> These are dedicated instances that run neutron services like routers,
>> firewalls, etc. It is plausible that a service VM would like to use PCI
>> passthrough and get the entire PF.
>> This would allow it to have complete control over a physical link, which I
>> think will be wanted in some cases.
>>
>> --
>> Henry
>>
>> On Tue, Oct 29, at 10:23 am, Irena Berezovsky <ire...@mellanox.com> wrote:
>>
>>> Hi,
>>>
>>> I would like to share some details regarding the support provided by the
>>> Mellanox plugin. It enables networking via SRIOV pass-through devices or
>>> macvtap interfaces. The plugin is available here:
>>> https://github.com/openstack/neutron/tree/master/neutron/plugins/mlnx.
>>>
>>> To support both the PCI pass-through device and macvtap interface types
>>> of vNICs, we set the neutron port profile:vnic_type according to the
>>> required VIF type and then use the created port to 'nova boot' the VM.
>>>
>>> To overcome the missing scheduler awareness for PCI devices, which was
>>> not yet part of the Havana release, we have an additional service
>>> (embedded switch daemon) that runs on each compute node.
>>>
>>> This service manages SRIOV resource allocation, answers vNIC discovery
>>> queries and applies VLAN/MAC configuration using standard Linux APIs
>>> (code is here: https://github.com/mellanox-openstack/mellanox-eswitchd).
>>> The embedded switch daemon serves as a glue layer between the VIF driver
>>> and the Neutron agent.
>>>
>>> In the Icehouse release, when SRIOV resource allocation is already part
>>> of Nova, we plan to eliminate the need for the embedded switch daemon
>>> service. So what is left to figure out is how to tie the neutron port to
>>> the PCI device and invoke the networking configuration.
>>>
>>> In our case, what we have is actually a hardware VEB that is not
>>> programmed via either 802.1Qbg or 802.1Qbh, but configured locally by the
>>> Neutron agent. We also support both Ethernet and InfiniBand as the
>>> physical network L2 technology. This means that we apply different
>>> configuration commands to set the configuration on the VF.
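The two-step flow Irena describes (create a neutron port carrying the requested vnic_type, then 'nova boot' the VM with that port) could be sketched roughly as below. The profile:vnic_type field follows her description; create_port() and boot_vm() are hypothetical stand-ins for the neutron/nova API calls, illustrating only the data shapes involved:

```python
# Minimal sketch of the Mellanox-style flow:
# 1) create a neutron port whose profile requests a vNIC type,
# 2) boot the instance against that port.
# These helpers are hypothetical, not the actual plugin code.

def create_port(network_id, vnic_type):
    """Pretend neutron port-create: record the requested vNIC type."""
    assert vnic_type in ("direct", "macvtap")  # SRIOV passthrough or macvtap
    return {
        "id": "port-1",
        "network_id": network_id,
        "profile": {"vnic_type": vnic_type},
    }

def boot_vm(name, port):
    """Pretend 'nova boot --nic port-id=...': attach the VM to the port."""
    return {"name": name, "nics": [{"port-id": port["id"]}]}

port = create_port("net-42", "direct")
vm = boot_vm("demo-vm", port)
print(port["profile"]["vnic_type"])  # direct
print(vm["nics"][0]["port-id"])      # port-1
```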
>>> I guess what we have to figure out is how to support the generic case of
>>> PCI device networking support, covering the HW VEB, 802.1Qbg and
>>> 802.1Qbh cases.
>>>
>>> BR,
>>> Irena
>>>
>>> From: Robert Li (baoli) [mailto:ba...@cisco.com]
>>> Sent: Tuesday, October 29, 2013 3:31 PM
>>> To: Jiang, Yunhong; Irena Berezovsky; prashant.upadhy...@aricent.com;
>>> chris.frie...@windriver.com; He, Yongli; Itzik Brown
>>> Cc: OpenStack Development Mailing List; Brian Bowen (brbowen); Kyle
>>> Mestery (kmestery); Sandhya Dasu (sadasu)
>>> Subject: Re: [openstack-dev] [nova] [neutron] PCI pass-through network support
>>>
>>> Hi Yunhong,
>>>
>>> I haven't looked at Mellanox in much detail. I think that we'll get more
>>> details from Irena down the road. Regarding your question, I can only
>>> answer based on my experience with Cisco's VM-FEX. In a nutshell:
>>>
>>> -- A vNIC is connected to an external switch. Once the host is booted up,
>>> all the PFs and VFs provisioned on the vNIC will be created, as well as
>>> all the corresponding ethernet interfaces.
>>>
>>> -- As far as Neutron is concerned, a neutron port can be associated with
>>> a VF. One way to do so is to specify this requirement in the --nic
>>> option, providing information such as:
>>>      . PCI alias (this is the same alias as defined in your nova blueprints)
>>>      . direct pci-passthrough/macvtap
>>>      . port profileid that is compliant with 802.1Qbh
>>>
>>> -- Similar to how you translate a nova flavor with PCI requirements into
>>> PCI requests for scheduling purposes, Nova API (the nova-api component)
>>> can translate the above into PCI requests for scheduling purposes. I can
>>> give more detail on this later.
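The translation Robert describes above, from SR-IOV attributes on --nic into a PCI request the scheduler can match, could be sketched like this. The request fields are illustrative only, loosely modeled on the pci-passthrough blueprints; nic_option_to_pci_request() and the field names are assumptions, not actual Nova code:

```python
# Sketch: turn one --nic option (as a dict) into a PCI request for the
# scheduler. A --nic without SR-IOV attributes keeps the existing behavior.

def nic_option_to_pci_request(nic_opt):
    """Build a PCI request from the SR-IOV attributes of a --nic option."""
    if "pci-alias" not in nic_opt:
        return None  # no SR-IOV attributes: no PCI request is created
    return {
        "alias": nic_opt["pci-alias"],        # matches a whitelisted device class
        "count": 1,                           # one VF per vNIC
        "attach_type": nic_opt.get("vnic-type", "direct"),  # direct or macvtap
        # carried through so neutron can later program the 802.1Qbh switch
        "port_profile": nic_opt.get("profileid"),
    }

req = nic_option_to_pci_request(
    {"net-id": "net-42", "pci-alias": "intel-82599-vf",
     "vnic-type": "direct", "profileid": "my-port-profile"})
print(req["alias"], req["count"], req["port_profile"])
# A plain --nic falls back to the existing behavior:
print(nic_option_to_pci_request({"net-id": "net-42"}))
```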
>>> Regarding your last question: since the vNIC is already connected to the
>>> external switch, the vNIC driver will be responsible for communicating
>>> the port profile to the external switch. As you already know, libvirt
>>> provides several ways to specify that a VM be booted up with SRIOV. For
>>> example, consider the following interface definition:
>>>
>>>   <interface type='hostdev' managed='yes'>
>>>     <source>
>>>       <address type='pci' domain='0' bus='0x09' slot='0x0' function='0x01'/>
>>>     </source>
>>>     <mac address='01:23:45:67:89:ab'/>
>>>     <virtualport type='802.1Qbh'>
>>>       <parameters profileid='my-port-profile'/>
>>>     </virtualport>
>>>   </interface>
>>>
>>> The SRIOV VF (bus 0x09, VF 0x01) will be allocated, and the port profile
>>> 'my-port-profile' will be used to provision this VF. Libvirt will be
>>> responsible for invoking the vNIC driver to configure this VF with the
>>> port profile my-port-profile. The driver will talk to the external switch
>>> using the 802.1Qbh standards to complete the VF's configuration and
>>> binding with the VM.
>>>
>>> Now that nova PCI passthrough is responsible for
>>> discovering/scheduling/allocating a VF, the rest of the puzzle is to
>>> associate this PCI device with the feature that's going to use it, and
>>> the feature will be responsible for configuring it. You can also see from
>>> the above example that in one implementation of SRIOV, the feature (in
>>> this case neutron) may not need to do much in terms of working with the
>>> external switch; the work is actually done by libvirt behind the scenes.
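A VIF driver could render the interface definition shown above from the allocated VF's PCI address and the port profile. Here is a minimal sketch; the input shape (a plain dict of PCI address fields) and the helper name are hypothetical, not Nova's actual vif object:

```python
# Sketch: render the libvirt <interface type='hostdev'> element from the
# VF's PCI address, the MAC, and the 802.1Qbh port profile.

TEMPLATE = """<interface type='hostdev' managed='yes'>
  <source>
    <address type='pci' domain='%(domain)s' bus='%(bus)s' slot='%(slot)s' function='%(function)s'/>
  </source>
  <mac address='%(mac)s'/>
  <virtualport type='802.1Qbh'>
    <parameters profileid='%(profileid)s'/>
  </virtualport>
</interface>"""

def build_hostdev_interface(pci, mac, profileid):
    """Fill in the hostdev interface template for one allocated VF."""
    params = dict(pci, mac=mac, profileid=profileid)
    return TEMPLATE % params

xml = build_hostdev_interface(
    {"domain": "0", "bus": "0x09", "slot": "0x0", "function": "0x01"},
    "01:23:45:67:89:ab", "my-port-profile")
print(xml)
```

With this shape, libvirt (not neutron) drives the 802.1Qbh exchange with the external switch, matching Robert's point above.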
>>> Now the questions are:
>>>
>>> -- how the port profile gets defined/managed
>>> -- how the port profile gets associated with a neutron network
>>>
>>> The first question will be specific to the particular product, and
>>> therefore a particular neutron plugin has to manage that.
>>>
>>> There may be several approaches to addressing the second question. For
>>> example, in the simplest case, a port profile can be associated with a
>>> neutron network. This has some significant drawbacks. Since the port
>>> profile defines features for all the ports that use it, a one-port-profile
>>> to one-neutron-network mapping would mean that all the ports on the
>>> network have exactly the same features (for example, QoS characteristics).
>>> To make it flexible, the binding of a port profile to a port may be done
>>> at port creation time.
>>>
>>> Let me know if the above answers your question.
>>>
>>> thanks,
>>> Robert
>>>
>>> On 10/29/13 3:03 AM, "Jiang, Yunhong" <yunhong.ji...@intel.com> wrote:
>>>
>>>> Robert, is it possible to have an IRC meeting? I'd prefer an IRC meeting
>>>> because it's more openstack style and can also keep the minutes clearly.
>>>>
>>>> For your flow, can you give a more detailed example? For example, say
>>>> the user specifies the instance with a --nic option giving a network id;
>>>> how does nova derive the PCI device requirement from that? I assume the
>>>> network id should define the switches that the device can connect to,
>>>> but how is that information translated into the PCI property
>>>> requirement? Will this translation happen before the nova scheduler
>>>> makes the host decision?
>>>> Thanks
>>>> --jyh
>>>
>>> From: Robert Li (baoli) [mailto:ba...@cisco.com]
>>> Sent: Monday, October 28, 2013 12:22 PM
>>> To: Irena Berezovsky; prashant.upadhy...@aricent.com; Jiang, Yunhong;
>>> chris.frie...@windriver.com; He, Yongli; Itzik Brown
>>> Cc: OpenStack Development Mailing List; Brian Bowen (brbowen); Kyle
>>> Mestery (kmestery); Sandhya Dasu (sadasu)
>>> Subject: Re: [openstack-dev] [nova] [neutron] PCI pass-through network support
>>>
>>> Hi Irena,
>>>
>>> Thank you very much for your comments. See inline.
>>>
>>> --Robert
>>>
>>> On 10/27/13 3:48 AM, "Irena Berezovsky" <ire...@mellanox.com> wrote:
>>>
>>>> Hi Robert,
>>>> Thank you very much for sharing the information regarding your efforts.
>>>> Can you please share your idea of the end-to-end flow? How do you
>>>> suggest binding Nova and Neutron?
>>>
>>> The end-to-end flow is actually encompassed in the blueprints, in a
>>> nutshell. I will reiterate it below. The binding between Nova and Neutron
>>> occurs with the neutron v2 API that nova invokes in order to provision
>>> the neutron services. The VIF driver is responsible for plugging an
>>> instance into the networking setup that neutron has created on the host.
>>>
>>> Normally, one will invoke the "nova boot" API with the --nic option to
>>> specify the NIC with which the instance will be connected to the network.
>>> It currently allows a net-id, a fixed IP and/or a port-id to be specified
>>> for the option. However, it doesn't allow one to specify special
>>> networking requirements for the instance. Thanks to the nova
>>> pci-passthrough work, one can specify PCI passthrough device(s) in the
>>> nova flavor.
>>> But it doesn't provide the means to tie up these PCI devices (in the case
>>> of ethernet adapters) with networking services. Therefore the idea is
>>> actually simple, as indicated by the blueprint titles: to provide the
>>> means to tie up SRIOV devices with neutron services. A work flow would
>>> roughly look like this for 'nova boot':
>>>
>>> -- Specify networking requirements in the --nic option. Specifically for
>>> SRIOV, allow the following to be specified in addition to the existing
>>> required information:
>>>      . PCI alias
>>>      . direct pci-passthrough/macvtap
>>>      . port profileid that is compliant with 802.1Qbh
>>>
>>> The above information is optional. In its absence, the existing behavior
>>> remains.
>>>
>>> -- If special networking requirements exist, Nova API creates PCI
>>> requests in the nova instance type for scheduling purposes.
>>>
>>> -- The Nova scheduler schedules the instance based on the requested
>>> flavor plus the PCI requests that are created for networking.
>>>
>>> -- Nova compute invokes neutron services with the PCI passthrough
>>> information, if any.
>>>
>>> -- Neutron performs its normal operations based on the request, such as
>>> allocating a port, assigning IP addresses, etc. Specific to SRIOV, it
>>> should validate information such as the profileid, and store it in its
>>> db. It's also possible to associate a port profileid with a neutron
>>> network so that the port profileid becomes optional in the --nic option.
>>> Neutron returns nova the port information, especially the PCI passthrough
>>> related information, in the port binding object.
>>> Currently, the port binding object contains the following information:
>>>      binding:vif_type
>>>      binding:host_id
>>>      binding:profile
>>>      binding:capabilities
>>>
>>> -- Nova constructs the domain xml and plugs in the instance by calling
>>> the VIF driver. The VIF driver can build up the interface xml based on
>>> the port binding information.
>>>
>>>> The blueprints you registered make sense. On the Nova side, there is a
>>>> need to bind between the requested virtual network and the PCI
>>>> device/interface to be allocated as a vNIC.
>>>> On the Neutron side, there is a need to support networking configuration
>>>> of the vNIC. Neutron should be able to identify the PCI device/macvtap
>>>> interface in order to apply the configuration. I think it makes sense to
>>>> provide neutron integration via a dedicated Modular Layer 2 mechanism
>>>> driver to allow PCI pass-through vNIC support along with other
>>>> networking technologies.
>>>
>>> I haven't sorted through this yet. A neutron port could be associated
>>> with a PCI device or not, which is a common feature, IMHO. However, an
>>> ML2 driver may be needed specific to a particular SRIOV technology.
>>>
>>>> During the Havana release, we introduced the Mellanox Neutron plugin
>>>> that enables networking via SRIOV pass-through devices or macvtap
>>>> interfaces.
>>>> We want to integrate our solution with the PCI pass-through Nova
>>>> support. I will be glad to share more details if you are interested.
>>>
>>> Good to know that you already have a SRIOV implementation. I found some
>>> information online about the mlnx plugin, but need more time to get to
>>> know it better. And certainly I'm interested in knowing its details.
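As a rough illustration of the workflow described above, here is how neutron might resolve the effective profileid (a port-level value overriding a network-level default, one of the options Robert mentions) and return it in the port binding object. The binding:* keys come from the thread; everything else, including the function names and the network default field, is a hypothetical sketch:

```python
# Sketch: build a port binding object for an SRIOV port. The resolution
# rule (explicit port profileid wins, else the network's default applies)
# follows the option discussed in the thread; the rest is illustrative.

def resolve_profileid(port_profileid, network):
    """Port-level profileid wins; fall back to the network's default."""
    return port_profileid or network.get("default_profileid")

def build_port_binding(host, vif_type, port_profileid, network):
    return {
        "binding:vif_type": vif_type,      # e.g. 'hostdev' for direct passthrough
        "binding:host_id": host,
        "binding:profile": {
            "profileid": resolve_profileid(port_profileid, network),
        },
        "binding:capabilities": {"port_filter": False},
    }

net = {"id": "net-42", "default_profileid": "net-default-profile"}
# Explicit profileid supplied on the port:
b1 = build_port_binding("compute-1", "hostdev", "my-port-profile", net)
print(b1["binding:profile"]["profileid"])  # my-port-profile
# No profileid on the port: the network default applies.
b2 = build_port_binding("compute-1", "hostdev", None, net)
print(b2["binding:profile"]["profileid"])  # net-default-profile
```

The VIF driver would then consume such a binding to build the interface xml for the domain definition.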
>>>> The PCI pass-through networking support is planned to be discussed
>>>> during the summit: http://summit.openstack.org/cfp/details/129. I think
>>>> it's worth drilling down into a more detailed proposal and presenting it
>>>> during the summit, especially since it impacts both the nova and neutron
>>>> projects.
>>>
>>> I agree. Maybe we can steal some time in that discussion.
>>>
>>>> Would you be interested in collaborating on this effort? Would you be
>>>> interested in exchanging more emails or setting up an IRC/WebEx meeting
>>>> during this week before the summit?
>>>
>>> Sure. If folks want to discuss it before the summit, we can schedule a
>>> webex later this week. Otherwise, we can continue the discussion over
>>> email.
>>>
>>>> Regards,
>>>> Irena
>>>>
>>>> From: Robert Li (baoli) [mailto:ba...@cisco.com]
>>>> Sent: Friday, October 25, 2013 11:16 PM
>>>> To: prashant.upadhy...@aricent.com; Irena Berezovsky;
>>>> yunhong.ji...@intel.com; chris.frie...@windriver.com; yongli...@intel.com
>>>> Cc: OpenStack Development Mailing List; Brian Bowen (brbowen); Kyle
>>>> Mestery (kmestery); Sandhya Dasu (sadasu)
>>>> Subject: Re: [openstack-dev] [nova] [neutron] PCI pass-through network support
>>>>
>>>> Hi Irena,
>>>>
>>>> This is Robert Li from Cisco Systems. Recently, I was tasked with
>>>> investigating such support for Cisco's systems that support VM-FEX,
>>>> which is a SRIOV technology supporting 802.1Qbh. I was able to bring up
>>>> nova instances with SRIOV interfaces, and establish networking between
>>>> the instances that employ the SRIOV interfaces. Certainly, this was
>>>> accomplished with hacking and some manual intervention.
>>>> Based on this experience and my study of the two existing nova
>>>> pci-passthrough blueprints that have been implemented and committed into
>>>> Havana (https://blueprints.launchpad.net/nova/+spec/pci-passthrough-base
>>>> and https://blueprints.launchpad.net/nova/+spec/pci-passthrough-libvirt),
>>>> I registered a couple of blueprints (one on the Nova side, the other on
>>>> the Neutron side):
>>>>
>>>> https://blueprints.launchpad.net/nova/+spec/pci-passthrough-sriov
>>>> https://blueprints.launchpad.net/neutron/+spec/pci-passthrough-sriov
>>>>
>>>> in order to address SRIOV support in openstack.
>>>>
>>>> Please take a look at them and see if they make sense, and let me know
>>>> any comments and questions. We can also discuss this at the summit, I
>>>> suppose.
>>>>
>>>> I noticed that there is another thread on this topic, so I am copying
>>>> those folks from that thread as well.
>>>>
>>>> thanks,
>>>> Robert
>>>>
>>>> On 10/16/13 4:32 PM, "Irena Berezovsky" <ire...@mellanox.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> As one of the next steps for PCI pass-through, I would like to discuss
>>>>> the support for PCI pass-through vNICs.
>>>>>
>>>>> While nova takes care of PCI pass-through device resource management
>>>>> and VIF settings, neutron should manage their networking configuration.
>>>>>
>>>>> I would like to register a summit proposal to discuss the support for
>>>>> PCI pass-through networking.
>>>>>
>>>>> I am not sure what would be the right topic for discussing PCI
>>>>> pass-through networking, since it involves both nova and neutron.
>>>>>
>>>>> There is already a session registered by Yongli on the nova topic to
>>>>> discuss the PCI pass-through next steps.
>>>>>
>>>>> I think PCI pass-through networking is quite a big topic and it is
>>>>> worth a separate discussion.
>>>>> Is there anyone else who is interested in discussing it and sharing
>>>>> their thoughts and experience?
>>>>>
>>>>> Regards,
>>>>> Irena

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev