On Jul 19, 2013, at 1:58 PM, Aaron Rosen <aro...@nicira.com> wrote:
> 
> 
> 
> 
> On Fri, Jul 19, 2013 at 8:47 AM, Kyle Mestery (kmestery) <kmest...@cisco.com> 
> wrote:
> On Jul 18, 2013, at 5:16 PM, Aaron Rosen <aro...@nicira.com> wrote:
> >
> > Hi,
> >
> > I wanted to raise another design problem that shows why creating the port 
> > on nova-compute is bad. Previously, we encountered this bug 
> > (https://bugs.launchpad.net/neutron/+bug/1160442). The cause was that when 
> > nova-compute calls into quantum to create the port, quantum creates the 
> > port but fails to return it to nova and instead times out. When this 
> > happens the instance is scheduled to run on another compute node, where 
> > another port is created with the same device_id, and when the instance 
> > boots it will look like it has two ports. This is still a problem that can 
> > occur today in our current implementation (!).
> >
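
Just to make sure I follow the failure mode above: the guard that would avoid
the duplicate is basically a device_id lookup before the create, right?
Roughly something like this, assuming an already-authenticated
neutronclient/quantumclient v2 client; the helper and variable names below are
mine, not anything that exists today:

    def get_or_create_port(qclient, instance_uuid, network_id):
        # Sketch only: reuse an orphaned port from a failed earlier attempt
        # instead of creating a second port with the same device_id.
        existing = qclient.list_ports(device_id=instance_uuid)['ports']
        if existing:
            return existing[0]
        body = {'port': {'network_id': network_id,
                         'device_id': instance_uuid}}
        return qclient.create_port(body)['port']
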
> > I think in order to move forward with this we'll need to compromise. Here 
> > is my thought on how we should proceed.
> >
> > 1) Modify the quantum API so that mac addresses can be updated via the 
> > API. There is no reason to have this limitation (especially once the patch 
> > that uses dhcp_release is merged, as it will allow us to update the lease 
> > for the new mac immediately). We need this for bare metal support, as we 
> > need to match the mac address of the port to the compute node.
> >
> I don't understand how this relates to creating a port through nova-compute. 
> I'm not saying this is a bad idea; I just don't see how it relates to the 
> original discussion point on this thread around Yong's patch.
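
For what it's worth, if the restriction in (1) were lifted, I assume the
change is just that a plain port update with a new mac_address becomes legal,
something like the line below (today quantum rejects it); correct me if that
isn't what you meant:

    # Hypothetical until the API allows mac updates; 'qclient' is an
    # authenticated neutron/quantum v2 client, 'new_mac' the bare metal
    # NIC's address.
    qclient.update_port(port_id, {'port': {'mac_address': new_mac}})
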
> 
> > 2) Move the port creation from nova-compute to nova-api. This will solve a 
> > number of issues like the one I pointed out above.
> >
> This seems like a bad idea. So now a Nova API call will implicitly create a 
> Neutron port? What happens on failure here? The caller isn't aware the port 
> was created in Neutron if it's implicit, so who cleans things up? Or if the 
> caller is aware, then all we've done is move an API call the caller would 
> have made (nova-compute in this case) into nova-api, though the caller is 
> still aware of what's happening.
> 
> On failure here the VM will go to ERROR state if the port fails to be 
> created in quantum. Then, when deleting the instance, the delete code should 
> also search quantum for the device_id in order to remove the port there as 
> well.
> 
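
Just so we're talking about the same cleanup, I assume the delete path you're
describing is roughly the following (a sketch only, with an authenticated
neutron/quantum v2 client; the helper name is mine):

    def cleanup_instance_ports(qclient, instance_uuid):
        # Remove any quantum ports still carrying the instance's device_id,
        # even if nova never recorded the port (e.g. the create response
        # timed out).
        for port in qclient.list_ports(device_id=instance_uuid)['ports']:
            qclient.delete_port(port['id'])
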
So, nova-compute will implicitly know the port was created by nova-api, and if 
a failure happens, it will clean up the port? That doesn't sound like a 
balanced solution to me, and it seems to tie nova-compute and nova-api closely 
together when it comes to launching VMs with Neutron ports.

>  The issue here is that if an instance fails to boot on a compute node 
> (because nova-compute did not get the port-create response from quantum, 
> even though the port was actually created), the instance gets scheduled to 
> boot on another nova-compute node, where the duplicate create happens. 
> Moving the creation to the API node takes the port creation out of the retry 
> logic, which solves this. 
> 
I think Ian's comments on your blueprint [1] address this exact problem; can 
you take a look at them there?

[1] https://blueprints.launchpad.net/nova/+spec/nova-api-quantum-create-port

> > 3) For now, I'm okay with leaving logic on the compute node that calls 
> > update-port if the port binding extension is loaded. This will allow the 
> > vif type to be correctly set as well.
> >
> And this will also still pass in the hostname the VM was booted on?
> 
> In this case there would have to be an update-port call made from the 
> compute node to set the hostname (which is the same case as live migration). 
>  
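
So concretely the call from the compute host would be something like this
(sketch; assumes the portbindings extension is loaded, an authenticated
client, and that the port already exists):

    import socket

    # Update the existing port with the host the instance landed on so the
    # plugin can pick the right vif_type / segment; this is the same call
    # the live-migration path would make with the destination host.
    qclient.update_port(port_id,
                        {'port': {'binding:host_id': socket.gethostname()}})
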
Just to be sure I understand, will nova-compute do this, or will this be the 
responsibility of some neutron agent?

Thanks,
Kyle

> To me, this thread seems to have diverged a bit from the original discussion 
> point around Yong's patch. Yong's patch makes sense, because it's passing the 
> hostname the VM is booted on during port create. It also updates the binding 
> during a live migration, so that case is covered. Any change to this behavior 
> should cover both those cases and not involve any sort of agent polling, IMHO.
> 
> Thanks,
> Kyle
> 
> > Thoughts/Comments?
> >
> > Thanks,
> >
> > Aaron
> >
> >
> > On Mon, Jul 15, 2013 at 2:45 PM, Aaron Rosen <aro...@nicira.com> wrote:
> >
> >
> >
> > On Mon, Jul 15, 2013 at 1:26 PM, Robert Kukura <rkuk...@redhat.com> wrote:
> > On 07/15/2013 03:54 PM, Aaron Rosen wrote:
> > >
> > >
> > >
> > > On Sun, Jul 14, 2013 at 6:48 PM, Robert Kukura <rkuk...@redhat.com> wrote:
> > >
> > >     On 07/12/2013 04:17 PM, Aaron Rosen wrote:
> > >     > Hi,
> > >     >
> > >     >
> > >     > On Fri, Jul 12, 2013 at 6:47 AM, Robert Kukura <rkuk...@redhat.com> wrote:
> > >     >
> > >     >     On 07/11/2013 04:30 PM, Aaron Rosen wrote:
> > >     >     > Hi,
> > >     >     >
> > >     >     > I think we should revert this patch that was added here
> > >     >     > (https://review.openstack.org/#/c/29767/). What this patch
> > >     >     > does is that when nova-compute calls into quantum to create
> > >     >     > the port, it passes in the hostname on which the instance
> > >     >     > was booted. The idea of the patch was that providing this
> > >     >     > information would "allow hardware device vendors management
> > >     >     > stations to allow them to segment the network in a more
> > >     >     > precise manner (for example automatically trunk the vlan on
> > >     >     > the physical switch port connected to the compute node on
> > >     >     > which the vm instance was started)."
> > >     >     >
> > >     >     > In my opinion this is not the right approach. There are
> > >     >     > several other ways to get the information about where a
> > >     >     > specific port lives. For example, in the OVS plugin case the
> > >     >     > agent running on the nova-compute node can update the port
> > >     >     > in quantum to provide this information. Alternatively,
> > >     >     > quantum could query nova using the port.device_id to
> > >     >     > determine which server the instance is on.
> > >     >     >
> > >     >     > My motivation for removing this code is I now have the free
> > >     >     > cycles to work on
> > >     >     > https://blueprints.launchpad.net/nova/+spec/nova-api-quantum-create-port
> > >     >     > discussed here
> > >     >     > (http://lists.openstack.org/pipermail/openstack-dev/2013-May/009088.html).
> > >     >     > This was about moving the quantum port creation from the
> > >     >     > nova-compute host to nova-api if a network-uuid is passed
> > >     >     > in. This will allow us to remove all the quantum logic from
> > >     >     > the nova-compute nodes and simplify orchestration.
> > >     >     >
> > >     >     > Thoughts?
> > >     >
> > >     >     Aaron,
> > >     >
> > >     >     The ml2-portbinding BP I am currently working on depends on
> > >     >     nova setting the binding:host_id attribute on a port before
> > >     >     accessing binding:vif_type. The ml2 plugin's MechanismDrivers
> > >     >     will use the binding:host_id with the agents_db info to see
> > >     >     what (if any) L2 agent is running on that host, or what other
> > >     >     networking mechanisms might provide connectivity for that
> > >     >     host. Based on this, the port's binding:vif_type will be set
> > >     >     to the appropriate type for that agent/mechanism.
> > >     >
> > >     >     When an L2 agent is involved, the associated ml2
> > >     >     MechanismDriver will use the agent's interface or bridge
> > >     >     mapping info to determine whether the agent on that host can
> > >     >     connect to any of the port's network's segments, and select
> > >     >     the specific segment (network_type, physical_network,
> > >     >     segmentation_id) to be used. If there is no connectivity
> > >     >     possible on the host (due to either no L2 agent or other
> > >     >     applicable mechanism, or no mapping for any of the network
> > >     >     segments' physical_networks), the ml2 plugin will set the
> > >     >     binding:vif_type attribute to BINDING_FAILED. Nova will then
> > >     >     be able to gracefully put the instance into an error state
> > >     >     rather than have the instance boot without the required
> > >     >     connectivity.
> > >     >
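
Side note, mostly to check my own understanding of the flow Bob describes
above: in rough pseudo-Python it looks something like the sketch below. These
names are mine, not the actual ml2 interfaces.

    BINDING_FAILED = 'binding_failed'

    def bind_port(host_id, agents_db, network_segments):
        # agents_db: host -> agent record, as reported by the L2 agents.
        agent = agents_db.get(host_id)
        if agent is None:
            return BINDING_FAILED, None
        for segment in network_segments:
            # agent.mappings: physical_network -> interface/bridge on host.
            if segment['physical_network'] in agent.mappings:
                return agent.vif_type, segment   # e.g. 'ovs' or 'bridge'
        # No L2 agent mapping reaches any of the network's segments.
        return BINDING_FAILED, None
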
> > >     >     I don't see any problem with nova creating the port before
> > >     >     scheduling it to a specific host, but the binding:host_id
> > >     >     needs to be set before the binding:vif_type attribute is
> > >     >     accessed. Note that the host needs to be determined before the
> > >     >     vif_type can be determined, so it is not possible to rely on
> > >     >     the agent discovering the VIF, which can't be created until
> > >     >     the vif_type is determined.
> > >     >
> > >     >
> > >     > So what you're saying is that the current workflow is this:
> > >     > nova-compute creates a port in quantum, passing in the host-id
> > >     > (which is the hostname of the compute host). Quantum then looks in
> > >     > the agent table in its database to determine the VIF type that
> > >     > should be used, based on the agent that is running on the
> > >     > nova-compute node?
> > >
> > >     Most plugins just return a hard-wired value for binding:vif_type.
> > >     The ml2 plugin supports heterogeneous deployments, and therefore
> > >     needs more flexibility, so this is what's being implemented in the
> > >     agent-based ml2 mechanism drivers. Other mechanism drivers (i.e.
> > >     controller-based) would work differently. In addition to VIF type
> > >     selection, port binding in ml2 also involves determining if
> > >     connectivity is possible, and selecting the network segment to use,
> > >     and these are also based on binding:host_id.
> > >
> > >
> > > Can you go into more detail about what you mean by heterogeneous
> > > deployments (i.e. what the topology looks like)? Why would connectivity
> > > not be possible? I'm confused why things would be configured in such a
> > > way that the scheduler would launch an instance on a node for which
> > > quantum is not able to provide connectivity.
> >
> > By heterogeneous deployment, I meant that all compute nodes are not
> > necessarily identically configured. Some might be running the
> > openvswitch agent, some the linuxbridge agent, and some the hyperv
> > agent, but all able to access VLANs on (some of) the same trunks.
> >
> > One example of connectivity not being possible would be if multiple VLAN
> > trunks are in use in the datacenter, but not all compute nodes have
> > connections to every trunk.
> >
> > I agree the scheduler should ensure connectivity will be possible. But
> > mechanisms such as cells, zones, and flavors can also be used in nova to
> > manage heterogeneity. The ml2 port binding code should ideally never
> > find out the scheduled node does not have connectivity, but we've at
> > least defined what should happen if it does. The main need here though
> > is for the port binding code to select the segment to use.
> >
> > Why does the port binding code select which segment to use? I'm unclear
> > why anyone would ever have a deployment with a mix of vlans where things
> > are trunked in some places and not in others, and neutron would have to
> > keep up with that. The part I'm unclear on is how neutron would be
> > expected to behave in this type of setup. Say one boots several
> > instances: instance 1 lands on compute1 and neutron puts it on vlan X.
> > Later, instance 2 is booted and it lands on compute2, where vlan X isn't
> > reachable?
> >
> >
> >
> > >
> > >
> > >
> > >     > My question would be why the nova-compute node doesn't already
> > >     > know which VIF_TYPE it should be using?
> > >
> > >     I guess the thinking was that this knowledge belonged in quantum
> > >     rather than nova, and thus the GenericVifDriver was introduced in
> > >     grizzly. See
> > >     https://blueprints.launchpad.net/nova/+spec/libvirt-vif-driver and
> > >     https://blueprints.launchpad.net/neutron/+spec/vif-plugging-improvements.
> > >
> > >
> > > Thanks for the links. It seems like the motivation for this was to
> > > remove the libvirt vif configuration settings from nova and offload
> > > them to quantum via the vif_type param on a port. It also seems that
> > > when using a specific plugin, that plugin will always return the same
> > > vif_type for a given node. In my opinion this configuration option is
> > > best handled as part of your deployment automation and not baked into
> > > quantum ports.
> >
> > For monolithic plugins, returning a fixed vif_type works, but this is
> > not sufficient for ml2.
> >
> > I was happy with the old approach of configuring drivers in nova (via
> > deployment automation ideally), but the decision was made in grizzly to
> > switch to the GenericVifDriver.
> >
> > >
> > > My goal is to reduce the orchestration and complexity between nova and
> > > quantum. Currently, nova-api and nova-compute both call out to quantum
> > > when all of this could be done on the API node (ignoring bare metal for
> > > now, since in that case we'd need to do something special to handle
> > > updating the mac addresses on those logical ports in quantum).
> >
> > Sounds like the scheduler is going to need to call neutron as well, at
> > least in some cases.
> >
> > Why is this? The only use case I see so far for something other than 
> > nova-api to call into neutron would be bare metal. I think having neutron 
> > tell nova which vif type it should be using couples the nova+quantum 
> > integration really tightly. I think we should probably reexamine 
> > https://blueprints.launchpad.net/nova/+spec/libvirt-vif-driver, as setting 
> > the libvirt_type from the neutron side seems to be something that the 
> > sysadmin should configure once rather than rely on neutron to specify.
> >
> > Thanks,
> >
> > Aaron
> >
> > -Bob
> >
> > >
> > >
> > >     -Bob
> > >
> > >     >
> > >     >
> > >     >     Back when the port binding extension was originally being
> > >     >     hashed out, I had suggested using an explicit bind() operation
> > >     >     on port that took the host_id as a parameter and returned the
> > >     >     vif_type as a result. But the current attribute-based approach
> > >     >     was chosen instead. We could consider adding a bind()
> > >     >     operation for the next neutron API revision, but I don't see
> > >     >     any reason the current attribute-based binding approach cannot
> > >     >     work for now.
> > >     >
> > >     >     -Bob
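
To make the two shapes Bob mentions concrete, my reading is roughly the
difference between the two snippets below; the bind_port() method is purely
hypothetical and does not exist in any client today:

    # Current attribute-based binding: set binding:host_id via an update and
    # read binding:vif_type back off the returned port.
    port = qclient.update_port(port_id,
                               {'port': {'binding:host_id': host}})['port']
    vif_type = port['binding:vif_type']

    # Hypothetical explicit operation for a future API revision:
    # vif_type = qclient.bind_port(port_id, host)['vif_type']
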
> > >     >
> > >     >     >
> > >     >     > Best,
> > >     >     >
> > >     >     > Aaron
> > >     >     >
> > >     >     >
> > >     >     >
> > >     >
> > >     >
> > >     >
> > >     >
> > >     >
> > >
> > >
> >
> >
> >
> 
> 
> 
> 



_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
