Yes, scheduling was a big gnarly wart that was punted on in the first pass. The intention was that any driver you put in a single flavor would have equivalent capabilities, be plumbed to the same networks, etc.
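In rough code terms, the assumption amounts to a check like the sketch below (purely illustrative names, not the actual flavors API): every driver reachable through a flavor's service profiles advertises the same capability set, so any of them can serve any router created with that flavor and no per-interface driver selection is ever needed.

    def validate_flavor_drivers(flavor_id, drivers):
        """Reject a flavor whose drivers are not interchangeable."""
        # If two drivers in the same flavor advertise different
        # capability sets, per-interface driver selection (and all the
        # rescheduling pain below) becomes unavoidable.
        capability_sets = set(frozenset(d.capabilities) for d in drivers)
        if len(capability_sets) > 1:
            raise ValueError("flavor %s mixes drivers with different "
                             "capabilities: %s"
                             % (flavor_id, capability_sets))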
doug

> On Feb 1, 2016, at 7:08 AM, Kevin Benton <[email protected]> wrote:
>
> Hi all,
>
> I've been working on an implementation of the multiple L3 backends RFE [1]
> using the flavor framework, and I've run into some snags with the
> use-cases. [2]
>
> The first use-cases are relatively straightforward: the user requests a
> specific flavor, and that request gets dispatched to a driver associated
> with that flavor via a service profile. However, several of the use-cases
> are based around the idea that there is a single flavor with multiple
> drivers, and a specific driver will need to be used depending on the
> placement of the router interfaces; i.e. a router cannot be bound to a
> driver until an interface is attached.
>
> This creates some painful coordination problems among drivers. For
> example, say the first two networks that a user attaches a router to can
> be reached by all drivers because they use overlays, so the first driver
> chosen by the framework works fine. Then the user connects to an external
> network which is only reachable by a different driver. Do we immediately
> reschedule the entire router to the other driver at that point and
> interrupt the traffic between the first two networks?
>
> Even if we are fine with a traffic interruption for rescheduling, what
> should we do when a failure occurs halfway through switching over because
> the new driver fails to attach to one of the networks (or the old driver
> fails to detach from one)? It would seem the correct API experience would
> be to switch everything back and then return a failure to the caller
> trying to add an interface. This is where things get messy.
>
> If there is a failure during the switch back, we now have a single
> router's resources smeared across two drivers. We can drop the router
> into the ERROR state and re-attempt the switch in a periodic task, or
> maybe just leave it broken.
>
> How should we handle this much orchestration? Should we pull in something
> like taskflow, or maybe defer that use-case for now?
>
> What I want to avoid is what happened with ML2, where error handling is
> still a TODO in several cases. (e.g. any post-commit update or delete
> failure in mechanism drivers will not trigger a revert of state.)
>
> 1. https://bugs.launchpad.net/neutron/+bug/1461133
> 2. https://etherpad.openstack.org/p/neutron-modular-l3-router-plugin-use-cases
>
> --
> Kevin Benton
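For what it's worth, a rough sketch of what the taskflow option could look like for the attach side of a reschedule. All of the driver methods (attach_interface/detach_interface) and the set_error_status helper are hypothetical stand-ins rather than an existing Neutron interface, and the old-driver detach side is left out:

    from taskflow import engines
    from taskflow import task
    from taskflow.patterns import linear_flow


    class AttachInterface(task.Task):
        """Attach one router interface via a driver; detach it on revert."""

        def __init__(self, driver, network_id):
            super(AttachInterface, self).__init__(
                name='attach-%s' % network_id)
            self.driver = driver
            self.network_id = network_id

        def execute(self, router_id):
            # attach_interface stands in for whatever the real driver
            # interface ends up being.
            self.driver.attach_interface(router_id, self.network_id)

        def revert(self, router_id, **kwargs):
            # Called automatically when a later task in the flow fails,
            # giving us the "switch everything back" behavior. A real
            # implementation would need to tolerate reverting a
            # half-done attach.
            self.driver.detach_interface(router_id, self.network_id)


    def reschedule_router(router_id, new_driver, network_ids,
                          set_error_status):
        flow = linear_flow.Flow('reschedule-%s' % router_id)
        for net_id in network_ids:
            flow.add(AttachInterface(new_driver, net_id))
        try:
            engines.run(flow, store={'router_id': router_id})
        except Exception:
            # By the time we get here the reverts have already run. If
            # one of *those* failed too, the router's resources are
            # smeared across two drivers and ERROR is all we can
            # honestly report.
            set_error_status(router_id)
            raise

The appeal of taskflow here is that each task's revert() runs automatically when a later task fails, which gives the "switch everything back" behavior for free; a failure during the reverts themselves is exactly the smeared-across-two-drivers case, and ERROR plus a periodic retry is about all that's left at that point.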
__________________________________________________________________________ OpenStack Development Mailing List (not for usage questions) Unsubscribe: [email protected]?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
