Yes, scheduling was a big gnarly wart that was punted for the first pass. The 
intention was that every driver you put in a single flavor had equivalent 
capabilities and was plumbed to the same networks, etc.

doug


> On Feb 1, 2016, at 7:08 AM, Kevin Benton <[email protected]> wrote:
> 
> Hi all,
> 
> I've been working on an implementation of the multiple L3 backends RFE[1] 
> using the flavor framework and I've run into some snags with the use-cases[2].
> 
> The first use cases are relatively straightforward: the user requests a 
> specific flavor, and that request is dispatched to a driver associated with 
> that flavor via a service profile. However, several of the use-cases are 
> based around the idea that there is a single flavor with multiple drivers and 
> a specific driver will need to be used depending on the placement of the 
> router interfaces; i.e., a router cannot be bound to a driver until an 
> interface is attached.
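The straightforward flavor-based dispatch could be sketched roughly like this (all names here are illustrative, not the actual Neutron flavor framework API; the real framework resolves drivers through service profiles in the database):

```python
# Minimal sketch of flavor -> service profile -> driver dispatch.
# SERVICE_PROFILES and DRIVERS are stand-ins for what the flavor
# framework stores; names are hypothetical.

SERVICE_PROFILES = {
    "gold": {"driver": "vendor_x_l3"},
    "silver": {"driver": "reference_l3"},
}

DRIVERS = {
    "vendor_x_l3": lambda router: "vendor_x_l3 created %s" % router,
    "reference_l3": lambda router: "reference_l3 created %s" % router,
}

def create_router(name, flavor):
    profile = SERVICE_PROFILES[flavor]    # flavor -> service profile
    driver = DRIVERS[profile["driver"]]   # service profile -> driver
    return driver(name)                   # driver handles the request
```

The problem cases below arise when one flavor maps to several drivers and this lookup can no longer be done at creation time.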
> 
> This creates some painful coordination problems amongst drivers. For example, 
> say the first two networks that a user attaches a router to can be reached by 
> all drivers because they use overlays so the first driver chosen by the 
> framework works fine. Then the user connects the router to an external network 
> which is only reachable by a different driver. Do we immediately reschedule the entire 
> router at that point to the other driver and interrupt the traffic between 
> the first two networks?
> 
> Even if we are fine with a traffic interruption for rescheduling, what should 
> we do when a failure occurs half way through switching over because the new 
> driver fails to attach to one of the networks (or the old driver fails to 
> detach from one)? It would seem the correct API experience would be to switch 
> everything back and then return a failure to the caller trying to add an 
> interface. This is where things get messy.
> 
> If there is a failure during the switch back, we now have a single router's 
> resources smeared across two drivers. We can drop the router into the ERROR 
> state and re-attempt the switch in a periodic task, or maybe just leave it 
> broken.
> 
> How should we handle this much orchestration? Should we pull in something 
> like taskflow, or maybe defer that use case for now?
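The switch-and-revert sequence described above can be sketched in plain Python (the execute/revert pattern is what taskflow provides natively; the driver interface and names here are hypothetical, not actual Neutron code):

```python
# Sketch of rescheduling a router from old_driver to new_driver:
# attach all interfaces via the new driver, detach from the old one,
# and revert on failure. A failed revert leaves the router in ERROR,
# i.e. resources smeared across two drivers as discussed above.

class SwitchFailed(Exception):
    pass

def switch_driver(router, networks, old_driver, new_driver):
    attached, detached = [], []
    try:
        for net in networks:
            new_driver.attach(router, net)   # may raise mid-way
            attached.append(net)
        for net in networks:
            old_driver.detach(router, net)   # may also raise mid-way
            detached.append(net)
    except Exception:
        try:
            # Switch everything back, in reverse order of completion.
            for net in reversed(detached):
                old_driver.attach(router, net)
            for net in reversed(attached):
                new_driver.detach(router, net)
        except Exception:
            # Revert itself failed: router state is now split across
            # two drivers; all we can do is flag it for a retry task.
            router["status"] = "ERROR"
            raise SwitchFailed("revert failed; router left in ERROR")
        raise SwitchFailed("switch failed; router restored to old driver")
    router["driver"] = new_driver
```

Even this toy version shows the pain point: every step needs a correct undo, and the nested failure path still bottoms out in an ERROR state that something (a periodic task, presumably) has to clean up.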
> 
> What I want to avoid is what happened with ML2 where error handling is still 
> a TODO in several cases. (e.g. Any post-commit update or delete failures in 
> mechanism drivers will not trigger a revert in state.)
> 
> 1. https://bugs.launchpad.net/neutron/+bug/1461133
> 2. https://etherpad.openstack.org/p/neutron-modular-l3-router-plugin-use-cases
> -- 
> Kevin Benton
> 
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: [email protected]?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
