Hi Zane,

Thank you for your very valuable post. We should convert your suggestions into multiple bps.

2014-04-07 17:28 GMT-07:00 Zane Bitter <zbit...@redhat.com>:
> The Neutron API is a constant cause of pain for us as Heat developers, but
> afaik we've never attempted to bring up the issues we have found in a
> cross-project forum. I've recently been doing some more investigation and I
> want to document the exact ways in which the current Neutron API breaks
> orchestration, both in the hope that a future version of it might be better
> and as a guide for other API authors.
>
> BTW it's my contention that an API that is bad for orchestration is also
> hard to use for the ordinary user as well. When you're trying to figure out
> the order of operations you need to do, there are two times at which you
> could find out you've got it wrong:
>
> 1) Before you run the command, when you realise you don't have all of the
> required data yet; or
> 2) After you run the command, when you get a cryptic error message.
>
> Not only is (1) *mandatory* for a data-driven orchestration system like
> Heat, it offers orders-of-magnitude better user experience for everyone.
>
> I should say at the outset that I know next to nothing about Neutron, and
> one of the goals of this message is to find out which parts I am completely
> wrong about. I did know a little bit about traditional networking at one
> time, and even remember some of it ;)
>
>
> Neutron has a little documentation on workflow, so let's begin there:
> http://docs.openstack.org/api/openstack-network/2.0/content/Overview-d1e71.html#Theory
>
> (1) Create a network
> Instinctively, I want a Network to be something like a virtual VRF (VVRF?):
> a separate namespace with its own route table, within which subnet prefixes
> are not overlapping, but which is completely independent of other Networks
> that may contain overlapping subnets. As far as I can tell, this basically
> seems to be the case.
> The difference, of course, is that instead of having
> to configure a VRF on every switch/router and make sure they're all in sync
> and connected up in the right ways, I just define it in one place globally
> and Neutron does the rest. I call this #winning. Nice work, Neutron.

In Neutron, "A network is an isolated virtual layer-2 broadcast domain"
http://docs.openstack.org/api/openstack-network/2.0/content/Overview-d1e71.html#subnet
so the network model itself doesn't include any L3 concepts.

> (2) Associate a subnet with the network
> Slightly odd choice of words, because you're actually creating a new Subnet
> (there's no such thing as a Subnet not associated with a Network), but this
> is probably just a minor documentation nit. Instinctively, I want a Subnet
> to be something like a virtual VLAN (VVLAN?): at its most basic level, just
> a group of ports that share a broadcast domain, but also having other
> properties (e.g. if L3 is in use, all IP addresses in the subnet should be
> in the same CIDR). This doesn't seem to be the case, though, it's just a
> CIDR prefix, which leaves me wondering how L2 traffic will be treated, as
> well as how I would do things like use both IPv4 and IPv6 on a single port
> (by assigning a port to multiple Subnets?). Looking at the docs, there is a
> much bigger emphasis on DHCP client settings than I expected - surely I
> might want to give two sets of ports in the same Subnet different
> DHCP configs? Still, this is not bad - the DHCP configuration is done by the
> time the Subnet is created, so there's no problem in connecting stuff to it
> immediately after.

The term "subnet" has many meanings. In Neutron, it means "A subnet represents an IP address block that can be used to assign IP addresses to virtual instances."
http://docs.openstack.org/api/openstack-network/2.0/content/Overview-d1e71.html#subnet
So "subnet" in your definition is more like "network" in Neutron.
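
To make the distinction concrete, here is a minimal sketch of the two concepts as described in the quoted documentation: a network is only an L2 broadcast domain, and a subnet is only an IP address block attached to it. The `Network` class and `add_subnet` method are hypothetical names for illustration, not Neutron code, and the overlap check within one network is an assumption of this sketch.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Network:
    """Hypothetical model: an isolated L2 broadcast domain with no L3 attributes."""
    name: str
    subnets: list = field(default_factory=list)

    def add_subnet(self, cidr):
        """Attach an IP address block (IPv4 or IPv6) to this network."""
        block = ipaddress.ip_network(cidr)
        # Assumption of this sketch: blocks inside one network must not overlap.
        for existing in self.subnets:
            if block.version == existing.version and block.overlaps(existing):
                raise ValueError(f"{block} overlaps {existing} in {self.name}")
        self.subnets.append(block)
        return block


net_a = Network("net-a")
net_a.add_subnet("10.0.0.0/24")    # IPv4 block
net_a.add_subnet("2001:db8::/64")  # IPv6 block on the same L2 domain

net_b = Network("net-b")
net_b.add_subnet("10.0.0.0/24")    # same CIDR is fine in a different network
```

In this model, a port that needs both IPv4 and IPv6 would take one address from each block of the same network.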
>
> (3) Boot a VM and attach it to the network
> Here's where you completely lost me. I just created a Subnet - maybe a bunch
> of Subnets. I don't want to attach my VM just anywhere in the *Network*, I
> want to attach it to a *particular* Subnet. It's not at all obvious where my
> instance will get attached (at random?), because this API just plain takes
> the Wrong data type. As a user, I'm irritated and confused.

+1 for specifying a subnet when booting a server. We should have a bp on both the Nova side and the Neutron side.

> The situation for orchestration, though, is much, much worse. Because the
> server takes a reference to a network, the dependency graph generated from
> my template will look like this:
>
>   Network <---------- Subnet
>      ^
>       \
>        ------------ Server
>
> And yet if the Server is created before the Subnet (as will happen ~50% of
> the time), it will fail. And vice-versa on delete, where the server must be
> removed before the subnet. The dependency graph we needed to create was
> this:
>
>   Network <---------- Subnet <---------- Server
>
> The solution used here was to jury-rig the resource types in Heat with a
> hidden dependency. We can't know which Subnet the server will end up
> attached to, so we create hidden dependencies on all of the ones defined in
> the same template. There's nothing we can do about Subnets defined in
> different templates (Heat allows a tree of templates to be instantiated with
> a single command) - I'm not sure, but it may be possible even now to create
> a tree of stacks that in practice could never be successfully deleted.
>
> The Neutron models in Heat are so riddled with these kinds of invisible
> special-case hacks that all of our public documentation about how Heat can
> be expected to respond to a particular template is rendered effectively
> meaningless with respect to Neutron.
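
The ordering problem in the two graphs above can be sketched with a topological sort. This is only an illustration using Python's standard `graphlib` (Python 3.9+), not how Heat builds its graph; the resource names and the `creation_order` and `is_valid` helpers are hypothetical.

```python
from graphlib import TopologicalSorter


def creation_order(deps):
    """Return one valid creation order for {resource: set of prerequisites}."""
    return list(TopologicalSorter(deps).static_order())


def is_valid(order, deps):
    """Check that every prerequisite appears before the resource needing it."""
    pos = {name: i for i, name in enumerate(order)}
    return all(pos[d] < pos[r] for r, ds in deps.items() for d in ds)


# Graph derived from the template: the server only references the network.
derived = {"network": set(), "subnet": {"network"}, "server": {"network"}}

# Graph actually needed: the server cannot be created before the subnet.
required = {"network": set(), "subnet": {"network"}, "server": {"subnet"}}

bad_order = ["network", "server", "subnet"]
print(is_valid(bad_order, derived))   # allowed by the derived graph
print(is_valid(bad_order, required))  # rejected by the required graph
print(creation_order(required))
```

The hidden dependency described above amounts to adding the `server -> subnet` edge by hand, because nothing in the server's own arguments produces it.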
>
> I should add that we can't blame Nova here, because explicitly creating a
> Port doesn't help - it too takes only a network argument, despite
> _requiring_ a Subnet that it will be attached to, presumably at random.

We can create a port with subnet_id, so this looks like a doc bug for this part.
https://gist.github.com/nati/10080957

> In fact using a Port makes things even worse, because although there is an API
> for it Nova and Neutron seem to assume that nobody would ever use it, and
> therefore even if you create a port explicitly and pass it to Nova to
> connect a Server, when you disconnect the Server again the Port will be
> deleted at the same time as if you had let Nova create it implicitly for
> you. This issue is currently breaking stack updates because we tend to
> assume that once we've explicitly created something, it stays created.

I agree with this use case. We should have a bp to prevent automatic deletion of the port.
# On the other hand, an API user can also delete anything, so we can't assume that anything stays created. But this is a different topic.

> Evidently there is a mechanism for associating a Port with a Subnet, and
> that's by assigning a fixed IP - which is hardly ever what I want. There's
> no middle ground that I can find between specifying the exact, fixed IP for
> a port and just letting it end up somewhere - anywhere - on the network,
> entirely at random.

We should have a bp for this.

>
> Let's move on to the L3 extension, starting with Routers. There's kind of an
> inconsistency here, because Routers are virtual devices that I need to
> manage. Hitherto, the point of Neutron was to free me from managing
> individual devices and let me manage the network as a whole. Is there a
> reason I wouldn't want all of the Subnets in the Network to just do the
> Right Thing and make sure everywhere is reachable efficiently from
> everywhere else? If I want something separate, wouldn't I use a different
> Network?
> (It's not like I have any control over where in a Network ports get
> attached anyway.)

We are working on this issue in Group based policy. If we can implement it, there will be no need to think about virtual devices any more.

> Nonetheless, Routers exist and it appears I have to create one to route
> packets between Subnets. From an orchestration perspective, I'd like Router
> to take a list of Ports to attach to (and of course I'd like each Port to be
> explicitly associated with a Subnet!). I'd be out of luck though, because
> even though the Port list is a property of a Router, you can't set it at
> creation time, only through an update. This is by definition possible to do
> at creation time (if I can do a create call immediately followed by an
> update call then the Neutron API can certainly do this internally), so it's
> very strange to see it disallowed. Following this API led us to implement it
> wrong in Heat as well, leading to headaches with floating IPs, about which
> more later. We also mistakenly used a similar design for the Router's
> external gateway, but later corrected it by making it a property of the
> Router, as it is in the API (though we still have to live with a lengthy
> deprecation period). We'll probably end up doing the same with the
> interfaces.

+1 for having a bp for specifying subnets at router creation.

> Of course it goes without saying that the router gateway is just a reference
> to another network and, once again, requires a hidden dependency on all of
> the Subnets in the hopes of picking up the right one. BTW I'm just assuming
> that the definition of the gateway is "interface to another Network over
> which I will do NAT"? I assume that because of the generic way in which
> Floating IPs are handled, with a reference to an external network (I guess
> the operator provides the user with the Network UUID for the Internet?)
> It's not exactly clear why the external gateway is special enough that you can
> have only one interface of this type on a Router, but not so special that it
> would be considered a separate thing. There is also a separate Network
> Gateway, and I have no idea what that is...
>
> The big problem with Floating IPs is that you can't create them until all
> the necessary hops in the internetwork have been set up. And, once again,
> there's nothing in the creation parameters that would naturally order them -
> you just pass a reference to the external network. We still have a bug open
> on this, but what we will have to do is create a hidden dependency on any
> RouterInterfaces that connect any Routers whose external gateway is the same
> network where the floating IP is allocated. That's about as horrible as it
> sounds. A Floating IP needs to take as an argument a reference to the
> Router/Gateway which does the NAT:
>
>   External      External
>   Network <---- Subnet <---- (gateway)
>                                       \
>                                        Router <---- Floating IP
>   Internal                            /             /
>   Network <---- Subnet <------<---- Port <----

Sorry, I couldn't get the point here. I agree it is not flexible. So what model would you be happy with here?

> The bane of my existence during Icehouse development has been the
> ExtraRoutes table. First off, this is broken in a way completely unrelated
> to orchestration: you can't add, remove or change an entry in the table
> without rewriting the whole table, so the whole API is a giant race
> condition waiting to happen. (This can, and IMHO should, be fixed - at least
> for those using the official client - with an ETags header and the 409
> return code.) Everything about this API, though, is strange. It's another
> one of those only-on-update properties of a Router, though in this case
> that's forced by the fact that you can't attach the Router to its Subnets
> during its creation.
> An extra route doesn't behave at all like a static RIB
> entry (with a weight and an administrative distance), but much like a FIB
> entry (i.e. it's for routes that have already been selected to be active).
> That part makes sense, but the next hop for a FIB entry is a layer 2 address
> and this takes an IP address. That makes no sense to me, since the IP
> address(es) assigned to the nexthop play no part in how packets are
> forwarded. And, of course, it creates massive dependency issues, because we
> don't know which ports are going to end up with the IP addresses required.
> This API should take a reference to a Port as the nexthop. I've been told we
> can't even simulate this in Heat at the moment because a VPN connection
> doesn't have a port associated with it. (If the API accepted _either_ a Port
> or a VPN connection, that would be fine by me though.) So far we've been
> unable to merge ExtraRoutes into Heat, except for a plugin in /contrib, for
> want of a way to make this reliably work in the correct dependency order
> without resorting to progressively worse hacks.

Sorry, I proposed the extension, so please blame me (hopefully, softly.. :) ).
That was a small first step; we expected only limited use cases at first.
It was implemented as a single attribute of the router, so, like the security group list on a port, the whole list has to be sent on every update.
Fortunately, we have a bp proposed for this.
https://blueprints.launchpad.net/neutron/+spec/extended-route-params

> I'm sure fresh horrors await in corners I have not yet dug into. I must say
> that the VPN Service, happily, is one that seems to have done things right.
> Firewall looks pretty good in itself, although the fact that it is
> completely disjoint from any other configuration - i.e. you can't even
> specify which network it applies to, let alone which gateway - is
> incomprehensible.
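
On the ExtraRoutes race discussed above (the whole route list is rewritten on every update), the ETag idea can be sketched as optimistic concurrency: read the list together with a version tag, and have the write rejected if the tag has changed in the meantime. This is an in-memory illustration only. `RouterStore`, `Conflict`, and `add_route` are hypothetical names, not part of the Neutron API or client; the integer `version` stands in for an ETag and `Conflict` for the 409 response suggested above.

```python
class Conflict(Exception):
    """Stands in for an HTTP 409 response."""


class RouterStore:
    """Hypothetical server-side state: a route list plus a version tag."""

    def __init__(self):
        self.routes = []
        self.version = 0  # plays the role of an ETag

    def get(self):
        return list(self.routes), self.version

    def put(self, routes, if_match):
        # Reject the write if someone else changed the list since it was read.
        if if_match != self.version:
            raise Conflict("routes changed since they were read")
        self.routes = list(routes)
        self.version += 1


def add_route(store, route, retries=3):
    """Read-modify-write one route, retrying when a conflict is reported."""
    for _ in range(retries):
        routes, etag = store.get()
        try:
            store.put(routes + [route], if_match=etag)
            return
        except Conflict:
            continue
    raise Conflict("gave up after repeated conflicts")


store = RouterStore()
add_route(store, {"destination": "10.1.0.0/24", "nexthop": "10.0.0.5"})
```

Without the version check, two clients that read the same list and each append one route would silently overwrite each other; with it, the second writer sees the conflict and can re-read and retry.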
>
>
> Over the past couple of development cycles, we've seen a number of proposals
> to push orchestration-like features into Neutron itself. It is now clear to
> me why: because the Neutron API is illegible to external orchestration
> tools, this leads to people wanting to do an end run around it.
>
> I don't expect that the current API can be fixed without breaking backwards
> compatibility, but I hope that folks will take these concepts into account
> the next time the Neutron API gets revised. (I also hope we won't see any
> more proposals to effectively reimplement Heat behind the Neutron API ;)
> Please feel free to include [Heat] in any discussion along those lines, we'd
> be happy to give feedback on any given API designs. In exchange, if any
> Neutron folks are able to explain the exact ways in which my ideas about how
> the current Neutron API does and/or should work are wrong and/or crazy, I
> would be most appreciative :)

Best
Nachi

>
> cheers,
> Zane.
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev