Mike, this has been really fun, but it is starting to feel like a rabbit hole.
The case for having one feels legitimate. However, at this point, I think
someone will need to actually build it, or the idea is just a pipe dream.

Excerpts from Mike Spreitzer's message of 2013-09-30 19:21:22 -0700:
> OK, let's take the holistic infrastructure scheduling out of Heat. It
> really belongs at a lower level anyway. Think of it as something you slap
> on top of Nova, Cinder, Neutron, etc. and everything that is going to use
> them goes first through the holistic scheduler, to give it a chance to
> make some joint decisions. Zane has been worried about conflicting
> decisions being made, but if everything goes through the holistic
> infrastructure scheduling service then there does not need to be an issue
> with other parallel decision-making services (more on this below). For a
> public cloud, think of this holistic infrastructure scheduling as part of
> the service that the cloud offers to the public; the public says what it
> wants, and the various levels of schedulers work on delivering it; the
> internals are not exposed to the public. For example, a cloud user may
> say "spread my cluster across at least two racks, not too unevenly"; you
> do not want that public cloud customer to be in the business of knowing
> how many racks are in the cloud, knowing how much each one is currently
> being used, and picking which rack will contain which members of his
> cluster. For a private cloud, the holistic infrastructure scheduler
> should have the same humility as the lower schedulers: offer enough
> visibility and control to the clients that they can make decisions if they
> want to (thus, nobody needs to "go around" the holistic infrastructure
> scheduler if they already know what they want).
>
> You do not want to ask the holistic infrastructure scheduler to schedule
> resources one by one; you want to ask it to allocate a whole
> pattern/template/topology.
> There is thus no need for infrastructure
> orchestration prior to holistic infrastructure scheduling.
>
> Once the holistic infrastructure scheduler has done its job, there is a
> need for infrastructure orchestration. What should we use for that?
>
> OK, more on the business of conflicting decisions. For the sake of
> scalability and modularity, the holistic infrastructure scheduler should
> delegate as much decision-making as it can to more specific services. The
> job of the holistic infrastructure scheduler is to make joint decisions
> when there are strong interactions between services. You can fudge this
> either way (have the holistic infrastructure scheduler make more or less
> decisions than ideal), but if you want the best then I think the principle
> I stated is what would guide. So what if a delegated decision conflicts
> with a holistic decision? Don't do that. Divide the decision-making
> responsibilities into distinct domains, for example with the holistic
> scheduler making relatively big-picture decisions and individual resource
> services filling in the details.
>
> That said, there can still be nasty surprises from lower layers. Even if
> the design has carefully partitioned decision-making responsibilities,
> irregular things can still happen (e.g., authorized people can do
> something unexpected). Even if nothing intentionally does anything
> irregular, there remains the possibility of bugs. The holistic
> infrastructure scheduler should be prepared for nasty surprises, and
> getting information that is as authoritative as possible to begin with
> (promptness doesn't hurt either).
>
> Then there is the question of the scalability of the holistic
> infrastructure scheduler. One hard kernel of that is solving the
> optimization problem. Nobody should expect the scheduler to find the
> truly optimal solution; this is an NP-hard problem.
> However, there exist
> optimization algorithms that produce pretty good approximations in modest
> amounts of time. Additionally: if the patterns are small relative to the
> size of the whole zone being scheduled then it should be possible to do
> concurrent decision-making with optimistic concurrency control (as Clint
> has mentioned).
>
> You would not want one holistic infrastructure scheduler for a whole
> geographically distributed cloud. You could use a hierarchical
> arrangement, with one top-level decision-maker dividing a pattern between
> availability zones (by which I mean the sort of large independent domains
> that are typically known by that term) and then a subsidiary scheduler for
> each availability zone.
>
> Regards,
> Mike

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
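
As a rough illustration of two points in Mike's message (a quick heuristic
rather than an optimal solver, plus optimistic concurrency control when
patterns are small relative to the zone), here is a minimal Python sketch.
Everything in it is hypothetical: Zone, plan, and schedule are made-up names,
not Nova or Heat APIs, and the in-memory dict only stands in for whatever
authoritative capacity data a real scheduler would consult.

```python
import threading
from collections import Counter


class Zone:
    """Hypothetical in-memory stand-in for one zone's free capacity."""

    def __init__(self, free_slots):
        # free_slots maps (rack, host) -> number of free instance slots.
        self._free = dict(free_slots)
        self._version = 0
        self._lock = threading.Lock()

    def snapshot(self):
        """Return (version, copy of free capacity); no lock is kept afterwards."""
        with self._lock:
            return self._version, dict(self._free)

    def commit(self, seen_version, placement):
        """Apply a placement only if nothing changed since seen_version."""
        with self._lock:
            if seen_version != self._version:
                return False  # another decision landed first; caller must retry
            for key in placement:
                self._free[key] -= 1
            self._version += 1
            return True


def plan(free, members, min_racks=2):
    """Greedy heuristic (an assumption, not an optimal solver): place each
    member on the rack used least so far, preferring the emptiest host."""
    free = dict(free)
    per_rack = Counter()
    placement = []
    for _ in range(members):
        candidates = [key for key, slots in free.items() if slots > 0]
        if not candidates:
            return None  # not enough capacity for the whole pattern
        key = min(candidates, key=lambda k: (per_rack[k[0]], -free[k], k))
        free[key] -= 1
        per_rack[key[0]] += 1
        placement.append(key)
    if len(per_rack) < min(min_racks, members):
        return None  # could not spread across enough racks
    return placement


def schedule(zone, members, max_attempts=5):
    """Plan against a snapshot, then commit optimistically; retry on conflict."""
    for _ in range(max_attempts):
        version, free = zone.snapshot()
        placement = plan(free, members)
        if placement is None:
            return None
        if zone.commit(version, placement):
            return placement
    return None


if __name__ == "__main__":
    zone = Zone({("r1", "h1"): 2, ("r1", "h2"): 2, ("r2", "h3"): 2})
    print(schedule(zone, 4))
```

The whole pattern is planned in one call rather than resource by resource,
and the lock is held only for the short commit step, so several planners can
work from snapshots at the same time. A planner whose snapshot has gone stale
simply re-plans against fresh data.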