On 24/09/13 05:31, Mike Spreitzer wrote:
I was not trying to raise issues of geographic dispersion and other
higher-level structures; I think the issues I am trying to raise are
relevant even without them.  This is not to deny the importance, or
relevance, of higher levels of structure.  But I would like to first
respond to the discussion that I think is relevant even without them.

I think it is valuable for OpenStack to have a place for holistic
infrastructure scheduling.  I am not the only one to argue for this, but
I will give some use cases.  Consider Hadoop, which stresses the path
between Compute and Block Storage.  In the usual way of deploying and
configuring Hadoop, you want each data node to be using directly
attached storage.  You could address this by scheduling one of those two
services first, and then the second with constraints from the first ---
but the decisions made by the first could paint the second into a
corner.  It is better to be able to schedule both jointly.  Also
consider another approach to Hadoop, in which the block storage is
provided by a bank of storage appliances that is equidistant (in
networking terms) from all the Compute.  In this case the Storage and
Compute scheduling decisions have no strong interaction --- but the
Compute scheduling can interact with the network (you do not want to
place Compute in a way that overloads part of the network).
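
To make the first Hadoop case concrete, here is a toy sketch of how scheduling Compute first can paint Block Storage into a corner, while a joint decision does not. Everything in it is an assumption for illustration only: the `Host` record, the two function names, the "most free vCPUs" heuristic and the capacity numbers are hypothetical and do not correspond to any Nova or Cinder API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    # Hypothetical view of one physical host; not a real OpenStack object.
    name: str
    free_vcpus: int
    free_local_disk_gb: int


def schedule_sequentially(hosts, vcpus, disk_gb):
    """Pick the compute host first, then require local storage on that host."""
    compute_candidates = [h for h in hosts if h.free_vcpus >= vcpus]
    if not compute_candidates:
        return None
    # The compute decision ignores storage entirely (assumed heuristic).
    chosen = max(compute_candidates, key=lambda h: h.free_vcpus)
    # The storage decision is now constrained to the host already chosen.
    if chosen.free_local_disk_gb >= disk_gb:
        return chosen.name
    return None


def schedule_jointly(hosts, vcpus, disk_gb):
    """Consider compute and directly attached storage in a single decision."""
    candidates = [
        h for h in hosts
        if h.free_vcpus >= vcpus and h.free_local_disk_gb >= disk_gb
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda h: h.free_vcpus).name


# Made-up capacities: host-a has more CPU, host-b has more local disk.
HOSTS = [
    Host("host-a", free_vcpus=16, free_local_disk_gb=100),
    Host("host-b", free_vcpus=8, free_local_disk_gb=2000),
]

sequential_result = schedule_sequentially(HOSTS, vcpus=4, disk_gb=500)
joint_result = schedule_jointly(HOSTS, vcpus=4, disk_gb=500)
print(sequential_result, joint_result)
```

With these numbers the sequential path commits to host-a for Compute and then cannot satisfy the local-disk requirement, whereas the joint path places both on host-b.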

Thanks for writing this up, it's very helpful for figuring out what you mean by a 'holistic' scheduler.

I don't yet see how this could be considered in-scope for the Orchestration program, which uses only the public APIs of other services.

To take the first example, wouldn't your holistic scheduler effectively have to reserve a compute instance and some directly attached block storage prior to actually creating them? Have you considered Climate rather than Heat as an integration point?

Once a holistic infrastructure scheduler has made its decisions, there
is then a need for infrastructure orchestration.  The infrastructure
orchestration function is logically downstream from holistic scheduling.

I agree that it's necessarily 'downstream' (in the sense of happening afterwards). I'd hesitate to use the word 'logically', since I think by its very nature a holistic scheduler introduces dependencies between services that were intended to be _logically_ independent.

  I do not favor creating a new and alternate way of doing
infrastructure orchestration in this position.  Rather I think it makes
sense to use essentially today's heat engine.

Today Heat is the only thing that takes a holistic view of
patterns/topologies/templates, and there are various pressures to expand
the mission of Heat.  A marquee expansion is to take on software
orchestration.  I think holistic infrastructure scheduling should be
downstream from the preparatory stage of software orchestration (the
other stage of software orchestration is the run-time action in and
supporting the resources themselves).  There are other pressures to
expand the mission of Heat too.  This leads to conflicting usages for
the word "heat": it can mean the infrastructure orchestration function
that is the main job of today's heat engine, or it can mean the full
expanded mission (whatever you think that should be).  I have been
mainly using "heat" in that latter sense, but I do not really want to
argue over naming of bits and assemblies of functionality.  Call them
whatever you want.  I am more interested in getting a useful arrangement
of functionality.  I have updated my picture at
https://docs.google.com/drawings/d/1Y_yyIpql5_cdC8116XrBHzn6GfP_g0NHTTG_W4o0R9U ---
do you agree that the arrangement of functionality makes sense?

Candidly, no.

As proposed, the software configs contain directives like 'hosted_on: server_name'. (I don't know that I'm a huge fan of this design, but I don't think the exact details are relevant in this context.) There's no non-trivial processing in the preparatory stage of software orchestration that would require it to be performed before scheduling could occur.

Let's make sure we distinguish between doing holistic scheduling, which requires a priori knowledge of the resources to be created, and automatic scheduling, which requires psychic knowledge of the user's mind. (Did the user want to optimise for performance or availability? How would you infer that from the template?) Nothing that happens while preparing the software configurations is either necessary for the former or sufficient for the latter.

cheers,
Zane.
