On 10/23/2014 07:57 PM, Elzur, Uri wrote:
> Today, OpenStack makes placement decisions mainly based on Compute
> demands (the Scheduler is part of Nova). It also uses some info provided
> about the platform's Compute capabilities. But for a given application
> (consisting of VMs, network appliances, storage, etc.),
> Nova/Scheduler has no way to figure out the relative placement of network
> devices (virtual appliances, SFC) and/or storage devices (which are also
> network-borne in many cases) in reference to the Compute elements. This
> makes it harder to provide SLAs and support certain policies (e.g. HA,
> keeping all of these elements within a physical boundary of your choice
> or within a given network physical boundary, or guaranteeing storage
> proximity). It also makes it harder to optimize resource
> utilization, which increases cost and may make OpenStack
> less competitive on TCO.
> Another aspect of the issue is that in order to lower the cost per
> unit of compute (or, better said, per unit of application), it is
> essential to pack tighter. This increases infrastructure utilization but
> also makes interference a more important phenomenon (a.k.a. the noisy
> neighbor). SLA requests, SLA guarantees, and placement based on the
> ability to provide the desired SLA are required.
> We'd like to suggest moving a bit faster on making OpenStack a more
> compelling stack for Compute/Network/Storage, capable of supporting
> Telco/NFV and other usage models, and creating the foundation for
> providing a very low cost platform, more competitive with large cloud
> deployments.
How do you suggest moving faster?
Also, when you say things like "more competitive with large cloud
deployment" you need to tell us what you are comparing OpenStack to, and
what cost factors you are using. Otherwise, it's just a statement with
no context.
> The concern is that any scheduler change will take a long time. Folks
> closer to the Scheduler work have already pointed out that we first need
> to stabilize the API between Nova and the Scheduler before we can talk
> about a split (e.g. Gantt). So it may take until late 2016 (best
> case?) to get this kind of broader application-level functionality into
> the OpenStack scheduler.
I'm not entirely sure where "late in 2016" comes from. Could you elaborate?
> We'd like to bring it up at the coming design summit. Where do you think
> it needs to be discussed: cross-project track? Scheduler discussion? Other?
> I've just added a proposed item 17.1 to
> https://etherpad.openstack.org/p/kilo-crossproject-summit-topics
> 2. “present Application’s Network and Storage requirements, coupled with
> infrastructure capabilities and status (e.g. up/dn
This is the kind of thing that was nixed as an idea last go around with
the "nic-state-aware-scheduler":
https://review.openstack.org/#/c/87978/
You are coupling service state monitoring with placement decisions, and
by doing so, you will limit the scale of the system considerably. We
need improvements to our service state monitoring, for sure, including
the ability to have much more fine-grained definition of what a service
is. But I am 100% against adding the concept of checking service state
*during* placement decisions.
Service state monitoring (it's called the servicegroup API in Nova) can
and should notify the scheduler of important changes to the state of
resource providers, but I'm opposed to making changes to the scheduler
that would essentially make a placement decision and then immediately go
and check a link for UP/DOWN state before "finalizing" the claim of
resources on the resource provider.
> , utilization levels) and placement policy (e.g. proximity, HA)
I understand proximity (affinity/anti-affinity), but what does HA have
to do with placement policy? Could you elaborate a bit more on that?
> to get optimized placement
> decisions accounting for all application elements (VMs, virt Network
> appliances, Storage) vs. Compute only”
Yep. These are all simply inputs to the scheduler's placement decision
engine. We need:
a) A way of providing these inputs to the launch request without
polluting a cloud user's view of the cloud -- remember we do NOT want
users of the Nova API to essentially need to understand the exact layout
of the cloud provider's datacenter. That's definitively anti-cloudy :)
So, we need a way of providing generic inputs to the scheduler that the
scheduler can translate into specific inputs because the scheduler would
know the layout of the datacenter...
b) A simple condition engine that can combine those inputs (requested
proximity to a storage cluster used by applications running in the
instance, for example) with information the scheduler can query for
about the topology of the datacenter's network and storage.
Work on b) involves the following foundational blueprints:
https://review.openstack.org/#/c/127609/
https://review.openstack.org/#/c/127610/
https://review.openstack.org/#/c/127612/
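To give the shape of a) and b) in a few lines, here is a hypothetical sketch (all names and data are illustrative, not Nova APIs): the user supplies only a generic proximity policy, and the scheduler translates it into concrete host filtering using topology that only the cloud provider knows.

```python
# Hypothetical sketch of the "condition engine" idea: generic user
# input, provider-side topology, no datacenter layout exposed in the
# API. All names here are illustrative placeholders.

# Topology known to the scheduler, never exposed to API users.
TOPOLOGY = {
    "host-1": {"rack": "r1"},
    "host-2": {"rack": "r1"},
    "host-3": {"rack": "r2"},
}
STORAGE_CLUSTERS = {"ceph-a": "r1"}  # storage cluster -> rack


def filter_hosts(hosts, policy):
    """Keep hosts satisfying a generic proximity policy."""
    if policy.get("proximity") == "storage":
        # Translate the generic request into a concrete constraint
        # using topology only the scheduler knows about.
        rack = STORAGE_CLUSTERS[policy["storage_cluster"]]
        return [h for h in hosts if TOPOLOGY[h]["rack"] == rack]
    return list(hosts)


print(filter_hosts(["host-1", "host-2", "host-3"],
                   {"proximity": "storage", "storage_cluster": "ceph-a"}))
```

The launch request carries only the generic policy dict; the topology tables stay on the provider side, which is exactly the separation a) asks for.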
Looking forward to a cross-project or Nova session at the summit. Either
would work for me, and I encourage you and others to respond to this ML
thread with thoughts about the scheduler and resource tracker space so I
can summarize the thoughts on an etherpad before the summit.
All the best,
-jay
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev