Re: [openstack-dev] [placement][nova] Decision time on granular request groups for like resources

Jay Pipes Wed, 18 Apr 2018 15:22:17 -0700

On 04/18/2018 04:52 PM, Eric Fried wrote:

I can't tell if you're being facetious, but this seems sane, albeit
complex.  It's also extensible as we come up with new and wacky affinity
semantics we want to support.


I was not being facetious.

I can't say I'm sold on requiring `proximity` qparams that cover every
granular group - that seems like a pretty onerous burden to put on the
user right out of the gate.

I did that because Matt said he wanted no default/implicit behaviour --
everything should be explicit.


That said, the idea of not having a default
is quite appealing.  Perhaps as a first pass we can require a single
?proximity={isolate|any} and build on it to support group numbers (etc.)
in the future.


Here's my problem.

I have a feeling we're just going to go back and forth on this, as we
have for weeks now, and not reach any conclusion that is satisfactory to
everyone. And we'll delay, yet again, getting functionality into this
release that serves 90% of use cases because we are obsessing over the
0.01% of use cases that may pop up later.


Best,
-jay

One other thing inline below, not related to the immediate subject.

On 04/18/2018 12:40 PM, Jay Pipes wrote:

On 04/18/2018 11:58 AM, Matt Riedemann wrote:

On 4/18/2018 9:06 AM, Jay Pipes wrote:

"By default, should resources/traits submitted in different numbered
request groups be supplied by separate resource providers?"


Without knowing all of the hairy use cases, I'm trying to channel my
inner sdague and some of the similar types of discussions we've had to
changes in the compute API, and a lot of the time we've agreed that we
shouldn't assume a default in certain cases.

So for this case, if I'm requesting numbered request groups, why
doesn't the API just require that I pass a query parameter telling it
how I'd like those requests to be handled, either via affinity or
anti-affinity?

So, you're thinking maybe something like this?

1) Get me two dedicated CPUs. One of those dedicated CPUs must have AVX2
capabilities. They must be on different child providers (different NUMA
cells that are providing those dedicated CPUs).

GET /allocation_candidates?

  resources1=PCPU:1&required1=HW_CPU_X86_AVX2
&resources2=PCPU:1
&proximity=isolate:1,2

2) Get me four dedicated CPUs. Two of those dedicated CPUs must have
AVX2 capabilities. Two of the dedicated CPUs must have the SSE 4.2
capability. They may come from the same provider (NUMA cell) or
different providers.

GET /allocation_candidates?

  resources1=PCPU:2&required1=HW_CPU_X86_AVX2
&resources2=PCPU:2&required2=HW_CPU_X86_SSE42
&proximity=any:1,2

3) Get me 2 dedicated CPUs and 2 SR-IOV VFs. The VFs must be provided by
separate physical function providers which have different traits marking
separate physical networks. The dedicated CPUs must come from the same
provider tree in which the physical function providers reside.

GET /allocation_candidates?

  resources1=PCPU:2
&resources2=SRIOV_NET_VF:1&required2=CUSTOM_PHYSNET_A
&resources3=SRIOV_NET_VF:1&required3=CUSTOM_PHYSNET_B
&proximity=isolate:2,3
&proximity=same_tree:1,2,3

3) Get me 2 dedicated CPUs and 2 SR-IOV VFs. The VFs must be provided by
separate physical function providers which have different traits marking
separate physical networks. The dedicated CPUs must come from the same
provider *subtree* in which the second group of VF resources are sourced.

GET /allocation_candidates?

  resources1=PCPU:2
&resources2=SRIOV_NET_VF:1&required2=CUSTOM_PHYSNET_A
&resources3=SRIOV_NET_VF:1&required3=CUSTOM_PHYSNET_B
&proximity=isolate:2,3
&proximity=same_subtree:1,3
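
For anyone who wants to play with the query shapes above, here is a minimal
sketch of assembling them in Python. Note that `proximity` is only the
parameter being proposed in this thread, not something placement accepts
today, and `build_candidates_query` and its argument layout are hypothetical
names made up for illustration.

```python
from urllib.parse import urlencode


def build_candidates_query(groups, proximity):
    """Assemble a GET /allocation_candidates query string.

    groups:    {group_number: (resources_dict, required_traits_list)}
    proximity: [(policy, [group_numbers])], using the *proposed* syntax
               from this thread (hypothetical, not an existing parameter).
    """
    params = []
    for number, (resources, traits) in sorted(groups.items()):
        amounts = ",".join(f"{rc}:{amount}" for rc, amount in resources.items())
        params.append((f"resources{number}", amounts))
        if traits:
            params.append((f"required{number}", ",".join(traits)))
    for policy, members in proximity:
        members_str = ",".join(str(m) for m in members)
        params.append(("proximity", f"{policy}:{members_str}"))
    # Keep ':' and ',' unescaped so the output matches the examples above.
    return "/allocation_candidates?" + urlencode(params, safe=":,")


# The PCPU + two-VF request with same_subtree from the example above.
query = build_candidates_query(
    {
        1: ({"PCPU": 2}, []),
        2: ({"SRIOV_NET_VF": 1}, ["CUSTOM_PHYSNET_A"]),
        3: ({"SRIOV_NET_VF": 1}, ["CUSTOM_PHYSNET_B"]),
    },
    [("isolate", [2, 3]), ("same_subtree", [1, 3])],
)
print(query)
```

Repeating the `proximity` key once per constraint mirrors the examples, where
each constraint is its own query parameter.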


The 'same_subtree' concept requires a way to identify how far up the
common ancestor can be.  Otherwise, *everything* is in the same subtree.
  You could arbitrarily say "one step down from the root", but that's not
very flexible.  Allowing the user to specify a *number* of steps down
from the root is getting closer, but it requires the user to have an
understanding of the provider tree's exact structure, which is not ideal.

The idea I've been toying with here is "common ancestor by trait".  For
example, you would tag your NUMA node providers with trait NUMA_ROOT,
and then your request would include:

   ...
   &proximity=common_ancestor_by_trait:NUMA_ROOT:1,3
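
To make the "common ancestor by trait" idea concrete, here is a rough sketch
over a toy provider tree. Everything in it is an assumption for illustration:
the dict-based tree, the provider names, and both function names are
hypothetical, and this is not how placement stores or walks provider trees.

```python
def ancestor_with_trait(provider, trait, parent, traits):
    """Return the nearest provider at or above `provider` that has `trait`.

    parent: {provider: parent_provider}; the root has no entry.
    traits: {provider: set_of_traits}
    Returns None if no provider on the path to the root has the trait.
    """
    node = provider
    while node is not None:
        if trait in traits.get(node, set()):
            return node
        node = parent.get(node)
    return None


def share_ancestor_by_trait(providers, trait, parent, traits):
    """True if all providers resolve to the same trait-tagged ancestor."""
    anchors = {ancestor_with_trait(p, trait, parent, traits) for p in providers}
    return len(anchors) == 1 and None not in anchors


# Toy tree: compute node -> two NUMA cells -> one physical function each.
parent = {"numa0": "compute", "numa1": "compute", "pf0": "numa0", "pf1": "numa1"}
traits = {"numa0": {"NUMA_ROOT"}, "numa1": {"NUMA_ROOT"}}

# Group 1 (PCPU) served by numa0, group 3 (VF) served by pf0 or pf1.
same_cell = share_ancestor_by_trait(["numa0", "pf0"], "NUMA_ROOT", parent, traits)
cross_cell = share_ancestor_by_trait(["numa0", "pf1"], "NUMA_ROOT", parent, traits)
```

The point of anchoring on a trait is that the caller never has to know how
many levels sit between the root and the NUMA cell.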


4) Get me 4 SR-IOV VFs. 2 VFs should be sourced from a provider that is
decorated with the CUSTOM_PHYSNET_A trait. 2 VFs should be sourced from
a provider that is decorated with the CUSTOM_PHYSNET_B trait. For HA
purposes, none of the VFs should be sourced from the same provider.
However, the VFs for each physical network should be within the same
subtree (NUMA cell) as each other.

GET /allocation_candidates?

  resources1=SRIOV_NET_VF:1&required1=CUSTOM_PHYSNET_A
&resources2=SRIOV_NET_VF:1&required2=CUSTOM_PHYSNET_A
&resources3=SRIOV_NET_VF:1&required3=CUSTOM_PHYSNET_B
&resources4=SRIOV_NET_VF:1&required4=CUSTOM_PHYSNET_B
&proximity=isolate:1,2,3,4
&proximity=same_subtree:1,2
&proximity=same_subtree:3,4
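
As a sanity check on how those three constraints combine, here is a small
sketch that tests one candidate assignment of groups to providers. It is only
an illustration: the provider names and helper functions are hypothetical, and
"same subtree" is simplified to "same immediate parent" (the NUMA cell), which
is an assumption and not a settled definition.

```python
def isolated(assignment, groups):
    """True if every listed group is served by a different provider."""
    chosen = [assignment[g] for g in groups]
    return len(set(chosen)) == len(chosen)


def same_parent(assignment, groups, parent):
    """Toy stand-in for same_subtree: providers share one immediate parent."""
    return len({parent[assignment[g]] for g in groups}) == 1


# Toy layout: two physical functions under each NUMA cell.
parent = {"pf_a0": "numa0", "pf_a1": "numa0", "pf_b0": "numa1", "pf_b1": "numa1"}

# Candidate: which provider serves each numbered request group.
candidate = {1: "pf_a0", 2: "pf_a1", 3: "pf_b0", 4: "pf_b1"}

ok = (
    isolated(candidate, [1, 2, 3, 4])
    and same_parent(candidate, [1, 2], parent)
    and same_parent(candidate, [3, 4], parent)
)

# Reusing pf_a0 for groups 1 and 2 violates isolate:1,2,3,4.
reused = {1: "pf_a0", 2: "pf_a0", 3: "pf_b0", 4: "pf_b1"}
reused_ok = isolated(reused, [1, 2, 3, 4])
```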

We can go even deeper if you'd like, since NFV means "never-ending
feature velocity". Just let me know.

-jay

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

