On Tue, Oct 9, 2018 at 4:39 PM, Eric Fried <openst...@fried.cc> wrote:
> IIUC, the primary thing the force flag was intended to do - allow an
> instance to land on the requested destination even if that means
> oversubscription of the host's resources - doesn't happen anymore
> since we started making the destination claim in placement.

Can we still do that simply by not creating an allocation in placement
during the move? (see option D)

> IOW, since pike, you don't actually see a difference in behavior by
> using the force flag or not. (If you do, it's more likely a bug than
> what you were expecting.)

There is still a difference between force=True and force=False today.
With force=False, nova calls placement a_c and placement tries to
satisfy the requested resources, required traits, and aggregate
membership. With force=True, the nova conductor takes the resource
allocation from the source host and copies it blindly to the
destination without checking any traits or aggregate membership. So
force=True still ignores a lot of rules and safeties.

> So there's no reason to keep it around. We can remove it in a new
> microversion (or not); but even in the current microversion we need
> not continue making convoluted attempts to observe it.

If we remove it in a new microversion (option C), we still need to
define how to behave in the old microversions when a nested allocation
would be needed. I don't fully get what you mean by 'not continue
making convoluted attempts to observe it.'

> What that means is that we should simplify everything down to ignore
> the force flag and always call GET /a_c. Problem solved - for nested
> and/or sharing, NUMA or not, root resources or no, on the source
> and/or destination.

If you remove the force flag in a new microversion, that also means (at
least to me) that you should not change the behavior of the force flag
in the old microversions.

Cheers,
gibi

> -efried
>
> On 10/09/2018 04:40 AM, Balázs Gibizer wrote:
>> Hi,
>>
>> Setup
>> -----
>>
>> nested allocation: an allocation that contains resources from one or
>> more nested RPs. (If you have a better term for this then please
>> suggest it.)
>>
>> If an instance has a nested allocation, it means that the compute it
>> allocates from has a nested RP tree.
>> BUT if a compute has a nested RP tree, it does not automatically
>> mean that the instance allocating from that compute has a nested
>> allocation (e.g. bandwidth inventory will be on nested RPs but not
>> every instance will require bandwidth).
>>
>> Afaiu, as soon as we have NUMA modelling in place, the most trivial
>> servers will have nested allocations, as CPU and MEMORY inventory
>> will be moved to the nested NUMA RPs. But NUMA is still in the
>> future.
>>
>> Sidenote: there is an edge case reported by bauzas when an instance
>> allocates _only_ from nested RPs. This was discussed last Friday and
>> it resulted in a new patch[0], but I would like to keep that
>> discussion separate from this if possible.
>>
>> Sidenote: the current problem is related not just to nested RPs but
>> to sharing RPs as well. However, I'm not aiming to implement sharing
>> support in Nova right now, so I will also try to keep the sharing
>> discussion separate if possible.
>>
>> There was already some discussion at Monday's scheduler meeting, but
>> I could not attend.
>>
>> http://eavesdrop.openstack.org/meetings/nova_scheduler/2018/nova_scheduler.2018-10-08-14.00.log.html#l-20
>>
>>
>> The meat
>> --------
>>
>> Both live-migrate[1] and evacuate[2] have an optional force flag on
>> the nova REST API. The documentation says: "Force <the action> by not
>> verifying the provided destination host by the scheduler."
>>
>> Nova implements this statement by not calling the scheduler if
>> force=True, BUT it still tries to manage allocations in placement.
>>
>> To have an allocation on the destination host, Nova blindly copies
>> the instance allocation from the source host to the destination host
>> during these operations. Nova can do that because 1) the whole
>> allocation is against a single RP (the compute RP) and 2) Nova knows
>> both the source compute RP and the destination compute RP.
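
The blind copy described above can be sketched roughly as follows. This is a minimal illustration, not Nova's actual code: the function name blind_copy_allocation and the RP identifiers are hypothetical, and the dict shape (RP UUID mapped to a "resources" dict) is only assumed to resemble a per-consumer allocation in placement.

```python
from copy import deepcopy


def blind_copy_allocation(source_allocations, source_rp_uuid, dest_rp_uuid):
    """Re-key a single-RP allocation from the source to the destination RP.

    Hypothetical helper: it only works under the two conditions named in
    the mail, i.e. the whole allocation is against the source compute RP
    and both compute RPs are known.
    """
    if set(source_allocations) != {source_rp_uuid}:
        raise ValueError(
            "blind copy needs the whole allocation on the source compute RP")
    # No traits, aggregates or capacity are checked here; that is the
    # "blind" part.
    return {dest_rp_uuid: deepcopy(source_allocations[source_rp_uuid])}


# Illustrative values only.
source = {
    "src-compute-rp": {
        "resources": {"VCPU": 2, "MEMORY_MB": 4096, "DISK_GB": 20},
    },
}
dest = blind_copy_allocation(source, "src-compute-rp", "dst-compute-rp")
```

The guard at the top is where the sketch stops being usable once an allocation spans more than the compute RP.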
>>
>> However, as soon as we bring nested allocations into the picture,
>> that blind copy will not be feasible. Possible cases:
>> 0) The instance has a non-nested allocation on the source and would
>> need a non-nested allocation on the destination. This works with the
>> blind copy today.
>> 1) The instance has a nested allocation on the source and would need
>> a nested allocation on the destination as well.
>> 2) The instance has a non-nested allocation on the source and would
>> need a nested allocation on the destination.
>> 3) The instance has a nested allocation on the source and would need
>> a non-nested allocation on the destination.
>>
>> Nova cannot generate nested allocations easily without
>> reimplementing some of the placement allocation candidate (a_c)
>> code. However, I don't like the idea of duplicating some of the a_c
>> code in Nova.
>>
>> Nova cannot detect what kind of allocation (nested or non-nested) an
>> instance would need on the destination without calling placement
>> a_c. So knowing when to call placement is a chicken-and-egg problem.
>>
>> Possible solutions:
>> A) fail fast
>> ------------
>> 0) Nova can detect that the source allocation is non-nested and try
>> the blind copy, and it will succeed.
>> 1) Nova can detect that the source allocation is nested and fail the
>> operation.
>> 2) Nova only sees a non-nested source allocation. Even if the dest
>> RP tree is nested, it does not mean that the allocation will be
>> nested. We cannot fail fast. Nova can try the blind copy and
>> allocate every resource from the root RP of the destination. If the
>> instance requires a nested allocation instead, the claim will fail
>> in placement. So nova can fail the operation a bit later than in 1).
>> 3) Nova can detect that the source allocation is nested and fail the
>> operation. However, an enhanced blind copy that tries to allocate
>> everything from the root RP on the destination would have worked.
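
As a rough sketch of option A, the decision can be made from the source allocation alone. The names below (NestedAllocationError, source_allocation_is_nested, fail_fast_forced_move) are hypothetical and not part of Nova; the allocation dict shape is an assumption. Cases 2) and 3) cannot be told apart from 0) and 1) at this point, so the sketch only covers what Nova could see before the claim in placement.

```python
class NestedAllocationError(Exception):
    """Hypothetical error for a forced move that cannot be blind-copied."""


def source_allocation_is_nested(allocations, source_compute_rp):
    # Any RP other than the compute (root) RP means the allocation is nested.
    return any(rp_uuid != source_compute_rp for rp_uuid in allocations)


def fail_fast_forced_move(allocations, source_compute_rp, dest_compute_rp):
    """Return the destination allocation, or fail fast as in option A."""
    if source_allocation_is_nested(allocations, source_compute_rp):
        # Cases 1) and 3): fail before touching placement.
        raise NestedAllocationError("source allocation is nested")
    # Cases 0) and 2): try the blind copy against the destination root RP.
    # In case 2) the later claim in placement would be the one to fail.
    resources = dict(allocations[source_compute_rp]["resources"])
    return {dest_compute_rp: {"resources": resources}}


# Illustrative values only.
flat = {"src-compute-rp": {"resources": {"VCPU": 1, "MEMORY_MB": 512}}}
planned = fail_fast_forced_move(flat, "src-compute-rp", "dst-compute-rp")
```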
>>
>> B) Guess when to ignore the force flag and call the scheduler
>> -------------------------------------------------------------
>> 0) Keep the blind copy as it works.
>> 1) Nova detects that the source allocation is nested. It ignores the
>> force flag and calls the scheduler, which will call placement a_c.
>> The move operation can succeed.
>> 2) Nova only sees a non-nested source allocation, so it will fall
>> back to the blind copy and fail at the claim on the destination.
>> 3) Nova detects that the source allocation is nested. It ignores the
>> force flag and calls the scheduler, which will call placement a_c.
>> The move operation can succeed.
>>
>> This solution would be against the API doc, which states that nova
>> does not call the scheduler if the operation is forced. However, in
>> the case of a forced live-migration, Nova already verifies the
>> target host from a couple of perspectives in [3].
>> This solution is already proposed for live-migrate in [4] and for
>> evacuate in [5], so the complexity of the solution can be seen in
>> the reviews.
>>
>> C) Remove the force flag from the API in a new microversion
>> -----------------------------------------------------------
>> 0)-3): all cases would call the scheduler to verify the target host
>> and generate the nested (or non-nested) allocation.
>> We would still need an agreed behavior (from A), B), D)) for the old
>> microversions, as today's code creates inconsistent allocations in
>> cases 1) and 3) by ignoring the resources from the nested RP.
>>
>> D) Do not manage allocations in placement for forced operation
>> --------------------------------------------------------------
>> The force flag is considered a last-resort tool for the admin to
>> move VMs around. The API doc has a fat warning about the danger of
>> it. So Nova can simply skip the resource allocation task if
>> force=True. Nova would delete the source allocation and not create
>> any allocation on the destination host.
>>
>> This is a simple but dangerous solution, but it is what the force
>> flag is all about: moving the server against all the built-in
>> safeties. (If the admin needs the safeties she can set force=False
>> and still specify the destination host.)
>>
>> I'm open to any suggestions.
>>
>> Cheers,
>> gibi
>>
>> [0] https://review.openstack.org/#/c/608298/
>> [1] https://developer.openstack.org/api-ref/compute/#live-migrate-server-os-migratelive-action
>> [2] https://developer.openstack.org/api-ref/compute/#evacuate-server-evacuate-action
>> [3] https://github.com/openstack/nova/blob/c5a7002bd571379818c0108296041d12bc171728/nova/conductor/tasks/live_migrate.py#L97
>> [4] https://review.openstack.org/#/c/605785
>> [5] https://review.openstack.org/#/c/606111
>>
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe:
>> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev