On Tue, Oct 9, 2018 at 5:32 PM, Sylvain Bauza <sylvain.ba...@gmail.com> wrote:
>
> On Tue, Oct 9, 2018 at 5:09 PM, Balázs Gibizer
> <balazs.gibi...@ericsson.com> wrote:
>>
>> On Tue, Oct 9, 2018 at 4:56 PM, Sylvain Bauza
>> <sylvain.ba...@gmail.com> wrote:
>> >
>> > On Tue, Oct 9, 2018 at 4:39 PM, Eric Fried <openst...@fried.cc>
>> > wrote:
>> >> IIUC, the primary thing the force flag was intended to do - allow
>> >> an instance to land on the requested destination even if that
>> >> means oversubscription of the host's resources - doesn't happen
>> >> anymore since we started making the destination claim in
>> >> placement.
>> >>
>> >> IOW, since Pike, you don't actually see a difference in behavior
>> >> by using the force flag or not. (If you do, it's more likely a bug
>> >> than what you were expecting.)
>> >>
>> >> So there's no reason to keep it around. We can remove it in a new
>> >> microversion (or not); but even in the current microversion we
>> >> need not continue making convoluted attempts to observe it.
>> >>
>> >> What that means is that we should simplify everything down to
>> >> ignoring the force flag and always calling GET /a_c. Problem
>> >> solved - for nested and/or sharing, NUMA or not, root resources or
>> >> no, on the source and/or destination.
>> >>
>> >
>> > While I tend to agree with Eric here (and I commented on the review
>> > accordingly, saying we should signal the new behaviour with a
>> > microversion), I still think we need to properly advertise this,
>> > adding openstack-operators@ accordingly.
>>
>> Question for you as well: if we remove (or change) the force flag in
>> a new microversion, then how should the old microversions behave when
>> nested allocations would be required?
>>
>
> In that case (i.e. old microversions with either "force=None and
> target" or "force=True"), we should IMHO not allocate any migration.
> Thoughts?
Do you mean on old microversions implement option #D)?

Cheers,
gibi

>
>> Cheers,
>> gibi
>>
>> > Disclaimer: since we have gaps in OSC, the current OSC behaviour
>> > when you "openstack server live-migrate <target>" is to *force* the
>> > destination by not calling the scheduler. Yeah, it sucks.
>> >
>> > Operators, what are the exact cases (for those running clouds newer
>> > than Mitaka, i.e. Newton and above) in which you make use of the
>> > --force option for live migration with a microversion newer than or
>> > equal to 2.29? In general, even in an emergency, you still want to
>> > make sure you don't throw your compute under the bus by massively
>> > migrating instances, which could create an undetected snowball
>> > effect by having this compute refuse new instances. Or do you
>> > disable the target compute service first and throw your pet
>> > instances up there?
>> >
>> > -Sylvain
>> >
>> >
>> >> -efried
>> >>
>> >> On 10/09/2018 04:40 AM, Balázs Gibizer wrote:
>> >> > Hi,
>> >> >
>> >> > Setup
>> >> > -----
>> >> >
>> >> > nested allocation: an allocation that contains resources from
>> >> > one or more nested RPs. (If you have a better term for this then
>> >> > please suggest it.)
>> >> >
>> >> > If an instance has a nested allocation it means that the
>> >> > compute it allocates from has a nested RP tree. BUT if a compute
>> >> > has a nested RP tree it does not automatically mean that an
>> >> > instance allocating from that compute has a nested allocation
>> >> > (e.g. bandwidth inventory will be on nested RPs but not every
>> >> > instance will require bandwidth).
>> >> >
>> >> > Afaiu, as soon as we have NUMA modelling in place the most
>> >> > trivial servers will have nested allocations, as CPU and MEMORY
>> >> > inventory will be moved to the nested NUMA RPs. But NUMA is
>> >> > still in the future.
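As a concrete illustration of the definition above, here is a minimal, hypothetical check for whether an instance's allocation is nested, i.e. whether it touches any resource provider other than the compute root RP. The payload shape mimics placement's GET /allocations/{consumer_uuid} response; the helper name and UUIDs are made up for the sketch.

```python
# Hypothetical helper: an allocation is "nested" if any part of it is
# against a resource provider other than the compute root RP.

def allocation_is_nested(allocations, compute_root_rp_uuid):
    """Return True if any part of the allocation is on a non-root RP."""
    return any(rp_uuid != compute_root_rp_uuid for rp_uuid in allocations)

# Bandwidth on a nested RP makes the allocation nested even though CPU
# and memory still come from the root compute RP.
alloc = {
    "11111111-1111-1111-1111-111111111111": {   # compute root RP
        "resources": {"VCPU": 2, "MEMORY_MB": 2048}},
    "22222222-2222-2222-2222-222222222222": {   # nested bandwidth RP
        "resources": {"NET_BW_EGR_KILOBIT_PER_SEC": 1000}},
}
```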
>> >> >
>> >> > Sidenote: there is an edge case reported by bauzas when an
>> >> > instance allocates _only_ from nested RPs. This was discussed
>> >> > last Friday and resulted in a new patch [0], but I would like to
>> >> > keep that discussion separate from this one if possible.
>> >> >
>> >> > Sidenote: the current problem is somewhat related not just to
>> >> > nested RPs but to sharing RPs as well. However, I'm not aiming
>> >> > to implement sharing support in Nova right now, so I also try to
>> >> > keep the sharing discussion separate if possible.
>> >> >
>> >> > There was already some discussion in Monday's scheduler meeting
>> >> > but I could not attend.
>> >> > http://eavesdrop.openstack.org/meetings/nova_scheduler/2018/nova_scheduler.2018-10-08-14.00.log.html#l-20
>> >> >
>> >> > The meat
>> >> > --------
>> >> >
>> >> > Both live-migrate [1] and evacuate [2] have an optional force
>> >> > flag in the Nova REST API. The documentation says: "Force <the
>> >> > action> by not verifying the provided destination host by the
>> >> > scheduler."
>> >> >
>> >> > Nova implements this statement by not calling the scheduler if
>> >> > force=True BUT still trying to manage allocations in placement.
>> >> >
>> >> > To have an allocation on the destination host, Nova blindly
>> >> > copies the instance allocation from the source host to the
>> >> > destination host during these operations. Nova can do that
>> >> > because 1) the whole allocation is against a single RP (the
>> >> > compute RP) and 2) Nova knows both the source compute RP and the
>> >> > destination compute RP.
>> >> >
>> >> > However, as soon as we bring nested allocations into the
>> >> > picture, that blind copy will not be feasible. Possible cases:
>> >> > 0) The instance has a non-nested allocation on the source and
>> >> > would need a non-nested allocation on the destination. This
>> >> > works with blind copy today.
>> >> > 1) The instance has a nested allocation on the source and would
>> >> > need a nested allocation on the destination as well.
>> >> > 2) The instance has a non-nested allocation on the source and
>> >> > would need a nested allocation on the destination.
>> >> > 3) The instance has a nested allocation on the source and would
>> >> > need a non-nested allocation on the destination.
>> >> >
>> >> > Nova cannot generate nested allocations easily without
>> >> > reimplementing some of the placement allocation candidate (a_c)
>> >> > code. However, I don't like the idea of duplicating some of the
>> >> > a_c code in Nova.
>> >> >
>> >> > Nova cannot detect what kind of allocation (nested or
>> >> > non-nested) an instance would need on the destination without
>> >> > calling placement a_c. So knowing when to call placement is a
>> >> > chicken-and-egg problem.
>> >> >
>> >> > Possible solutions:
>> >> >
>> >> > A) fail fast
>> >> > ------------
>> >> > 0) Nova can detect that the source allocation is non-nested, try
>> >> > the blind copy, and it will succeed.
>> >> > 1) Nova can detect that the source allocation is nested and fail
>> >> > the operation.
>> >> > 2) Nova only sees a non-nested source allocation. Even if the
>> >> > dest RP tree is nested, it does not mean that the allocation
>> >> > will be nested. We cannot fail fast. Nova can try the blind copy
>> >> > and allocate every resource from the root RP of the destination.
>> >> > If the instance requires a nested allocation instead, the claim
>> >> > will fail in placement. So Nova can fail the operation a bit
>> >> > later than in 1).
>> >> > 3) Nova can detect that the source allocation is nested and fail
>> >> > the operation. However, an enhanced blind copy that tries to
>> >> > allocate everything from the root RP on the destination would
>> >> > have worked.
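The "blind copy" discussed above can be sketched roughly as follows; the helper name is hypothetical and the structures are simplified, but it shows why the copy only makes sense when the whole allocation lives on a single root RP, which is exactly where the nested cases 1)-3) break it.

```python
# Rough sketch of the "blind copy": move every resource class allocated
# against the source compute RP onto the destination compute RP.

def blind_copy(allocations, source_rp_uuid, dest_rp_uuid):
    if set(allocations) != {source_rp_uuid}:
        # Nested (or sharing) allocation: Nova cannot know how to map
        # the extra RPs onto the destination tree without placement a_c.
        raise ValueError("cannot blindly copy a nested allocation")
    resources = dict(allocations[source_rp_uuid]["resources"])
    return {dest_rp_uuid: {"resources": resources}}

# Case 0): a single-RP allocation copies cleanly.
src_alloc = {"src-compute-rp": {"resources": {"VCPU": 4, "MEMORY_MB": 4096}}}
dest_alloc = blind_copy(src_alloc, "src-compute-rp", "dest-compute-rp")
```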
>> >> >
>> >> > B) Guess when to ignore the force flag and call the scheduler
>> >> > -------------------------------------------------------------
>> >> > 0) Keep the blind copy, as it works.
>> >> > 1) Nova detects that the source allocation is nested, ignores
>> >> > the force flag and calls the scheduler, which will call
>> >> > placement a_c. The move operation can succeed.
>> >> > 2) Nova only sees a non-nested source allocation, so it will
>> >> > fall back to the blind copy and fail at the claim on the
>> >> > destination.
>> >> > 3) Nova detects that the source allocation is nested, ignores
>> >> > the force flag and calls the scheduler, which will call
>> >> > placement a_c. The move operation can succeed.
>> >> >
>> >> > This solution would go against the API doc, which states that
>> >> > Nova does not call the scheduler if the operation is forced.
>> >> > However, in the case of forced live-migration Nova already
>> >> > verifies the target host from a couple of perspectives in [3].
>> >> > This solution is already proposed for live-migrate in [4] and
>> >> > for evacuate in [5], so the complexity of the solution can be
>> >> > seen in the reviews.
>> >> >
>> >> > C) Remove the force flag from the API in a new microversion
>> >> > -----------------------------------------------------------
>> >> > 0)-3): all cases would call the scheduler to verify the target
>> >> > host and generate the nested (or non-nested) allocation.
>> >> > We would still need an agreed behavior (from A), B), D)) for the
>> >> > old microversions, as today's code creates inconsistent
>> >> > allocations in #1) and #3) by ignoring the resources from the
>> >> > nested RPs.
>> >> >
>> >> > D) Do not manage allocations in placement for forced operations
>> >> > ---------------------------------------------------------------
>> >> > The force flag is considered a last-resort tool for the admin
>> >> > to move VMs around. The API doc has a fat warning about the
>> >> > danger of it.
>> >> > So Nova can simply skip the resource allocation task if
>> >> > force=True: Nova would delete the source allocation and not
>> >> > create any allocation on the destination host.
>> >> >
>> >> > This is a simple but dangerous solution, but that is what the
>> >> > force flag is all about: move the server against all the
>> >> > built-in safeties. (If the admin needs the safeties she can set
>> >> > force=False and still specify the destination host.)
>> >> >
>> >> > I'm open to any suggestions.
>> >> >
>> >> > Cheers,
>> >> > gibi
>> >> >
>> >> > [0] https://review.openstack.org/#/c/608298/
>> >> > [1] https://developer.openstack.org/api-ref/compute/#live-migrate-server-os-migratelive-action
>> >> > [2] https://developer.openstack.org/api-ref/compute/#evacuate-server-evacuate-action
>> >> > [3] https://github.com/openstack/nova/blob/c5a7002bd571379818c0108296041d12bc171728/nova/conductor/tasks/live_migrate.py#L97
>> >> > [4] https://review.openstack.org/#/c/605785
>> >> > [5] https://review.openstack.org/#/c/606111
>> >> >
>> >> > __________________________________________________________________________
>> >> > OpenStack Development Mailing List (not for usage questions)
>> >> > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> >> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
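Option D) above could be sketched roughly like this. The FakePlacementClient is a stand-in for illustration only (real Nova would go through its scheduler report client), and the method names are made up; the point is simply what would and would not happen to the allocations.

```python
# Hedged sketch of option D): on a forced move, delete the source
# allocation and intentionally create nothing on the destination.

class FakePlacementClient:
    def __init__(self, allocations):
        # consumer_uuid -> allocation dict, mimicking placement state
        self.allocations = allocations

    def delete_allocation_for_instance(self, consumer_uuid):
        self.allocations.pop(consumer_uuid, None)

def forced_move(client, consumer_uuid):
    # force=True: skip the scheduler and skip any destination claim.
    client.delete_allocation_for_instance(consumer_uuid)
    # No allocation is created on the destination host, so the server
    # now consumes resources that placement does not account for.
```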