Stephen, thanks for summing it all up! I am guessing that a blueprint, or updates to an existing blueprint, will be next. We currently have a patch that introduces a second pin_set to nova.conf and solves problems 1 and 2 on Ocata. But it might be overlooking a couple of cases we do not care about or have not come across yet. Next to the text above, that patch could serve as a basis for discussing what will eventually be implemented.
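
To make that concrete, the patch boils down to roughly the following in nova.conf (the values are only examples, and 'emulator_pin_set' is just the name used in the discussion below; the exact spelling in our patch and in any eventual upstream option may differ):

    [DEFAULT]
    # existing option: host cores nova may hand out to guest vCPUs
    vcpu_pin_set = 2-7
    # added by the patch: host cores shared by the emulator threads
    # of all instances on this host
    emulator_pin_set = 0-1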
I am happy because the two problems were acknowledged, the placement strategy of the threads was discussed/reviewed with some input from KVM, and we already talked about possible solutions. So things are moving ;)

regards,
Henning

On Thu, 29 Jun 2017 17:59:41 +0100, <sfinu...@redhat.com> wrote:

> On Tue, 2017-06-20 at 09:48 +0200, Henning Schild wrote:
> > Hi,
> >
> > We are using OpenStack for managing realtime guests. We modified
> > it and contributed to discussions on how to model the realtime
> > feature. More recent versions of OpenStack have support for
> > realtime, and there are a few proposals on how to improve that
> > further.
> >
> > ...
>
> I'd put off working my way through this thread until I had time to
> sit down and read it in full. Here's what I'm seeing by way of
> summaries _so far_.
>
> # Current situation
>
> I think this tree (sans 'hw' prefixes for brevity) represents the
> current situation around flavor extra specs and image meta. Pretty
> much everything hangs off cpu_policy=dedicated. Correct me if I'm
> wrong.
>
> cpu_policy
> ╞═> shared
> ╘═> dedicated
>     ├─> cpu_thread_policy
>     │   ╞═> prefer
>     │   ╞═> isolate
>     │   ╘═> require
>     ├─> emulator_threads_policy (*)
>     │   ╞═> share
>     │   ╘═> isolate
>     └─> cpu_realtime
>         ╞═> no
>         ╘═> yes
>             └─> cpu_realtime_mask
>                 ╘═> (a mask of guest cores)
>
> (*) this one isn't configurable via images. I never really got why,
> but meh.
>
> There are also some host-level configuration options:
>
> vcpu_pin_set
> ╘═> (a list of host cores that nova can use)
>
> Finally, there's some configuration you can do with your choice of
> kernel and kernel options (e.g. 'isolcpus').
>
> For realtime workloads, the expectation would be that you would set:
>
> cpu_policy
> ╘═> dedicated
>     ├─> cpu_thread_policy
>     │   ╘═> isolate
>     ├─> emulator_threads_policy
>     │   ╘═> isolate
>     └─> cpu_realtime
>         ╘═> yes
>             └─> cpu_realtime_mask
>                 ╘═> (a mask of guest cores)
>
> That would result in an instance that consumes N+1 host cores, where
> N corresponds to the number of guest cores. Of those N guest cores,
> the set masked by 'cpu_realtime_mask' will be non-realtime. The
> remainder will be realtime.
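
To spell that out with the full 'hw:'-prefixed keys, such a flavor would be created roughly like this (flavor name, sizes and the mask value are made up for illustration):

    openstack flavor create rt.small --vcpus 4 --ram 4096 --disk 10
    openstack flavor set rt.small \
      --property hw:cpu_policy=dedicated \
      --property hw:cpu_thread_policy=isolate \
      --property hw:emulator_threads_policy=isolate \
      --property hw:cpu_realtime=yes \
      --property hw:cpu_realtime_mask=^0-1

With 4 vCPUs that gives vCPUs 2-3 as realtime, vCPUs 0-1 as non-realtime housekeeping cores, plus one extra host core for the emulator threads.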
> # The Problem(s)
>
> I'm going to thread this to capture the arguments and
> counter-arguments:
>
> ## Problem 1
>
> henning.schild suggested that the current implementation of
> 'emulator_threads_policy' is too resource intensive, as the one core
> reserved per instance for emulator threads generally carries only a
> minimal workload. This can significantly limit the number of guests
> that can be booted per host, particularly for guests with smaller
> numbers of cores. Instead, he has implemented an 'emulator_pin_set'
> host-level option, which complements 'vcpu_pin_set'. This allows us
> to "pool" emulator threads, similar to how vCPU threads behave with
> 'cpu_policy=shared'. He suggests this be adopted by nova.
>
> sahid seconded this, but suggested 'emulator_pin_set' be renamed
> 'cpu_emulator_threads_mask' and work as a mask of 'vcpu_pin_set'. He
> also suggested adding a similarly-named flavor property that would
> allow the user to use one of their cores for non-realtime work.
>
> henning.schild suggested a set would still be better, but that
> 'vcpu_pin_set' should then be renamed to 'pin_set', as it would no
> longer be for vCPUs only.
>
> cfriesen seconded henning.schild's position but was not entirely
> convinced that sharing emulator threads on a single pCPU is
> guaranteed to be safe, for example if one instance starts seriously
> hammering on I/O or does a live migration. He suggested that an
> additional option, 'rt_emulator_overcommit_ratio', be added to make
> the overcommitting explicit. In addition, he suggested making the
> flavor property a bitmask.
>
> sahid questioned the need for an overcommit ratio, given that there
> is no overcommit of the hosts; an operator could synthesize a
> suitable value for 'emulator_pin_set'/'cpu_emulator_threads_mask'. He
> also disagreed with the suggestion that the flavor property be a
> bitmask, as the only set it could mask is that of the vCPUs.
>
> cfriesen clarified, pointing out that a few instances with many
> vCPUs will have higher emulator-thread overhead than many instances
> with few vCPUs. We need to be able to fail scheduling if the emulator
> thread cores are oversubscribed.
>
> ## Problem 2
>
> henning.schild suggested that hosts should be able to handle both RT
> and non-RT instances. This could be achieved by running multiple
> instances of nova.
>
> sahid pointed out that the recommendation is to use host aggregates
> to separate the two.
>
> henning.schild stated that hosts with RT kernels can manage non-RT
> guests just fine. However, if using host aggregates is the
> recommendation, then it should be possible to run multiple nova
> instances on a host, because dedicating an entire machine is not
> viable for smaller operations. cfriesen seconded this perspective,
> though not this solution.
>
> # Solutions
>
> Thus far, we have no clear conclusions on the direction to go, so
> I've taken a stab below. Henning, Sahid, Chris: does the above/below
> make sense, and is there anything we need to clarify further?
>
> ## Problem 1
>
> From the above, there are 3-4 work items:
>
> - Add an 'emulator_pin_set' or 'cpu_emulator_threads_mask'
>   configuration option
>
> - If using a mask, rename 'vcpu_pin_set' to 'pin_set' (or, better,
>   'usable_cpus')
>
> - Add an 'emulator_overcommit_ratio', which will do for emulator
>   threads what the other ratios do for vCPUs and memory
>
> - Deprecate 'hw:emulator_threads_policy'???
>
> ## Problem 2
>
> No clear conclusions yet?
>
> ---
>
> Cheers,
> Stephen
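
PS: for completeness, the aggregate-based separation sahid refers to under problem 2 would look roughly like this on the API side (aggregate name, host name and the 'realtime' key are only examples, and it assumes the AggregateInstanceExtraSpecsFilter is enabled):

    openstack aggregate create --property realtime=true rt-hosts
    openstack aggregate add host rt-hosts compute-rt-0
    openstack flavor set rt.small \
      --property aggregate_instance_extra_specs:realtime=true

That works, but it is exactly what makes problem 2 a problem for us: every host in that aggregate ends up dedicated to RT, which is not viable for smaller deployments.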