On Tue, 2017-06-20 at 09:48 +0200, Henning Schild wrote:
> Hi,
>
> We are using OpenStack for managing realtime guests. We modified
> it and contributed to discussions on how to model the realtime
> feature. More recent versions of OpenStack have support for realtime,
> and there are a few proposals on how to improve that further.
>
> ...
I'd put off working my way through this thread until I had time to sit
down and read it in full. Here's what I'm seeing by way of summaries
_so far_.

# Current situation

I think this tree (sans 'hw:' prefixes for brevity) represents the
current situation around flavor extra specs and image meta. Pretty much
everything hangs off cpu_policy=dedicated. Correct me if I'm wrong.

    cpu_policy
     ╞═> shared
     ╘═> dedicated
          ├─> cpu_thread_policy
          │    ╞═> prefer
          │    ╞═> isolate
          │    ╘═> require
          ├─> emulator_threads_policy (*)
          │    ╞═> share
          │    ╘═> isolate
          └─> cpu_realtime
               ╞═> no
               ╘═> yes
                    └─> cpu_realtime_mask
                         ╘═> (a mask of guest cores)

    (*) this one isn't configurable via images. I never really got why,
        but meh.

There are also some host-level configuration options:

    vcpu_pin_set
     ╘═> (a list of host cores that nova can use)

Finally, there's some configuration you can do with your choice of
kernel and kernel options (e.g. 'isolcpus').

For realtime workloads, the expectation would be that you would set:

    cpu_policy
     ╘═> dedicated
          ├─> cpu_thread_policy
          │    ╘═> isolate
          ├─> emulator_threads_policy
          │    ╘═> isolate
          └─> cpu_realtime
               ╘═> yes
                    └─> cpu_realtime_mask
                         ╘═> (a mask of guest cores)

That would result in an instance consuming N+1 host cores, where N
corresponds to the number of guest vCPUs. Of the N guest cores, the set
masked by 'cpu_realtime_mask' will be non-realtime; the remainder will
be realtime.
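For concreteness, that realtime setup looks roughly like the following.
This is only an illustrative sketch: the flavor name, core numbers and
mask are made up, but the extra specs and options are the real ones
from the trees above.

    # flavor extra specs for a realtime guest (illustrative values)
    $ openstack flavor set rt.small \
        --property hw:cpu_policy=dedicated \
        --property hw:cpu_thread_policy=isolate \
        --property hw:emulator_threads_policy=isolate \
        --property hw:cpu_realtime=yes \
        --property hw:cpu_realtime_mask=^0   # keep vCPU 0 non-realtime

    # host-level nova.conf on the compute node (illustrative core list)
    [DEFAULT]
    vcpu_pin_set = 2-7

    # plus matching kernel isolation, e.g. isolcpus=2-7 on the kernel
    # command line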
# The Problem(s)

I'm going to thread this to capture the arguments and counter
arguments:

## Problem 1

henning.schild suggested that the current implementation of
'emulator_threads_policy' is too resource intensive: dedicating a full
core to each guest's emulator threads is wasteful, as those threads
generally carry a minimal workload. This can significantly limit the
number of guests that can be booted per host, particularly for guests
with small numbers of cores. Instead, he has implemented an
'emulator_pin_set' host-level option, which complements 'vcpu_pin_set'.
This allows us to "pool" emulator threads, similar to how vCPU threads
behave with 'cpu_policy=shared'. He suggested this be adopted by nova.

sahid seconded this, but suggested 'emulator_pin_set' be renamed
'cpu_emulator_threads_mask' and work as a mask of 'vcpu_pin_set'. He
also suggested adding a similarly-named flavor property, which would
allow the user to use one of their cores for non-realtime work.

henning.schild suggested a set would still be better, but that
'vcpu_pin_set' be renamed to 'pin_set', as it would no longer be for
vCPUs only.

cfriesen seconded henning.schild's position but was not entirely
convinced that sharing emulator threads on a single pCPU is guaranteed
to be safe, for example if one instance starts seriously hammering on
I/O or does a live migration or something. He suggested that an
additional option, 'rt_emulator_overcommit_ratio', be added to make the
overcommitting explicit. In addition, he suggested making the flavor
property a bitmask.

sahid questioned the need for an overcommit ratio, given that these
hosts are not overcommitted; an operator could synthesize a suitable
value for 'emulator_pin_set'/'cpu_emulator_threads_mask' themselves. He
also disagreed with the suggestion that the flavor property be a
bitmask, as the only set it could mask is the instance's own vCPUs.

cfriesen clarified, pointing out that a few instances with many vCPUs
will have greater overhead requirements than many instances with few
vCPUs, and that we need to be able to fail scheduling if the
emulator-thread cores are oversubscribed.

## Problem 2

henning.schild suggested that a host should be able to handle both RT
and non-RT instances. This could be achieved by running multiple
instances of nova on the host.

sahid pointed out that the recommendation is to use host aggregates to
separate the two.

henning.schild stated that hosts with RT kernels can manage non-RT
guests just fine. However, if using host aggregates is the
recommendation, then it should be possible to run multiple nova
instances on a host, because dedicating an entire machine to RT guests
is not viable for smaller operations.

cfriesen seconded this perspective, though not this solution.

# Solutions

Thus far we've no clear conclusions on the direction to go, so I've
taken a stab below. Henning, Sahid, Chris: does the above/below make
sense, and is there anything we need to clarify further?

## Problem 1

From the above, there are 3-4 work items:

- Add an 'emulator_pin_set' or 'cpu_emulator_threads_mask'
  configuration option (a rough sketch of both is in the P.S. below)
- If using a mask, rename 'vcpu_pin_set' to 'pin_set' (or, better,
  'usable_cpus')
- Add an 'emulator_overcommit_ratio' option, which will do for emulator
  threads what the other ratios do for vCPUs and memory
- Deprecate 'hw:emulator_threads_policy'???

## Problem 2

No clear conclusions yet?

---

Cheers,
Stephen
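P.S. For clarity, here is a rough sketch of what the two host-level
proposals from Problem 1 might look like. Neither option exists in nova
today; the option names are just the ones floated in this thread
('emulator_pin_set' as a set, 'cpu_emulator_threads_mask' as a mask of
'vcpu_pin_set') and the core numbers are made up.

    # option (a), per henning.schild: a separate pool of host cores
    # reserved for all guests' emulator threads
    [DEFAULT]
    vcpu_pin_set = 2-7
    emulator_pin_set = 0-1

    # option (b), per sahid: the same idea expressed as a mask applied
    # to vcpu_pin_set
    [DEFAULT]
    vcpu_pin_set = 0-7
    cpu_emulator_threads_mask = 0-1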