On 11/03/2012 03:45 AM, Jan Kiszka wrote:
> On 2012-11-03 05:43, Satoru Moriya wrote:
>> We have some plans to migrate old enterprise/control systems which
>> require low latency (msec order) to a KVM virtualized environment.
>> In order to satisfy the requirements, this patch adds a realtime
>> option to qemu:
>>
>>   -realtime maxprio=<prio>,policy=<pol>
>>
>> This option changes the scheduling policy and priority to the
>> realtime one specified by the arguments (vcpu threads only) and
>> mlocks all qemu and guest memory.
>
> This patch breaks the win32 build. All the POSIX stuff has to be
> pushed into os-posix.c, e.g. I'm introducing some os_prioritize()
> function for that purpose, empty on win32.
>
> Then another question is how to get the parameters around. I played
> with many options, ending up so far with
>
>   /* called by os_prioritize */
>   void qemu_init_realtime(int rt_sched_policy, int max_sched_priority);
>   /* called by threaded subsystems */
>   bool qemu_realtime_is_enabled(void);
>   void qemu_realtime_get_parameters(int *policy, int *max_priority);
>
> all hosted by qemu-thread-*.c (empty/aborting on win32). This allows
> subsystems to be adjusted for realtime without pushing all the
> parameters into global variables.

Thanks. I'll re-implement the patch based on your comment.
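For the qemu-thread-posix.c side, I imagine something along these
lines (just an untested sketch of the interface you describe; the
static variables and the init-once flag are my own guesses, not part
of your proposal):

  #include <assert.h>
  #include <stdbool.h>

  /* qemu-thread-posix.c (sketch) */
  static bool realtime_enabled;
  static int realtime_policy;
  static int realtime_max_priority;

  /* called by os_prioritize */
  void qemu_init_realtime(int rt_sched_policy, int max_sched_priority)
  {
      realtime_policy = rt_sched_policy;
      realtime_max_priority = max_sched_priority;
      realtime_enabled = true;
  }

  /* called by threaded subsystems */
  bool qemu_realtime_is_enabled(void)
  {
      return realtime_enabled;
  }

  void qemu_realtime_get_parameters(int *policy, int *max_priority)
  {
      assert(realtime_enabled);
      *policy = realtime_policy;
      *max_priority = realtime_max_priority;
  }

The win32 variant in qemu-thread-win32.c would then keep
qemu_realtime_is_enabled() returning false and abort in the other two,
if I understand your plan correctly.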
>> Benchmark: cyclictest
>>   https://rt.wiki.kernel.org/index.php/Cyclictest
>>
>> Command:
>>   $ cyclictest -p 99 -n -m -q -l 100000
>>
>> Results:
>> - no load (1: normal qemu, 2: realtime qemu)
>> 1. T: 0 ( 544) P:99 I:1000 C:100000 Min: 11 Act: 32 Avg: 157 Max: 10029
>> 2. T: 0 ( 449) P:99 I:1000 C:100000 Min: 16 Act: 30 Avg:  29 Max:   540
>>
>> - load (heavy network traffic) (3: normal qemu, 4: realtime qemu)
>> 3. T: 0 (3455) P:99 I:1000 C:100000 Min: 10 Act: 38 Avg: 364 Max: 18394
>> 4. T: 0 ( 493) P:99 I:1000 C:100000 Min: 12 Act: 21 Avg:  76 Max: 10796
>
> What are the numbers of "chrt -f -p 99 <vcpu_tid>" compared to this?

I'm afraid I don't have those results right now. I'll post them later
or with the next version.

> My point is: This alone is not yet a good justification for the
> switch and its current semantic. The approach of just raising the
> VCPU priority is quite fragile without [V]CPU isolation. If you raise
> the VCPUs over their event threads, specifically the iothread, you
> risk starvation, e.g. during boot (the BIOS will poll endlessly for
> the PIT or the disk).

I think that doesn't happen if the host has enough CPU cores (at
least the number of vcpus + 1). Is that wrong?

> Yes, there is /proc/sys/kernel/sched_rt_*, but this is what you
> typically disable when doing realtime seriously, particularly if
> your guest doesn't idle during operation.
>
> The model I would propose for mainline first is different: maxprio
> goes to the event threads, maxprio - 1 to all VCPUs (which means
> that maxprio must be > 1). This setup is less likely to starve and
> makes more sense (interrupts must have a higher priority than CPUs).

Ok, I'll try your approach and test it.

> However, that's also not yet generic, as we will have scenarios
> where only part of the event sources and VCPUs will be prioritized
> and the rest shall remain low prio / SCHED_OTHER. Besides defining a
> way to express such configurations, the problem is that they may not
> work during guest boot. So some realtime profile switching concept
> may also be needed. I haven't made up my mind on these issues yet.
> Not to speak of the horrible mess of configuring a PREEMPT-RT
> host...
>
> What is clear, though, is that we need a reference show case for
> realtime QEMU/KVM. One that is as easy to reproduce as possible,
> doesn't depend on proprietary realtime guests, and clearly shows the
> advantages of all the needed changes for a reasonable use case. I'd
> like to discuss this at the RT-KVM BoF at the KVM Forum next week.
> Will you and/or any of your colleagues be there?

Yes. I'll attend the BoF.

Regards,
Satoru
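P.S. Just to confirm I understand the priority model you propose
before I rework the patch: I'd apply it roughly like this (untested
sketch; apply_rt_model() and set_rt_priority() are made-up helpers for
illustration, not existing QEMU functions):

  #include <assert.h>
  #include <pthread.h>
  #include <sched.h>

  /* Hypothetical helper: apply an RT policy/priority to one thread.
   * Returns 0 on success, an errno value on failure. */
  static int set_rt_priority(pthread_t thread, int policy, int prio)
  {
      struct sched_param param = { .sched_priority = prio };

      return pthread_setschedparam(thread, policy, &param);
  }

  /* Your model: event threads (e.g. the iothread) get maxprio,
   * all vcpu threads get maxprio - 1; requires maxprio > 1 so the
   * vcpu priority stays in the valid RT range. */
  static void apply_rt_model(pthread_t iothread, pthread_t *vcpus,
                             int nr_vcpus, int policy, int maxprio)
  {
      int i;

      assert(maxprio > 1);
      set_rt_priority(iothread, policy, maxprio);
      for (i = 0; i < nr_vcpus; i++) {
          set_rt_priority(vcpus[i], policy, maxprio - 1);
      }
  }

Is that what you have in mind?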