On 18/06/2019, 15:51, "Peter Xu" <pet...@redhat.com> wrote:
On Tue, Jun 18, 2019 at 12:25:43PM +0000, Cosmin Marin wrote:
> Hi Peter,
>
> thanks for reviewing the patch. Indeed, I agree that it's almost
> impossible to determine which solution is better from the scalability
> perspective. However, I feel that using per-vCPU timers is the only
> way of ensuring the correctness of the throttling ratio.

The thing is that your patch actually contains two changes:

1. use N timers instead of one.

2. remove the throttle_thread_scheduled check, so we always throttle.

Here what I'm worried about is that _maybe_ the 2nd item is the one
that really helped.

C: The removal of *throttle_thread_scheduled* is a consequence of the
per-vCPU model only. In this model, each vCPU schedules work just for
itself (as part of the timer's firing callback) - there's no global
point of control - therefore, the variable isn't helpful for
scheduling anymore.

Note that there is a side effect: we might queue more than one work
item on one specific cpu if we queue too fast. But that does not
block us from trying this out to identify which item (1 or 2, or
both) really helped here. Then, if we think that (queuing too much)
is an issue, we can discuss how to fix it, since the current patch
will have this problem as well.

C: I believe that in the per-vCPU timer implementation we cannot
queue more than one piece of work because, here, the vCPU queues work
for itself and does so only when the timer fires - so the two
"states" - scheduling and sleeping - are mutually exclusive, running
from the same thread context.

> It's a bit unclear to me how the throttling ratio inconsistency can
> be fixed by using a single timer, even when avoiding the conditional
> timer re-arming. Could you provide more details about the use of a
> single timer?

C: I feel like in this case it will sleep too much, running into a
problem similar to the one solved by 90bb0c0: under heavy throttling,
more than one work item may be scheduled.
It'll simply be a patch that only contains the changes for item (2)
above. To be explicit, it is:

-------------------------------------------------------------------
diff --git a/cpus.c b/cpus.c
index dde3b7b981..bb59675b98 100644
--- a/cpus.c
+++ b/cpus.c
@@ -792,7 +792,6 @@ static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque)
     qemu_mutex_unlock_iothread();
     g_usleep(sleeptime_ns / 1000); /* Convert ns to us for usleep call */
     qemu_mutex_lock_iothread();
-    atomic_set(&cpu->throttle_thread_scheduled, 0);
 }
 
 static void cpu_throttle_timer_tick(void *opaque)
@@ -805,10 +804,7 @@ static void cpu_throttle_timer_tick(void *opaque)
         return;
     }
     CPU_FOREACH(cpu) {
-        if (!atomic_xchg(&cpu->throttle_thread_scheduled, 1)) {
-            async_run_on_cpu(cpu, cpu_throttle_thread,
-                             RUN_ON_CPU_NULL);
-        }
+        async_run_on_cpu(cpu, cpu_throttle_thread, RUN_ON_CPU_NULL);
     }
 
     pct = (double)cpu_throttle_get_percentage()/100;
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 5ee0046b62..0bd34fbb70 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -439,11 +439,6 @@ struct CPUState {
     /* shared by kvm, hax and hvf */
     bool vcpu_dirty;
 
-    /* Used to keep track of an outstanding cpu throttle thread for migration
-     * autoconverge
-     */
-    bool throttle_thread_scheduled;
-
     bool ignore_memory_transaction_failures;
 
     struct hax_vcpu_state *hax_vcpu;
-------------------------------------------------------------------

Regards,

-- 
Peter Xu