From: naroahlee <naroah...@gmail.com>

   xen/common/sched_rt.c: runq_tickle(): check the not_tickled mask
for the cache-preferred PCPU

The bug was introduced in Xen 4.7, when the RTDS scheduler was converted
from a quantum-driven model to an event-driven model. We assumed that
whenever runq_tickle() is invoked, it finds a PCPU via the not-tickled
mask. However, case 1 of runq_tickle() (pick the idle cache-preferred
PCPU) does not apply the not-tickled CPU mask.

Buggy behavior:
When two VCPUs both try to tickle a PCPU that is their cache-preferred
PCPU and is currently running the idle VCPU, they tickle the same PCPU
one after the other. However, only one of them is guaranteed to be
scheduled, because runq_pick() is executed only once in rt_schedule().
As a result, the other VCPU is not scheduled and loses a period.

Bug Analysis:
We need to exclude already-tickled PCPUs when evaluating case 1 of
runq_tickle().

Signed-off-by: Haoran Li <naroah...@gmail.com>
---
 xen/common/sched_rt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
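
For reviewers, the double tickle described above can be reproduced with a
toy model. This is only a minimal sketch, not Xen code: struct toy_sched,
toy_tickle(), TOY_NR_CPUS and TOY_NO_CPU are hypothetical names, PCPU sets
are plain bitmasks rather than cpumask_t, and the fallback simply takes the
first idle, not-tickled PCPU instead of the priority-based search done by
the real runq_tickle().

```c
#include <assert.h>

#define TOY_NR_CPUS 4      /* hypothetical PCPU count for the toy model */
#define TOY_NO_CPU  (-1)   /* returned when no PCPU can be tickled */

/* Toy stand-in for the scheduler's private data: one bit per PCPU. */
struct toy_sched {
    unsigned int idle;     /* PCPUs currently running the idle VCPU */
    unsigned int tickled;  /* PCPUs already tickled, not yet rescheduled */
};

/*
 * Pick a PCPU to tickle for a VCPU that last ran on prev_cpu.
 * check_tickled == 0 models the old case 1 (idle check only);
 * check_tickled != 0 models case 1 with the not_tickled test added.
 */
static int toy_tickle(struct toy_sched *s, int prev_cpu, int check_tickled)
{
    unsigned int not_tickled = ~s->tickled;
    unsigned int prev_bit;
    int cpu = TOY_NO_CPU;
    int i;

    assert(prev_cpu >= 0 && prev_cpu < TOY_NR_CPUS);
    prev_bit = 1u << prev_cpu;

    /* Case 1: the previous PCPU is idle, kick it for cache benefit. */
    if ( (s->idle & prev_bit) &&
         (!check_tickled || (not_tickled & prev_bit)) )
        cpu = prev_cpu;
    else
    {
        /* Simplified fallback: first idle PCPU that is not tickled yet. */
        for ( i = 0; i < TOY_NR_CPUS; i++ )
        {
            if ( s->idle & not_tickled & (1u << i) )
            {
                cpu = i;
                break;
            }
        }
    }

    /* Remember the tickle so that later callers can avoid this PCPU. */
    if ( cpu != TOY_NO_CPU )
        s->tickled |= 1u << cpu;

    return cpu;
}
```

With PCPUs 0 and 1 idle and nothing tickled, two back-to-back calls
toy_tickle(&s, 0, 0) both return 0, which is the double tickle. With
check_tickled set to 1, the second call returns 1 instead.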

diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index 1b30014..777192f 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -1175,7 +1175,8 @@ runq_tickle(const struct scheduler *ops, struct rt_vcpu *new)
     cpumask_andnot(&not_tickled, &not_tickled, &prv->tickled);
 
     /* 1) if new's previous cpu is idle, kick it for cache benefit */
-    if ( is_idle_vcpu(curr_on_cpu(new->vcpu->processor)) )
+    if ( is_idle_vcpu(curr_on_cpu(new->vcpu->processor)) &&
+         cpumask_test_cpu(new->vcpu->processor, &not_tickled) )
     {
         SCHED_STAT_CRANK(tickled_idle_cpu);
         cpu_to_tickle = new->vcpu->processor;
-- 
1.9.1

---
CC: <men...@cis.upenn.edu>
