From: naroahlee <naroah...@gmail.com>

Modify: xen/common/sched_rt.c runq_tickle(): Check not_tickled mask for a cache-preferred PCPU

The bug was introduced in Xen 4.7, when we converted the RTDS scheduler
from a quantum-driven model to an event-driven model. We assumed that
whenever runq_tickle() is invoked, it finds a PCPU via the not-tickled
mask. However, case 1 of runq_tickle() (pick the cache-preferred idle
PCPU) is NOT masked by the not-tickled CPU mask.

Buggy behavior:
When two VCPUs each try to tickle the idle VCPU that is currently running
on their cache-preferred PCPU, both VCPUs tickle the same PCPU in a row.
However, only one of them is guaranteed to be scheduled, because
runq_pick() is executed only once in rt_schedule(). The other VCPU
therefore loses (is descheduled for) a period.

Bug analysis:
We need to exclude already-tickled PCPUs when evaluating case 1 of
runq_tickle().

Signed-off-by: Haoran Li <naroah...@gmail.com>
---
 xen/common/sched_rt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index 1b30014..777192f 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -1175,7 +1175,8 @@ runq_tickle(const struct scheduler *ops, struct rt_vcpu *new)
     cpumask_andnot(&not_tickled, &not_tickled, &prv->tickled);
 
     /* 1) if new's previous cpu is idle, kick it for cache benefit */
-    if ( is_idle_vcpu(curr_on_cpu(new->vcpu->processor)) )
+    if ( is_idle_vcpu(curr_on_cpu(new->vcpu->processor)) &&
+         cpumask_test_cpu(new->vcpu->processor, &not_tickled))
     {
         SCHED_STAT_CRANK(tickled_idle_cpu);
         cpu_to_tickle = new->vcpu->processor;
-- 
1.9.1

---
CC: <men...@cis.upenn.edu>
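
For illustration only, the double-tickle scenario described above can be
reproduced with a small standalone model. This is a sketch, not Xen code:
struct toy_sched, test_cpu(), toy_tickle(), NR_CPUS and NO_CPU are
hypothetical names, plain bitmasks stand in for cpumask_t, and only the
idle-PCPU cases are modelled (the priority-based fallback is left out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4      /* hypothetical number of PCPUs in the toy model */
#define NO_CPU  (-1)   /* hypothetical "nothing to tickle" result       */

/* Toy stand-in for the scheduler state; NOT the real Xen structures. */
struct toy_sched {
    uint32_t idle;     /* bit i set: PCPU i currently runs the idle VCPU */
    uint32_t tickled;  /* bit i set: PCPU i has already been tickled     */
};

static bool test_cpu(uint32_t mask, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return (mask >> cpu) & 1u;
}

/*
 * Pick a PCPU to tickle for a VCPU that last ran on prev_cpu.
 * check_not_tickled == false models case 1 before the patch,
 * check_not_tickled == true  models case 1 after the patch.
 */
static int toy_tickle(struct toy_sched *s, int prev_cpu, bool check_not_tickled)
{
    uint32_t not_tickled = ((1u << NR_CPUS) - 1) & ~s->tickled;
    int cpu_to_tickle = NO_CPU;

    /* 1) previous PCPU is idle: prefer it for cache benefit */
    if ( test_cpu(s->idle, prev_cpu) &&
         (!check_not_tickled || test_cpu(not_tickled, prev_cpu)) )
        cpu_to_tickle = prev_cpu;
    else
    {
        /* 2) simplified fallback: any idle PCPU that is not yet tickled */
        for ( int cpu = 0; cpu < NR_CPUS; cpu++ )
            if ( test_cpu(s->idle, cpu) && test_cpu(not_tickled, cpu) )
            {
                cpu_to_tickle = cpu;
                break;
            }
    }

    /* Record the tickle so later callers can see it in the mask. */
    if ( cpu_to_tickle != NO_CPU )
        s->tickled |= 1u << cpu_to_tickle;

    return cpu_to_tickle;
}
```

With PCPUs 0 and 1 idle and nothing tickled yet, two consecutive calls
with prev_cpu = 0 both return PCPU 0 when check_not_tickled is false, so
the second VCPU is left waiting. With check_not_tickled set to true, the
first call returns PCPU 0 and the second falls through to PCPU 1.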