On Tue, Dec 16, 2014 at 10:09:48AM +0800, Yuyang Du wrote:
>
> Sasha, it might be helpful to see this_load is from:
>
> this_load1: this_load = target_load(this_cpu, idx);
>
> or
>
> this_load2: this_load += effective_load(tg, this_cpu, -weight, -weight);
>
> It really does not seem to be this_load1, while the calc of effective_load is
> a bit complicated to see what the problem is.

Hi all,

I finally managed to reproduce this, not with trinity but simply by
rebooting repeatedly.

Indeed, the problem is from:

this_load2: this_load += effective_load(tg, this_cpu, -weight, -weight);

After digging into effective_load(), the root cause is:

	wl = (w * tg->shares) / W;

If w is negative, it is converted to unsigned long (because tg->shares is
unsigned long), the multiplication is done in unsigned arithmetic, and wl
ends up as an insane number.

I tried this in userspace. Interestingly, if we have:

	wl = w * tg->shares;
	wl /= W;

the result is OK, but it is not OK with the two lines combined as in the
original code.

Anyway, the following patch fixes this.

---

Subject: [PATCH] sched: Fix long and unsigned long multiplication error in
 effective_load

In effective_load(), we have (long w * unsigned long tg->shares) / long W.
When w is negative, it is converted to unsigned long, and hence the product
is insanely large. Fix this by casting tg->shares to long.

Reported-by: Sasha Levin <sasha.le...@oracle.com>
Signed-off-by: Yuyang Du <yuyang...@intel.com>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index df2cdf7..6b99659 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4424,7 +4424,7 @@ static long effective_load(struct task_group *tg, int cpu, long wl, long wg)
 		 * wl = S * s'_i; see (2)
 		 */
 		if (W > 0 && w < W)
-			wl = (w * tg->shares) / W;
+			wl = (w * (long)tg->shares) / W;
 		else
 			wl = tg->shares;
 
-- 