On 6/12/19 9:33 AM, Julien Desfossez wrote:
After reading more traces and trying to understand why only untagged tasks starve when cpu-intensive tasks are running on the same set of CPUs, we noticed a difference in behavior in ‘pick_task’. When ‘core_cookie’ is 0, we are supposed to prefer the tagged task only if its priority is higher, but we also prefer it when the priorities are equal, which causes the starvation. ‘pick_task’ is biased toward selecting its first parameter in case of equality, which in this case was ‘class_pick’ instead of ‘max’. Reversing the order of the parameters solves this issue and matches the expected behavior, so we can get rid of this vruntime_boost concept. We have tested the fix below and it seems to work well with tagged/untagged tasks.
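To illustrate the tie-breaking bias described above, here is a minimal toy sketch in C. It is not the kernel code: `struct toy_task`, `toy_pick`, `toy_pick_before_fix`, and `toy_pick_after_fix` are hypothetical names, and the convention that a larger `prio` means higher priority is an assumption made only for this example. It only shows why argument order matters when a comparison returns its first argument on equality.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a runnable task; NOT the kernel's struct task_struct. */
struct toy_task {
    const char *name;
    int prio;             /* assumption: larger value = higher priority */
    unsigned long cookie; /* 0 means untagged */
};

/*
 * Return b only if it is strictly higher priority than a.
 * On equal priority the first argument wins, which is the bias
 * the message above describes.
 */
static const struct toy_task *toy_pick(const struct toy_task *a,
                                       const struct toy_task *b)
{
    assert(a != NULL && b != NULL);
    return (b->prio > a->prio) ? b : a;
}

/* Order described as problematic: class_pick first, so it wins ties. */
static const struct toy_task *toy_pick_before_fix(const struct toy_task *class_pick,
                                                  const struct toy_task *max)
{
    return toy_pick(class_pick, max);
}

/* Reversed order: max first, so class_pick must be strictly higher to win. */
static const struct toy_task *toy_pick_after_fix(const struct toy_task *class_pick,
                                                 const struct toy_task *max)
{
    return toy_pick(max, class_pick);
}
```

With a tagged `class_pick` and an untagged `max` of equal priority, the first wrapper returns the tagged task every time, while the second keeps the untagged `max` unless the tagged task is strictly higher priority.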
My 2 DB instance runs with this patch are better with CORESCHED_STALL_FIX than with NO_CORESCHED_STALL_FIX in terms of performance, standard deviation and idleness. Maybe enable it by default?

NO_CORESCHED_STALL_FIX:

users   %stdev   %gain   %idle
16      25       -42.4   73
24      32       -26.3   67
32      0.2      -48.9   62

CORESCHED_STALL_FIX:

users   %stdev   %gain   %idle
16      6.5      -23     70
24      0.6      -17     60
32      1.5      -30.2   52