On Thu, 2016-06-02 at 04:03 +0800, Yuyang Du wrote: > On Wed, Jun 01, 2016 at 11:36:39AM +0200, Mike Galbraith wrote: > > > Yup. Up to this point, we don't have any disagreement. And I don't think > > > we > > > have any disagreement conceptually. What the next patch really does is: > > > > > > (1) we don't remove SD_BALANCE_WAKE as an important sched_domain flag, on > > > the contrary, we strengthen it. > > > > > > (2) the semantic of SD_BALANCE_WAKE is currently represented by > > > SD_WAKE_AFFINE, > > > we actually remove this representation. > > > > Nope, those two have different meanings. We pass SD_BALANCE_WAKE to > > identify a ttwu() wakeup, just as we pass SD_BALANCE_FORK to say we're > > waking a child. SD_WAKE_AFFINE means exactly what it says, but is only > > applicable to ttwu() wakeups. > > I don't disagree, but want to add that, SD_WAKE_AFFINE has no meaning that is > so > special and so important for anyone to use the flag to tune anything. If you > want > to do any SD_BALANCE_*, waker CPU is a valid candidate if allowed, that is it.
That flag lets the user specifically tell us that he doesn't want us to bounce his tasks around the box, cache misses be damned. The user may _know_ that say cross node migrations hurt his load more than help, and not want us to do that, thus expresses himself by turning the flag off at whatever level. People do that. You can force them to take other measures, but why do that? > IIUC your XXX mark and your comment "Prefer wake_affine over balance flags", > you > said the same thing: SD_WAKE_AFFINE should be part of SD_BALANCE_WAKE, and > should > be part of all SD_BALANCE_* flags, Peter wrote that, but I don't read it the way you do. I read as if the user wants the benefits of affine wakeups, he surely doesn't want us to send the wakee off to god know where on every wakeup _instead_ of waking affine, he wants to balance iff he can't have an affine wakeup. > > > (3) regarding the semantic of SD_WAKE_AFFINE, it is really not about > > > selecting > > > waker CPU or about the fast path. Conceptually, it is just saying the > > > waker > > > CPU is a valid and important candidate if SD_BALANCE_WAKE, which is > > > just so > > > obvious, so I don't think it deserves to be a separate sched_domain > > > flag. > > > > SD_WAKE_AFFINE being a separate domain flag, the user can turn it > > on/off... separately :) > > Sure, that is very true, :) But turning it off for what, that is a big > question mark. > We don't want a flag unless the flag is for goodness, and not a flag with big > question > mark. It has a good excuse to exist. > > > (4) the outcome is, if SD_BALANCE_WAKE, we definitely will/should try > > > waker CPU, > > > and if !SD_BALANCE_WAKE, we don't try waker CPU. So nothing > > > functional is > > > changed. > > > > If wake_wide() says we do not want an affine wakeup, but you apply > > SD_WAKE_AFFINE meaning to SD_BALANCE_WAKE and turn it on in ->flags, > > we'll give the user a free sample of full balance cost, no? > > Yes, and otherwise we don't select anything? That is just bad engough whether > worse > or not. So the whole fuss I made is really that this is a right thing to > start with. :) Nope, leaving tasks where they were is not a bad thing. Lots of stuff likes the scheduler best when it leaves them the hell alone :) That works out well all around, balance cycles are spent in userspace instead, scheduler produces wins by doing nothing, perfect. -Mike