On Mon, 2021-03-22 at 15:33 +0000, Mel Gorman wrote:
> If trying that, I would put that in a separate patch. At one point
> I did play with clearing prev, target and recent but hit problems.
> Initialising the mask and clearing them in select_idle_sibling() hurt
> the fast path and doing it later was not much better. IIRC, the
> problem I hit was that the cost of clearing multiple CPUs before the
> search was not offset by gains from a more efficient search.
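
For readers without the scheduler code open, here is a user-space sketch of the approach described in the quoted paragraph: initialise the candidate mask, clear prev, target and recent one bit at a time, then search. It is an illustration under stated assumptions, not the kernel code. A 64-bit word stands in for struct cpumask, and mask_t, mask_init(), mask_clear_cpu(), scan_for_idle() and search_skipping_checked() are hypothetical names made up for this sketch.

```c
/* User-space model only: a 64-bit word stands in for struct cpumask. */
#include <assert.h>
#include <stdint.h>

typedef uint64_t mask_t;	/* hypothetical stand-in for struct cpumask */

/* Roughly what cpumask_and() does: candidates = domain span AND affinity. */
static inline mask_t mask_init(mask_t span, mask_t allowed)
{
	return span & allowed;
}

/* Single-bit, non-atomic clear, in the spirit of __cpumask_clear_cpu(). */
static inline void mask_clear_cpu(int cpu, mask_t *m)
{
	assert(cpu >= 0 && cpu < 64);
	*m &= ~((mask_t)1 << cpu);
}

/* Return the lowest-numbered candidate CPU that is idle, or -1 if none. */
static inline int scan_for_idle(mask_t candidates, mask_t idle)
{
	mask_t hit = candidates & idle;

	for (int cpu = 0; cpu < 64; cpu++)
		if (hit & ((mask_t)1 << cpu))
			return cpu;
	return -1;
}

/*
 * Initialise the mask, drop the CPUs that were already checked
 * (prev, target, recent), then scan what is left.
 */
static inline int search_skipping_checked(mask_t span, mask_t allowed,
					  mask_t idle, int prev, int target,
					  int recent)
{
	mask_t candidates = mask_init(span, allowed);

	mask_clear_cpu(prev, &candidates);
	mask_clear_cpu(target, &candidates);
	mask_clear_cpu(recent, &candidates);

	return scan_for_idle(candidates, idle);
}
```

In this model each clear is a single AND on one word. The open question in the thread is whether the initialisation plus three clears pays for itself through a shorter scan, and only measurement on the real code can answer that.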

I'm definitely avoiding the more expensive operations, and am only
using __cpumask_clear_cpu now :)

> If I had to guess, simply initialising cpumask after calling
> select_idle_smt() will be faster for your particular case because you
> have a reasonable expectation that prev's SMT sibling is idle when
> there are no idle cores. Checking if prev's sibling is free when
> there are no idle cores is fairly cheap in comparison to a cpumask
> initialisation and partial clearing.
>
> If you have the testing capacity and time, test both.

Kicking off more tests soon. I'll get back with a v3 patch
on Wednesday.

-- 
All Rights Reversed.