Excerpts from Peter Zijlstra's message of May 19, 2021 7:59 pm:
> On Tue, May 18, 2021 at 12:07:40PM -0700, Ricardo Neri wrote:
>> On Fri, May 14, 2021 at 07:14:15PM -0700, Ricardo Neri wrote:
>> > On Fri, May 14, 2021 at 11:47:45AM +0200, Peter Zijlstra wrote:
>
>> > > So I'm thinking that this is a property of having ASYM_PACKING at a core
>> > > level, rather than some arch special. Wouldn't something like this be
>> > > more appropriate?
>
>> > Thanks Peter for the quick review! This makes sense to me. The only
>> > reason we proposed arch_asym_check_smt_siblings() is because we were
>> > about breaking powerpc (I need to study how they set priorities for SMT,
>> > if applicable). If you think this is not an issue I can post a
>> > v4 with this update.
>>
>> As far as I can see, priorities in powerpc are set by the CPU number.
>> However, I am not sure how CPUs are enumerated? If CPUs in brackets are
>> SMT sibling, Does an enumeration looks like A) [0, 1], [2, 3] or B) [0, 2],
>> [1, 3]? I guess B is the right answer. Otherwise, both SMT siblings of a
>> core would need to be busy before a new core is used.
>>
>> Still, I think the issue described in the cover letter may be
>> reproducible in powerpc as well. If CPU3 is offlined, and [0, 2] pulled
>> tasks from [1, -] so that both CPU0 and CPU2 become busy, CPU1 would not be
>> able to help since CPU0 has the highest priority.
>>
>> I am cc'ing the linuxppc list to get some feedback.
>
> IIRC the concern with Power is that their Cores can go faster if the
> higher SMT siblings are unused.
>
> That is, suppose you have an SMT4 Core with only a single active task,
> then if only SMT0 is used it can reach max performance, but if the
> active sibling is SMT1 it can not reach max performance, and if the only
> active sibling is SMT2 it goes slower still.
>
> So they need to pack the tasks to the lowest SMT siblings, and have the
> highest SMT siblings idle (where possible) in order to increase
> performance.

That's correct.

Thanks,
Nick
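
For readers less familiar with the packing rule quoted above, here is a
minimal sketch in C. It is an illustration only, not kernel code:
asym_priority() and pick_preferred_idle_cpu() are hypothetical names, and
the priority is assumed to be the negated CPU number, in line with the
"priorities in powerpc are set by the CPU number" remark in the thread.
How CPU numbers map onto SMT siblings is left open here, as it is in the
discussion.

```c
#include <stdbool.h>

/*
 * Assumption for illustration: a lower CPU number means a higher
 * priority, so the priority is simply the negated CPU number.
 */
static int asym_priority(int cpu)
{
	return -cpu;
}

/*
 * Hypothetical helper: return the idle CPU with the highest priority
 * among CPUs 0..nr_cpus-1, or -1 if every CPU is busy.
 * busy[cpu] is true when that CPU already runs a task.
 */
static int pick_preferred_idle_cpu(const bool *busy, int nr_cpus)
{
	int best = -1;

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		if (busy[cpu])
			continue;
		if (best < 0 || asym_priority(cpu) > asym_priority(best))
			best = cpu;
	}
	return best;
}
```

If one SMT4 core is modelled as CPUs 0-3 and all of them are idle, the
helper picks CPU 0; once CPU 0 is busy it picks CPU 1, and so on, which
keeps the highest siblings idle for as long as possible.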