On 25/01/2021 11:50, Song Bao Hua (Barry Song) wrote:
>
>> -----Original Message-----
>> From: Dietmar Eggemann [mailto:dietmar.eggem...@arm.com]
>> Sent: Wednesday, January 13, 2021 12:00 AM
>> To: Morten Rasmussen <morten.rasmus...@arm.com>; Tim Chen
>> <tim.c.c...@linux.intel.com>
>> Cc: Song Bao Hua (Barry Song) <song.bao....@hisilicon.com>;
>> valentin.schnei...@arm.com; catalin.mari...@arm.com; w...@kernel.org;
>> r...@rjwysocki.net; vincent.guit...@linaro.org; l...@kernel.org;
>> gre...@linuxfoundation.org; Jonathan Cameron <jonathan.came...@huawei.com>;
>> mi...@redhat.com; pet...@infradead.org; juri.le...@redhat.com;
>> rost...@goodmis.org; bseg...@google.com; mgor...@suse.de;
>> mark.rutl...@arm.com; sudeep.ho...@arm.com; aubrey...@linux.intel.com;
>> linux-arm-ker...@lists.infradead.org; linux-kernel@vger.kernel.org;
>> linux-a...@vger.kernel.org; linux...@openeuler.org; xuwei (O)
>> <xuw...@huawei.com>; Zengtao (B) <prime.z...@hisilicon.com>; tiantao (H)
>> <tiant...@hisilicon.com>
>> Subject: Re: [RFC PATCH v3 0/2] scheduler: expose the topology of clusters
>> and add cluster scheduler
>>
>> On 11/01/2021 10:28, Morten Rasmussen wrote:
>>> On Fri, Jan 08, 2021 at 12:22:41PM -0800, Tim Chen wrote:
>>>>
>>>>
>>>> On 1/8/21 7:12 AM, Morten Rasmussen wrote:
>>>>> On Thu, Jan 07, 2021 at 03:16:47PM -0800, Tim Chen wrote:
>>>>>> On 1/6/21 12:30 AM, Barry Song wrote:

[...]

>> wake_wide() switches between packing (select_idle_sibling(), llc_size
>> CPUs) and spreading (find_idlest_cpu(), all CPUs).
>>
>> AFAICS, since none of the sched domains set SD_BALANCE_WAKE, currently
>> all wakeups are (llc-)packed.
>
> Sorry for late response. I was struggling with some other topology
> issues recently.
>
> For "all wakeups are (llc-)packed",
> it seems you mean current want_affine is only affecting the new_cpu,
> and for wake-up path, we will always go to select_idle_sibling() rather
> than find_idlest_cpu() since nobody sets SD_WAKE_BALANCE in any
> sched_domain ?
>
>>
>> select_task_rq_fair()
>>
>>   for_each_domain(cpu, tmp)
>>
>>     if (tmp->flags & sd_flag)
>>       sd = tmp;
>>
>>
>> In case we would like to further distinguish between llc-packing and
>> even narrower (cluster or MC-L2)-packing, we would introduce a 2. level
>> packing vs. spreading heuristic further down in sis().
>
> I didn't get your point on "2 level packing". Would you like
> to describe more? It seems you mean we need to have separate
> calculation for avg_scan_cost and sched_feat(SIS_) for cluster
> (or MC-L2) since cluster and llc are not in the same level
> physically?

By '1. level packing' I meant going through sis() (i.e.
sd = per_cpu(sd_llc, target)) instead of routing WF_TTWU through
find_idlest_cpu(), which uses a broader sd span (in case all sd's, or at
least those up to an sd > llc, had SD_BALANCE_WAKE set).
wake_wide() (the wakee/waker flip heuristic) is currently used to make
this decision. But since no sd sets SD_BALANCE_WAKE, we always go through
sis() for WF_TTWU.

'2. level packing' would be the decision between cluster- and
llc-packing. The question was which heuristic could be used here.

>> IMHO, Barry's current implementation doesn't do this right now. Instead
>> he's trying to pack on cluster first and if not successful look further
>> among the remaining llc CPUs for an idle CPU.
>
> Yes. That is exactly what the current patch is doing.

And so it favors cluster- over llc-packing for every task, instead of
choosing between the two with a heuristic.
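
To make the two levels concrete, here is a rough user-space sketch of the
decision structure discussed above. It is not kernel code and not taken from
the patch: every toy_* name, the load array and the 16/8/4 CPU layout are
made up for illustration, and the real sis() and find_idlest_cpu() do far
more than this. The 1. level picks between spreading and packing; the
2. level is the unconditional cluster-first scan inside the packing path.

```c
#include <stdbool.h>

#define NR_CPUS      16
#define LLC_SIZE      8   /* assumption: 8 CPUs share one LLC */
#define CLUSTER_SIZE  4   /* assumption: 4 CPUs form one cluster inside an LLC */

/* Toy per-CPU load; 0 means idle. Not a kernel data structure. */
static int toy_load[NR_CPUS];

/* Return the first idle CPU in [first, first + n), or -1 if none is idle. */
static int toy_scan_idle(int first, int n)
{
	for (int cpu = first; cpu < first + n; cpu++)
		if (toy_load[cpu] == 0)
			return cpu;
	return -1;
}

/* Spreading stand-in for find_idlest_cpu(): least-loaded CPU of all CPUs. */
static int toy_find_idlest_cpu(void)
{
	int best = 0;

	for (int cpu = 1; cpu < NR_CPUS; cpu++)
		if (toy_load[cpu] < toy_load[best])
			best = cpu;
	return best;
}

/*
 * Packing stand-in for sis() with cluster-first behaviour: scan the
 * target's cluster, then the target's whole LLC, else keep the target.
 * There is no heuristic here; the cluster is always tried first.
 */
static int toy_sis(int target)
{
	int cluster = (target / CLUSTER_SIZE) * CLUSTER_SIZE;
	int llc = (target / LLC_SIZE) * LLC_SIZE;
	int cpu;

	cpu = toy_scan_idle(cluster, CLUSTER_SIZE);
	if (cpu >= 0)
		return cpu;

	cpu = toy_scan_idle(llc, LLC_SIZE);
	return cpu >= 0 ? cpu : target;
}

/*
 * 1. level decision: spread only if some sd has SD_BALANCE_WAKE set and
 * the wake_wide()-style heuristic asks for it; otherwise pack via sis().
 */
static int toy_select_cpu(int target, bool sd_balance_wake, bool wide)
{
	if (sd_balance_wake && wide)
		return toy_find_idlest_cpu();
	return toy_sis(target);
}
```

With sd_balance_wake == false, which is the situation described above, the
wide argument has no effect and every wakeup ends up in toy_sis(). A
2. level heuristic would replace the fixed cluster-then-LLC order in
toy_sis() with a per-wakeup choice between the two scans.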