On 04/24/2017 10:03 AM, Peter Zijlstra wrote:
> On Thu, Apr 20, 2017 at 04:51:43PM -0300, Lauro Ramos Venancio wrote:
>
>> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
>> index e77c93a..694e799 100644
>> --- a/kernel/sched/topology.c
>> +++ b/kernel/sched/topology.c
>> @@ -505,7 +507,11 @@ static void build_group_mask(struct sched_domain *sd, struct sched_group *sg)
>>
>>  	for_each_cpu(i, sg_span) {
>>  		sibling = *per_cpu_ptr(sdd->sd, i);
>> -		if (!cpumask_test_cpu(i, sched_domain_span(sibling)))
>> +
>> +		if (!sibling->groups)
>> +			continue;
>
> How can this happen?

This happens on machines with asymmetric topologies. For more details see:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c1174876874dcf8986806e4dad3d7d07af20b439

>
>> +
>> +		if (!cpumask_equal(sg_span, sched_group_cpus(sibling->groups)))
>>  			continue;
>>
>>  		cpumask_set_cpu(i, sched_group_mask(sg));
>
>> @@ -1482,6 +1502,14 @@ struct sched_domain *build_sched_domain(struct sched_domain_topology_level *tl,
>>  		}
>>  	}
>>
>> +	/* Init overlap groups */
>> +	for_each_cpu(i, cpu_map) {
>> +		for (sd = *per_cpu_ptr(d.sd, i); sd; sd = sd->parent) {
>> +			if (sd->flags & SD_OVERLAP)
>> +				init_overlap_sched_groups(sd);
>> +		}
>> +	}
>
> Why does this have to be a whole new loop? This is because in
> build_group_mask() we could encounter @sibling that were not constructed
> yet?

That is right. We can only build the group mask when all siblings are
constructed.
>
> So this is the primary fix?

Yes.

>
>> +
>>  	/* Calculate CPU capacity for physical packages and nodes */
>>  	for (i = nr_cpumask_bits-1; i >= 0; i--) {
>>  		if (!cpumask_test_cpu(i, cpu_map))
>
> Also, would it not make sense to re-order patch 2 to come after this,
> such that we _do_ have the group_mask available and don't have to jump
> through hoops in order to link up the sgc? Afaict we don't actually use
> the sgc until the above (reverse) loop computing the CPU capacities.