On Thu, Dec 10, 2020 at 09:18:22PM +0100, Thomas Gleixner wrote:
> Prarit reported that depending on the affinity setting the
>
>  ' irq $N: Affinity broken due to vector space exhaustion.'
>
> message is showing up in dmesg, but the vector space on the CPUs in the
> affinity mask is definitely not exhausted.
>
> Shung-Hsi provided traces and analysis which pinpoints the problem:
>
> The ordering of trying to assign an interrupt vector in
> assign_irq_vector_any_locked() is simply wrong if the interrupt data has a
> valid node assigned. It does:
>
>  1) Try the intersection of affinity mask and node mask
>  2) Try the node mask
>  3) Try the full affinity mask
>  4) Try the full online mask
>
> Obviously #2 and #3 are in the wrong order as the requested affinity
> mask has to take precedence.
>
> In the observed cases #1 failed because the affinity mask did not contain
> CPUs from node 0. That made it allocate a vector from node 0, thereby
> breaking affinity and emitting the misleading message.
>
> Revert the order of #2 and #3 so the full affinity mask without the node
> intersection is tried before actually affinity is broken.
>
> If no node is assigned then only the full affinity mask and if that fails
> the full online mask is tried.
>
> Fixes: d6ffc6ac83b1 ("x86/vector: Respect affinity mask in irq descriptor")
> Reported-by: Shung-Hsi Yu <shung-hsi...@suse.com>
> Reported-by: Prarit Bhargava <pra...@redhat.com>
> Signed-off-by: Thomas Gleixner <t...@linutronix.de>
> Tested-by: Shung-Hsi Yu <shung-hsi...@suse.com>
> Cc: sta...@vger.kernel.org
> ---
>  arch/x86/kernel/apic/vector.c |   24 ++++++++++++++----------
>  1 file changed, 14 insertions(+), 10 deletions(-)
>
> --- a/arch/x86/kernel/apic/vector.c
> +++ b/arch/x86/kernel/apic/vector.c
> @@ -273,20 +273,24 @@ static int assign_irq_vector_any_locked(
>  	const struct cpumask *affmsk = irq_data_get_affinity_mask(irqd);
>  	int node = irq_data_get_node(irqd);
>  
> -	if (node == NUMA_NO_NODE)
> -		goto all;
> -	/* Try the intersection of @affmsk and node mask */
> -	cpumask_and(vector_searchmask, cpumask_of_node(node), affmsk);
> -	if (!assign_vector_locked(irqd, vector_searchmask))
> -		return 0;
> -	/* Try the node mask */
> -	if (!assign_vector_locked(irqd, cpumask_of_node(node)))
> -		return 0;
> -all:
> +	if (node != NUMA_NO_NODE) {
> +		/* Try the intersection of @affmsk and node mask */
> +		cpumask_and(vector_searchmask, cpumask_of_node(node), affmsk);
> +		if (!assign_vector_locked(irqd, vector_searchmask))
> +			return 0;
> +	}
> +
>  	/* Try the full affinity mask */
>  	cpumask_and(vector_searchmask, affmsk, cpu_online_mask);
>  	if (!assign_vector_locked(irqd, vector_searchmask))
>  		return 0;
> +
> +	if (node != NUMA_NO_NODE) {
> +		/* Try the node mask */
> +		if (!assign_vector_locked(irqd, cpumask_of_node(node)))
> +			return 0;
> +	}
> +
>  	/* Try the full online mask */
>  	return assign_vector_locked(irqd, cpu_online_mask);
>  }
>
Reviewed-by: Ming Lei <ming....@redhat.com>

Thanks,
Ming
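
For anyone following the thread without the kernel tree at hand, the effect of swapping steps #2 and #3 can be reproduced in a small userspace model. This is only a sketch under stated assumptions, not the kernel code: cpu_mask_t, struct model, try_assign(), assign_any_old() and assign_any_fixed() are hypothetical names, a mask is a plain 64-bit word with one bit per CPU, and try_assign() simply picks the lowest eligible CPU, whereas the real assign_vector_locked() goes through the vector matrix allocator.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU instead of the kernel's struct cpumask. */
typedef uint64_t cpu_mask_t;

#define NO_NODE (-1) /* stands in for NUMA_NO_NODE */

/* Allocator state for the model (illustrative, not a kernel structure). */
struct model {
	cpu_mask_t online;      /* CPUs that are online */
	cpu_mask_t free_vector; /* CPUs that still have a free vector */
};

/*
 * Simplified stand-in for assign_vector_locked(): return the lowest CPU in
 * @search that is online and has a free vector, or -1 if there is none.
 */
static int try_assign(const struct model *m, cpu_mask_t search)
{
	cpu_mask_t candidates = search & m->online & m->free_vector;

	for (int cpu = 0; cpu < 64; cpu++) {
		if (candidates & ((cpu_mask_t)1 << cpu))
			return cpu;
	}
	return -1;
}

/* Old order: intersection, node mask, full affinity mask, online mask. */
static int assign_any_old(const struct model *m, cpu_mask_t affinity,
			  int node, cpu_mask_t node_mask)
{
	int cpu;

	/* In this model a valid node must own at least one CPU. */
	assert(node == NO_NODE || node_mask != 0);

	if (node != NO_NODE) {
		cpu = try_assign(m, affinity & node_mask);
		if (cpu >= 0)
			return cpu;
		/* Node mask is tried before the affinity mask: the bug. */
		cpu = try_assign(m, node_mask);
		if (cpu >= 0)
			return cpu;
	}
	cpu = try_assign(m, affinity);
	if (cpu >= 0)
		return cpu;
	return try_assign(m, m->online);
}

/* Patched order: intersection, full affinity mask, node mask, online mask. */
static int assign_any_fixed(const struct model *m, cpu_mask_t affinity,
			    int node, cpu_mask_t node_mask)
{
	int cpu;

	assert(node == NO_NODE || node_mask != 0);

	if (node != NO_NODE) {
		cpu = try_assign(m, affinity & node_mask);
		if (cpu >= 0)
			return cpu;
	}
	/* The requested affinity takes precedence over the node mask. */
	cpu = try_assign(m, affinity);
	if (cpu >= 0)
		return cpu;
	if (node != NO_NODE) {
		cpu = try_assign(m, node_mask);
		if (cpu >= 0)
			return cpu;
	}
	return try_assign(m, m->online);
}
```

With eight online CPUs that all have free vectors, node 0 covering CPUs 0-3 (mask 0x0F) and an affinity mask of CPUs 4-7 (mask 0xF0), assign_any_old() lands on CPU 0, outside the requested affinity, while assign_any_fixed() lands on CPU 4. The node mask is only reached in the patched order once no CPU in the affinity mask has a free vector.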