On Wed, May 14, 2025 at 01:58:40PM -0400, Yury Norov wrote:
> On Wed, May 14, 2025 at 05:26:45PM +0000, Michael Kelley wrote:
> > > Hope that helps.
> >
> > Yes, that helps! So the key to understanding "weight" is that
> > NUMA locality is preferred over sibling dislocality.
> >
> > This is a great summary! All or most of it should go as a
> > comment describing the function and what it is trying to do.
>
> OK, please consider applying:
>
> >From abdf5cc6dabd7f433b1d1e66db86333a33a2cd15 Mon Sep 17 00:00:00 2001
> From: Yury Norov [NVIDIA] <yury.no...@gmail.com>
> Date: Wed, 14 May 2025 13:45:26 -0400
> Subject: [PATCH] net: mana: explain irq_setup() algorithm
>
> Commit 91bfe210e196 ("net: mana: add a function to spread IRQs per CPUs")
> added the irq_setup() function that distributes IRQs on CPUs according
> to a tricky heuristic. The corresponding commit message explains the
> heuristic.
>
> Duplicate it in the source code to make available for readers without
> digging git in history. Also, add more detailed explanation about how
> the heuristics is implemented.
>
> Signed-off-by: Yury Norov [NVIDIA] <yury.no...@gmail.com>
> ---
>  .../net/ethernet/microsoft/mana/gdma_main.c   | 41 +++++++++++++++++++
>  1 file changed, 41 insertions(+)
>
> diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> index 4ffaf7588885..f9e8d4d1ba3a 100644
> --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> @@ -1288,6 +1288,47 @@ void mana_gd_free_res_map(struct gdma_resource *r)
>  	r->size = 0;
>  }
>
> +/*
> + * Spread on CPUs with the following heuristics:
> + *
> + * 1. No more than one IRQ per CPU, if possible;
> + * 2. NUMA locality is the second priority;
> + * 3. Sibling dislocality is the last priority.
> + *
> + * Let's consider this topology:
> + *
> + * Node            0               1
> + * Core        0       1       2       3
> + * CPU       0   1   2   3   4   5   6   7
> + *
> + * The most performant IRQ distribution based on the above topology
> + * and heuristics may look like this:
> + *
> + * IRQ     Nodes   Cores   CPUs
> + * 0       1       0       0-1
> + * 1       1       1       2-3
> + * 2       1       0       0-1
> + * 3       1       1       2-3
> + * 4       2       2       4-5
> + * 5       2       3       6-7
> + * 6       2       2       4-5
> + * 7       2       3       6-7
> + *
> + * The heuristics is implemented as follows.
> + *
> + * The outer for_each() loop resets the 'weight' to the actual number
> + * of CPUs in the hop. Then inner for_each() loop decrements it by the
> + * number of sibling groups (cores) while assigning first set of IRQs
> + * to each group. IRQs 0 and 1 above are distributed this way.
> + *
> + * Now, because NUMA locality is more important, we should walk the
> + * same set of siblings and assign 2nd set of IRQs (2 and 3), and it's
> + * implemented by the medium while() loop. We do like this unless the
> + * number of IRQs assigned on this hop will not become equal to number
> + * of CPUs in the hop (weight == 0). Then we switch to the next hop and
> + * do the same thing.
> + */
> +
>  static int irq_setup(unsigned int *irqs, unsigned int len, int node)
>  {
>  	const struct cpumask *next, *prev = cpu_none_mask;

Thank you Yury, I will include this patch in the patchset with the
next version.

> --
> 2.43.0
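
For anyone who wants to step through the walk described in the comment
outside the kernel, here is a rough user-space sketch. It is not the
driver code. It assumes the fixed 8-CPU topology from the comment (with
nodes numbered 0 and 1), uses plain unsigned int bitmasks instead of
struct cpumask, and records the chosen mask in an array instead of
setting IRQ affinity. The names spread_irqs(), hop_mask(),
sibling_mask() and mask_weight() are made up for this illustration.

```c
#include <assert.h>

/* Hypothetical fixed topology: 2 nodes, 2 cores per node, 2 CPUs per core. */
#define NR_CPUS		8
#define NR_NODES	2
#define CPUS_PER_CORE	2
#define CPUS_PER_NODE	4

/* Illustrative stand-in for a sibling mask: all CPUs sharing cpu's core. */
static unsigned int sibling_mask(int cpu)
{
	int first = cpu / CPUS_PER_CORE * CPUS_PER_CORE;

	return ((1u << CPUS_PER_CORE) - 1) << first;
}

/*
 * Illustrative stand-in for a NUMA hop mask. With only two nodes,
 * hop 0 is the local node and hop 1 is every CPU in the system.
 */
static unsigned int hop_mask(int node, int hop)
{
	unsigned int local = ((1u << CPUS_PER_NODE) - 1) << (node * CPUS_PER_NODE);

	return hop == 0 ? local : (1u << NR_CPUS) - 1;
}

/* Number of set bits in mask. */
static int mask_weight(unsigned int mask)
{
	int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Write one sibling-group mask per IRQ into affinity[0..len-1]. */
static void spread_irqs(unsigned int *affinity, unsigned int len, int node)
{
	unsigned int prev = 0;

	assert(node >= 0 && node < NR_NODES);

	for (int hop = 0; hop < NR_NODES; hop++) {
		unsigned int next = hop_mask(node, hop);
		/* Outer loop: weight is the number of CPUs new in this hop. */
		int weight = mask_weight(next & ~prev);

		/* Middle loop: revisit the same cores until weight hits 0. */
		while (weight > 0) {
			unsigned int cpus = next & ~prev;

			/* Inner loop: one IRQ per core, then skip its siblings. */
			for (int cpu = 0; cpu < NR_CPUS; cpu++) {
				if (!(cpus & (1u << cpu)))
					continue;
				if (len-- == 0)
					return;
				*affinity++ = sibling_mask(cpu);
				cpus &= ~sibling_mask(cpu);
				--weight;
			}
		}
		prev = next;
	}
}
```

With eight IRQs and node 0, the sketch fills the array with 0x03, 0x0c,
0x03, 0x0c, 0x30, 0xc0, 0x30, 0xc0, that is CPUs 0-1, 2-3, 0-1, 2-3,
4-5, 6-7, 4-5, 6-7, which matches the CPUs column of the table in the
comment. If more IRQs are passed than there are CPUs, the extra array
entries are left untouched.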