[PATCH 2/4 V3 net-next] cpumask: define cleanup function for cpumasks

2024-01-28 Thread Souradeep Chakrabarti
From: Yury Norov Now we can simplify code that allocates cpumasks for local needs. Signed-off-by: Yury Norov --- include/linux/cpumask.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h index 228c23eb36d2..1c29947db848 100644 --- a/includ
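The excerpt above is cut off before the definition the patch adds, so the following is only a userspace sketch of the general idea: tying a release function to a local variable so that every return path frees it. It assumes GCC or Clang, whose `cleanup` variable attribute is the mechanism the kernel's scope-based cleanup helpers build on. All names here (`sketch_free_mask`, `SKETCH_AUTO_FREE`, `sketch_use_local_mask`, `sketch_free_count`) are hypothetical and do not come from the patch.

```c
#include <stdlib.h>

/* Counts releases so the effect of the cleanup handler can be observed. */
int sketch_free_count;

/* Cleanup handler: the compiler passes a pointer to the annotated variable. */
static void sketch_free_mask(unsigned long **maskp)
{
	if (*maskp) {
		free(*maskp);
		sketch_free_count++;
	}
}

/* Hypothetical annotation; relies on a GCC/Clang extension. */
#define SKETCH_AUTO_FREE __attribute__((cleanup(sketch_free_mask)))

/* Allocates a scratch mask for local use; no explicit free on any path. */
int sketch_use_local_mask(size_t nwords, int fail_early)
{
	SKETCH_AUTO_FREE unsigned long *mask = calloc(nwords, sizeof(*mask));

	if (!mask)
		return -1;
	if (fail_early)
		return -2;	/* mask is released when it goes out of scope */

	mask[0] = 1UL;		/* stand-in for real work on the mask */
	return 0;		/* mask is released here as well */
}
```

The point of the pattern is that error paths no longer need a `goto` label or a repeated free call, which is the kind of simplification the commit message refers to.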

[PATCH 4/4 V3 net-next] net: mana: Assigning IRQ affinity on HT cores

2024-01-28 Thread Souradeep Chakrabarti
The existing MANA design assigns an IRQ to every CPU, including sibling hyper-threads. This may cause multiple IRQs to be active simultaneously in the same core and may reduce network performance. Improve the performance by assigning IRQs to non-sibling CPUs in the local NUMA node. The performance improve

[PATCH 1/4 V3 net-next] cpumask: add cpumask_weight_andnot()

2024-01-28 Thread Souradeep Chakrabarti
From: Yury Norov Similarly to cpumask_weight_and(), cpumask_weight_andnot() is a handy helper that may help to avoid creating an intermediate mask just to calculate the number of bits that are set in the first given mask and clear in the second one. Signed-off-by: Yury Norov Reviewed-by: Jacob Keller --- inclu
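As an illustration of the description above, here is a minimal userspace sketch of counting the bits that are set in one mask and clear in another without building an intermediate mask. It is not the kernel implementation: masks are modeled as plain arrays of 64-bit words, and the names `sketch_popcount64` and `sketch_weight_andnot` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable population count: clears the lowest set bit on each iteration. */
static unsigned int sketch_popcount64(uint64_t w)
{
	unsigned int n = 0;

	while (w) {
		w &= w - 1;
		n++;
	}
	return n;
}

/*
 * Number of bits set in a[] and clear in b[], computed word by word,
 * so no temporary (a & ~b) mask is ever stored.
 */
unsigned int sketch_weight_andnot(const uint64_t *a, const uint64_t *b,
				  size_t nwords)
{
	unsigned int weight = 0;

	for (size_t i = 0; i < nwords; i++)
		weight += sketch_popcount64(a[i] & ~b[i]);
	return weight;
}
```

For example, with `a = {0xFF, 0x1}` and `b = {0x0F, 0x0}`, the first word contributes four bits (0xF0) and the second contributes one, for a weight of 5.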

[PATCH 3/4 V3 net-next] net: mana: add a function to spread IRQs per CPUs

2024-01-28 Thread Souradeep Chakrabarti
From: Yury Norov Souradeep found that the driver performs faster if IRQs are spread across CPUs with the following heuristics: 1. No more than one IRQ per CPU, if possible; 2. NUMA locality is the second priority; 3. Sibling dislocality is the last priority. Let's consider this topology: No

[PATCH 0/4 V3 net-next] net: mana: Assigning IRQ affinity on HT cores

2024-01-28 Thread Souradeep Chakrabarti
This patch set introduces a new helper function, irq_setup(), to optimize IRQ distribution for MANA network devices. The patch set makes the driver work 15% faster than with cpumask_local_spread(). Souradeep Chakrabarti (1): net: mana: Assigning IRQ affinity on HT cores Yury Norov (3): cpum