On 2025-01-23 1:18 p.m., Joe Damato wrote:

Sorry for the late reply. I was pulled into other stuff and next was closed anyway.

On Fri, Jan 17, 2025 at 05:33:32PM -0700, Ahmed Zaki wrote:
A common task for most drivers is to remember the user-set CPU affinity
to its IRQs. On each netdev reset, the driver should re-assign the
user's settings to the IRQs.

Add CPU affinity mask to napi_config. To delegate the CPU affinity
management to the core, drivers must:
  1 - set the new netdev flag "irq_affinity_auto":
                                        netif_enable_irq_affinity(netdev)
  2 - create the napi with persistent config:
                                        netif_napi_add_config()
  3 - bind an IRQ to the napi instance: netif_napi_set_irq()

The core will then make sure to re-assign the affinity to the napi's
IRQ.

The default IRQ mask is set to one cpu starting from the closest NUMA.

Maybe the above is helpful to add to
Documentation/networking/napi.rst ?
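
As a rough illustration of the three driver steps listed above, here is a
user-space model in C. It is only a sketch: every model_* name is
hypothetical, a uint64_t stands in for cpumask_t, and the default mask
(one CPU per queue) ignores the NUMA placement the commit message
mentions. It shows the mask surviving a reset because it lives in the
persistent config rather than in the NAPI instance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_IRQ   16
#define MODEL_MAX_QUEUE 4
#define MODEL_ALL_CPUS  UINT64_MAX

/* Persistent per-queue settings; outlives any one NAPI instance. */
struct model_napi_config {
	uint64_t affinity_mask;	/* one bit per CPU, stands in for cpumask_t */
};

struct model_napi {
	int irq;				/* -1 while no IRQ is bound */
	struct model_napi_config *config;	/* NULL without persistent config */
};

struct model_netdev {
	bool irq_affinity_auto;
	struct model_napi_config cfg[MODEL_MAX_QUEUE];
	uint64_t hw_affinity[MODEL_MAX_IRQ];	/* mask programmed per IRQ */
};

/* Assumption of this model: default mask is CPU i for queue i (no NUMA). */
static void model_netdev_init(struct model_netdev *dev)
{
	dev->irq_affinity_auto = false;
	for (int i = 0; i < MODEL_MAX_QUEUE; i++)
		dev->cfg[i].affinity_mask = (uint64_t)1 << i;
	for (int i = 0; i < MODEL_MAX_IRQ; i++)
		dev->hw_affinity[i] = MODEL_ALL_CPUS;
}

/* Step 1: the driver opts in to core-managed affinity. */
static void model_enable_irq_affinity(struct model_netdev *dev)
{
	dev->irq_affinity_auto = true;
}

/* Step 2: create a NAPI tied to the persistent config slot for queue idx. */
static void model_napi_add_config(struct model_netdev *dev,
				  struct model_napi *napi, int idx)
{
	assert(idx >= 0 && idx < MODEL_MAX_QUEUE);
	napi->irq = -1;
	napi->config = &dev->cfg[idx];
}

/* Step 3: bind an IRQ; the remembered mask is re-applied to it. */
static void model_napi_set_irq(struct model_netdev *dev,
			       struct model_napi *napi, int irq)
{
	assert(irq >= 0 && irq < MODEL_MAX_IRQ);
	napi->irq = irq;
	if (dev->irq_affinity_auto && napi->config)
		dev->hw_affinity[irq] = napi->config->affinity_mask;
}

/* The user changes the IRQ's affinity; the new mask is recorded. */
static void model_user_set_affinity(struct model_netdev *dev,
				    struct model_napi *napi, uint64_t mask)
{
	assert(napi->irq >= 0);
	dev->hw_affinity[napi->irq] = mask;
	if (dev->irq_affinity_auto && napi->config)
		napi->config->affinity_mask = mask;
}

/* A reset drops the IRQ binding and the NAPI; only the config survives. */
static void model_reset(struct model_netdev *dev, struct model_napi *napi)
{
	if (napi->irq >= 0)
		dev->hw_affinity[napi->irq] = MODEL_ALL_CPUS;
	napi->irq = -1;
	napi->config = NULL;
}
```

In this model, a driver that skips step 1 or step 2 gets no restore after
model_reset(), which is the reason the commit message lists all three
steps as required.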

Signed-off-by: Ahmed Zaki <ahmed.z...@intel.com>
---
  include/linux/netdevice.h | 14 ++++++++++-
  net/core/dev.c            | 51 +++++++++++++++++++++++++++++----------
  2 files changed, 51 insertions(+), 14 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 98259f19c627..d576e5c91c43 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -351,6 +351,7 @@ struct napi_config {
        u64 gro_flush_timeout;
        u64 irq_suspend_timeout;
        u32 defer_hard_irqs;
+       cpumask_t affinity_mask;
        unsigned int napi_id;
  };
@@ -393,8 +394,8 @@ struct napi_struct {
        struct list_head        dev_list;
        struct hlist_node       napi_hash_node;
        int                     irq;
-#ifdef CONFIG_RFS_ACCEL
        struct irq_affinity_notify notify;
+#ifdef CONFIG_RFS_ACCEL
        int                     napi_rmap_idx;
  #endif
        int                     index;
@@ -1991,6 +1992,11 @@ enum netdev_reg_state {
   *
   *    @threaded:      napi threaded mode is enabled
   *
+ *     @irq_affinity_auto: driver wants the core to manage the IRQ affinity.
+ *                         Set by netif_enable_irq_affinity(), then driver must
+ *                         create persistent napi by netif_napi_add_config()
+ *                         and finally bind napi to IRQ (netif_napi_set_irq).
+ *
   *    @rx_cpu_rmap_auto: driver wants the core to manage the ARFS rmap.
   *                       Set by calling netif_enable_cpu_rmap().
   *
@@ -2401,6 +2407,7 @@ struct net_device {
        struct lock_class_key   *qdisc_tx_busylock;
        bool                    proto_down;
        bool                    threaded;
+       bool                    irq_affinity_auto;
        bool                    rx_cpu_rmap_auto;
/* priv_flags_slow, ungrouped to save space */
@@ -2653,6 +2660,11 @@ static inline void netdev_set_ml_priv(struct net_device *dev,
        dev->ml_priv_type = type;
  }
+static inline void netif_enable_irq_affinity(struct net_device *dev)
+{
+       dev->irq_affinity_auto = true;
+}

I'll have to look at the patches which use the above function, but
the first thing that came to mind when I saw this was does the above
need a WRITE_ONCE ?


Why?

The reads below seem to be protected by a lock; I haven't yet looked
at the other patches so maybe the write is also protected by
netdev->lock ?

All functions are protected by netdev_lock/unlock, except the notifier
callback (netif_napi_irq_notify), which is protected by the fact that
all napis are disabled (so the IRQ notifier is removed by
cancel_work_sync) before the napis are deleted.
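
To make that argument concrete, here is a small user-space sketch in C.
It is a model only, not kernel code: all model_* names are hypothetical,
a pthread mutex stands in for the netdev lock, and the lock_held field
exists purely so the model can assert its own rules. It does not settle
whether WRITE_ONCE is needed in the kernel; it only illustrates the two
claims: the flag is read and written with the lock held, and the
notifier is gone before the napi is deleted.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct model_dev {
	pthread_mutex_t lock;	/* stands in for the netdev lock */
	bool lock_held;		/* model-only bookkeeping for assertions */
	bool irq_affinity_auto;
	bool notifier_armed;	/* an IRQ affinity notifier is registered */
	bool napi_alive;
};

static void model_dev_init(struct model_dev *d)
{
	pthread_mutex_init(&d->lock, NULL);
	d->lock_held = false;
	d->irq_affinity_auto = false;
	d->notifier_armed = true;
	d->napi_alive = true;
}

static void model_lock(struct model_dev *d)
{
	pthread_mutex_lock(&d->lock);
	d->lock_held = true;
}

static void model_unlock(struct model_dev *d)
{
	d->lock_held = false;
	pthread_mutex_unlock(&d->lock);
}

/* Writer: a plain store, serialized against readers by the lock. */
static void model_enable_affinity(struct model_dev *d)
{
	assert(d->lock_held);
	d->irq_affinity_auto = true;
}

/* Reader: also runs with the lock held. */
static bool model_affinity_enabled(const struct model_dev *d)
{
	assert(d->lock_held);
	return d->irq_affinity_auto;
}

/* Notifier callback: runs without the lock, so it relies on ordering. */
static bool model_notify(const struct model_dev *d)
{
	if (!d->notifier_armed)
		return false;		/* already removed, nothing runs */
	assert(d->napi_alive);		/* must never fire on a dead napi */
	return true;
}

/* This model's reading of the reply: disabling removes the notifier. */
static void model_napi_disable(struct model_dev *d)
{
	assert(d->lock_held);
	d->notifier_armed = false;
}

/* Deletion is only legal once the notifier can no longer fire. */
static void model_napi_del(struct model_dev *d)
{
	assert(d->lock_held);
	assert(!d->notifier_armed);
	d->napi_alive = false;
}
```

Calling model_napi_del() before model_napi_disable() trips the assertion,
which is the ordering the reply depends on.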

Other nits are OK IMO and will be fixed in the next version.

Thanks.
