March 4, 2026 at 16:20, "Daniel Borkmann" <[email protected] 
mailto:[email protected]?to=%22Daniel%20Borkmann%22%20%3Cdaniel%40iogearbox.net%3E
 > wrote:


> 
> On 3/4/26 8:42 AM, Jiayuan Chen wrote:
> 
> > 
> > From: Jiayuan Chen <[email protected]>
> >  bond_rr_gen_slave_id() dereferences bond->rr_tx_counter without a NULL
> >  check. rr_tx_counter is a per-CPU counter only allocated in bond_open()
> >  when the bond mode is round-robin. If the bond device was never brought
> >  up, rr_tx_counter remains NULL, causing a null-ptr-deref.
> >  The XDP redirect path can reach this code even when the bond is not up:
> >  bpf_master_redirect_enabled_key is a global static key, so when any bond
> >  device has native XDP attached, the XDP_TX -> xdp_master_redirect()
> >  interception is enabled for all bond slaves system-wide. This allows the
> >  path xdp_master_redirect() -> bond_xdp_get_xmit_slave() ->
> >  bond_xdp_xmit_roundrobin_slave_get() -> bond_rr_gen_slave_id() to be
> >  reached on a bond that was never opened.
> >  Fix this by allocating rr_tx_counter unconditionally in bond_init()
> >  (ndo_init), which is called by register_netdevice() and covers both
> >  device creation paths (bond_create() and bond_newlink()). This also
> >  handles the case where bond mode is changed to round-robin after device
> >  creation. The conditional allocation in bond_open() is removed. Since
> >  bond_destructor() already unconditionally calls
> >  free_percpu(bond->rr_tx_counter), the lifecycle is clean: allocate at
> >  ndo_init, free at destructor.
> >  Note: rr_tx_counter is only used by round-robin mode, so this
> >  deliberately allocates a per-cpu u32 that goes unused for other modes.
> >  Conditional allocation (e.g., in bond_option_mode_set) was considered
> >  but rejected: the XDP path can race with mode changes on a downed bond,
> >  and adding memory barriers to the XDP hot path is not justified for
> >  saving 4 bytes per CPU.
> > 
> Arguably it's a corner case, but could we not just do sth like this to
> actually check if the device is up and if not drop?
> 
> diff --git a/net/core/filter.c b/net/core/filter.c
> index ba019ded773d..c447fd989a27 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -4387,6 +4387,9 @@ u32 xdp_master_redirect(struct xdp_buff *xdp)
>  struct net_device *master, *slave;
>  master = netdev_master_upper_dev_get_rcu(xdp->rxq->dev);
> + if (unlikely(!(master->flags & IFF_UP)))
> + return XDP_ABORTED;
> +
>  slave = master->netdev_ops->ndo_xdp_get_xmit_slave(master, xdp);
>  if (slave && slave != xdp->rxq->dev) {
>  /* The target device is different from the receiving device, so
>

Hi Daniel,

It was discussed at [1].

The primary concern at the time was that assuming
bond->rr_tx_counter had been allocated based on the
interface being up was too implicit.

[1] 
https://lore.kernel.org/netdev/[email protected]/T/#t

Feel free to share any other thoughts or ideas

Thanks,

Reply via email to