March 4, 2026 at 16:20, "Daniel Borkmann" <[email protected] mailto:[email protected]?to=%22Daniel%20Borkmann%22%20%3Cdaniel%40iogearbox.net%3E > wrote:
> > On 3/4/26 8:42 AM, Jiayuan Chen wrote: > > > > > From: Jiayuan Chen <[email protected]> > > bond_rr_gen_slave_id() dereferences bond->rr_tx_counter without a NULL > > check. rr_tx_counter is a per-CPU counter only allocated in bond_open() > > when the bond mode is round-robin. If the bond device was never brought > > up, rr_tx_counter remains NULL, causing a null-ptr-deref. > > The XDP redirect path can reach this code even when the bond is not up: > > bpf_master_redirect_enabled_key is a global static key, so when any bond > > device has native XDP attached, the XDP_TX -> xdp_master_redirect() > > interception is enabled for all bond slaves system-wide. This allows the > > path xdp_master_redirect() -> bond_xdp_get_xmit_slave() -> > > bond_xdp_xmit_roundrobin_slave_get() -> bond_rr_gen_slave_id() to be > > reached on a bond that was never opened. > > Fix this by allocating rr_tx_counter unconditionally in bond_init() > > (ndo_init), which is called by register_netdevice() and covers both > > device creation paths (bond_create() and bond_newlink()). This also > > handles the case where bond mode is changed to round-robin after device > > creation. The conditional allocation in bond_open() is removed. Since > > bond_destructor() already unconditionally calls > > free_percpu(bond->rr_tx_counter), the lifecycle is clean: allocate at > > ndo_init, free at destructor. > > Note: rr_tx_counter is only used by round-robin mode, so this > > deliberately allocates a per-cpu u32 that goes unused for other modes. > > Conditional allocation (e.g., in bond_option_mode_set) was considered > > but rejected: the XDP path can race with mode changes on a downed bond, > > and adding memory barriers to the XDP hot path is not justified for > > saving 4 bytes per CPU. > > > Arguably it's a corner case, but could we not just do sth like this to > actually check if the device is up and if not drop? > > diff --git a/net/core/filter.c b/net/core/filter.c > index ba019ded773d..c447fd989a27 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -4387,6 +4387,9 @@ u32 xdp_master_redirect(struct xdp_buff *xdp) > struct net_device *master, *slave; > master = netdev_master_upper_dev_get_rcu(xdp->rxq->dev); > + if (unlikely(!(master->flags & IFF_UP))) > + return XDP_ABORTED; > + > slave = master->netdev_ops->ndo_xdp_get_xmit_slave(master, xdp); > if (slave && slave != xdp->rxq->dev) { > /* The target device is different from the receiving device, so > Hi Daniel, It was discussed at [1]. The primary concern at the time was that assuming bond->rr_tx_counter had been allocated based on the interface being up was too implicit. [1] https://lore.kernel.org/netdev/[email protected]/T/#t Feel free to share any other thoughts or ideas Thanks,

