On Wed, Mar 04, 2026 at 03:42:57PM +0800, Jiayuan Chen wrote: > From: Jiayuan Chen <[email protected]> > > bond_rr_gen_slave_id() dereferences bond->rr_tx_counter without a NULL > check. rr_tx_counter is a per-CPU counter only allocated in bond_open() > when the bond mode is round-robin. If the bond device was never brought > up, rr_tx_counter remains NULL, causing a null-ptr-deref. > > The XDP redirect path can reach this code even when the bond is not up: > bpf_master_redirect_enabled_key is a global static key, so when any bond > device has native XDP attached, the XDP_TX -> xdp_master_redirect() > interception is enabled for all bond slaves system-wide. This allows the > path xdp_master_redirect() -> bond_xdp_get_xmit_slave() -> > bond_xdp_xmit_roundrobin_slave_get() -> bond_rr_gen_slave_id() to be > reached on a bond that was never opened. > > Fix this by allocating rr_tx_counter unconditionally in bond_init() > (ndo_init), which is called by register_netdevice() and covers both > device creation paths (bond_create() and bond_newlink()). This also > handles the case where bond mode is changed to round-robin after device > creation. The conditional allocation in bond_open() is removed. Since > bond_destructor() already unconditionally calls > free_percpu(bond->rr_tx_counter), the lifecycle is clean: allocate at > ndo_init, free at destructor. > > Note: rr_tx_counter is only used by round-robin mode, so this > deliberately allocates a per-cpu u32 that goes unused for other modes. > Conditional allocation (e.g., in bond_option_mode_set) was considered > but rejected: the XDP path can race with mode changes on a downed bond, > and adding memory barriers to the XDP hot path is not justified for > saving 4 bytes per CPU. > > Fixes: 879af96ffd72 ("net, core: Add support for XDP redirection to slave > device") > Reported-by: [email protected] > Closes: > https://lore.kernel.org/all/[email protected]/T/ > Signed-off-by: Jiayuan Chen <[email protected]> > --- > drivers/net/bonding/bond_main.c | 19 +++++++++++++------ > 1 file changed, 13 insertions(+), 6 deletions(-) >
IMO it's not worth it to waste memory in all modes, for an unpopular mode. I think it'd be better to add a null check in bond_rr_gen_slave_id(), READ/WRITE_ONCE() should be enough since it is allocated only once, and freed when the xmit code cannot be reachable anymore (otherwise we'd have more bugs now). The branch will be successfully predicted practically always, and you can also mark the ptr being null as unlikely. That way only RR takes a very minimal hit, if any. Cheers, Nik

