On 15/05/2026 at 22:19, Ilya Maximets wrote:
> For notifications with NETLINK_LISTEN_ALL_NSID the expected behavior
> is the following:
>
> - if NSID is not reported, then the event is local to the listener.
> - if NSID is reported, then the event is remote, i.e., originated in
> the provided namespace that is not the same as the listener's.
>
> Userspace applications like ovs-vswitchd expect this behavior. And
> ip monitor uses this logic for printing out [nsid current] vs [nsid N].
>
> However, when a self-referential NSID is allocated for a namespace,
> every local notification starts sending this ID to userspace as part
> of NETLINK_LISTEN_ALL_NSID CMSG metadata.
>
> This is problematic, because the listener cannot tell if those
> notifications are local or not anymore without making extra requests
> to figure out if the provided NSID is local or not. The listener
> can also not figure out the local NSID beforehand as it can be
> allocated at any point in time by other processes.
>
> The value is practically not useful, since it's the namespace's own
> ID that the application has to obtain from other sources in order to
> figure out if it's the same or not. So, for the application it's
> just extra busywork with no benefit. Moreover, applications
> that do not know about this quirk may be mishandling notifications
> with NSID set as notifications from remote namespaces while they
> are actually local. This is the case with ovs-vswitchd.
>
> Having a self-referential NSID mapping is not something that happens
> under normal circumstances, but it can be a case in specific
> environments. And it can be more common with certain container
> runtimes like LXC/LXD/Incus that unintentionally trigger allocation
> of the self-referential NSID via cross-namespace RTM_GETLINK requests.
It is easy to allocate a self-nsid:
$ ip netns attach current $$
$ ip netns set current auto
$ ip netns list-id
nsid 0 (iproute2 netns name: current)
An application should be prepared to handle this (it is easy for an app to get
the 'self-nsid' value).
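
As a rough sketch of how an application could look up its own nsid, the
following asks rtnetlink for the id the current netns has assigned to the
netns of the calling process, using RTM_GETNSID with NETNSA_PID. The names
nsid_req, build_getnsid_req() and query_self_nsid() are made up for
illustration, and error handling is kept minimal:

```c
#include <linux/net_namespace.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical request layout: header, rtgenmsg, room for one attribute. */
struct nsid_req {
	struct nlmsghdr nlh;
	struct rtgenmsg g;
	char attrs[16];
};

/* Fill an RTM_GETNSID request carrying a single NETNSA_PID attribute. */
void build_getnsid_req(struct nsid_req *req, pid_t pid)
{
	struct rtattr rta;
	uint32_t val = (uint32_t)pid;
	char *p;

	memset(req, 0, sizeof(*req));
	req->nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct rtgenmsg));
	req->nlh.nlmsg_type = RTM_GETNSID;
	req->nlh.nlmsg_flags = NLM_F_REQUEST;
	req->g.rtgen_family = AF_UNSPEC;

	/* Attributes start after the aligned family header. */
	p = (char *)req + NLMSG_ALIGN(req->nlh.nlmsg_len);
	rta.rta_type = NETNSA_PID;
	rta.rta_len = RTA_LENGTH(sizeof(val));
	memcpy(p, &rta, sizeof(rta));
	memcpy(p + RTA_LENGTH(0), &val, sizeof(val));
	req->nlh.nlmsg_len = NLMSG_ALIGN(req->nlh.nlmsg_len) +
			     RTA_ALIGN(rta.rta_len);
}

/*
 * Returns 0 and stores the self nsid in *nsid, which is
 * NETNSA_NSID_NOT_ASSIGNED (-1) when no self-nsid has been allocated.
 * Returns -1 on any failure.
 */
int query_self_nsid(int *nsid)
{
	union {
		struct nlmsghdr nlh;
		char raw[512];
	} rsp;
	struct nsid_req req;
	struct rtattr *rta;
	int32_t val;
	ssize_t len;
	int left, fd;

	fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
	if (fd < 0)
		return -1;

	build_getnsid_req(&req, getpid());
	len = send(fd, &req, req.nlh.nlmsg_len, 0);
	if (len >= 0)
		len = recv(fd, &rsp, sizeof(rsp), 0);
	close(fd);

	if (len < 0 || !NLMSG_OK(&rsp.nlh, (int)len) ||
	    rsp.nlh.nlmsg_type != RTM_NEWNSID)
		return -1;

	/* Walk the attributes that follow the rtgenmsg header. */
	left = (int)NLMSG_PAYLOAD(&rsp.nlh, sizeof(struct rtgenmsg));
	rta = (struct rtattr *)((char *)NLMSG_DATA(&rsp.nlh) +
				NLMSG_ALIGN(sizeof(struct rtgenmsg)));
	for (; RTA_OK(rta, left); rta = RTA_NEXT(rta, left)) {
		if (rta->rta_type == NETNSA_NSID &&
		    RTA_PAYLOAD(rta) >= sizeof(val)) {
			memcpy(&val, RTA_DATA(rta), sizeof(val));
			*nsid = val;
			return 0;
		}
	}
	return -1;
}
```

Since another process can allocate the self-nsid at any time, a cached
result may go stale; a listener would presumably also watch RTM_NEWNSID
events or redo the query when it sees an unknown nsid.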
>
> A search through open-source projects doesn't reveal any projects
> that use NETNSA_NSID_NOT_ASSIGNED and rely on metadata to contain
> self-referential NSIDs. Quite the opposite, ovs-vswitchd relies
> on the metadata to not be present to separate local and remote
> events. And the 'ip monitor' relies on the metadata to not be present
> to show '[nsid current]', though this is more like "print 'current'
> if there is nothing to print" situation, but still can be a little
> confusing for the user to see an ID for a local event.
We (6WIND) are using NETLINK_LISTEN_ALL_NSID. Like iproute2, 'current' is
assumed if there is no nsid, else the corresponding netns is checked (which may
match the current netns).
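
To illustrate the handling described above, here is a minimal sketch (not
our actual code) of the receive side: it reads the nsid from the
NETLINK_LISTEN_ALL_NSID control message of a received msghdr and treats
the event as local when there is no nsid or when the nsid matches the
listener's own. msg_get_nsid() and event_is_local() are hypothetical
names, and self_nsid is assumed to have been obtained by the caller (a
negative value meaning "not assigned"):

```c
#include <linux/netlink.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>

#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif

/* Returns true and stores the nsid if the kernel attached one to msg. */
bool msg_get_nsid(struct msghdr *msg, int *nsid)
{
	struct cmsghdr *cmsg;

	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level == SOL_NETLINK &&
		    cmsg->cmsg_type == NETLINK_LISTEN_ALL_NSID &&
		    cmsg->cmsg_len == CMSG_LEN(sizeof(int))) {
			memcpy(nsid, CMSG_DATA(cmsg), sizeof(int));
			return true;
		}
	}
	return false;
}

/*
 * No nsid: the event comes from the current netns.
 * Otherwise it is local only if the nsid is the listener's own one.
 */
bool event_is_local(struct msghdr *msg, int self_nsid)
{
	int nsid;

	if (!msg_get_nsid(msg, &nsid))
		return true;
	return self_nsid >= 0 && nsid == self_nsid;
}
```

The control message is only delivered on sockets that enabled the
NETLINK_LISTEN_ALL_NSID option with setsockopt() at level SOL_NETLINK,
and the msghdr passed to recvmsg() needs a control buffer of at least
CMSG_SPACE(sizeof(int)) bytes.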
>
> Fixes: 59324cf35aba ("netlink: allow to listen "all" netns")
> Reported-by: Matteo Perin <[email protected]>
> Signed-off-by: Ilya Maximets <[email protected]>
> ---
>  net/netlink/af_netlink.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index 2aeb0680807d6..607ab4e4ac697 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -1482,9 +1482,11 @@ static void do_one_broadcast(struct sock *sk,
>  		p->skb2 = NULL;
>  		goto out;
>  	}
> -	NETLINK_CB(p->skb2).nsid = peernet2id(sock_net(sk), p->net);
> -	if (NETLINK_CB(p->skb2).nsid != NETNSA_NSID_NOT_ASSIGNED)
> -		NETLINK_CB(p->skb2).nsid_is_set = true;
> +	if (!net_eq(sock_net(sk), p->net)) {
> +		NETLINK_CB(p->skb2).nsid = peernet2id(sock_net(sk), p->net);
> +		if (NETLINK_CB(p->skb2).nsid != NETNSA_NSID_NOT_ASSIGNED)
> +			NETLINK_CB(p->skb2).nsid_is_set = true;
> +	}
>  	val = netlink_broadcast_deliver(sk, p->skb2);
>  	if (val < 0) {
>  		netlink_overrun(sk);