On Thu, Nov 27, 2025 at 03:25:32PM +0100, Stefano Garzarella wrote:
> On Wed, Nov 26, 2025 at 11:47:31PM -0800, Bobby Eshleman wrote:
> > From: Bobby Eshleman <[email protected]>
> > 
> > Add netns logic to vsock core. Additionally, modify transport hook
> > prototypes to be used by later transport-specific patches (e.g.,
> > *_seqpacket_allow()).
> > 
> > Namespaces are supported primarily by changing socket lookup functions
> > (e.g., vsock_find_connected_socket()) to take into account the socket
> > namespace and the namespace mode before considering a candidate socket a
> > "match".
> > 
> > This patch also introduces the sysctl /proc/sys/net/vsock/ns_mode that
> > accepts the "global" or "local" mode strings.
> > 
> > Add netns functionality (initialization, passing to transports, procfs,
> > etc...) to the af_vsock socket layer. Later patches that add netns
> > support to transports depend on this patch.
> > 
> > dgram_allow(), stream_allow(), and seqpacket_allow() callbacks are
> > modified to take a vsk in order to perform logic on namespace modes. In
> > future patches, the net and net_mode will also be used for socket
> > lookups in these functions.
> > 
> > Signed-off-by: Bobby Eshleman <[email protected]>
> > ---
> > Changes in v12:
> > - return true in dgram_allow(), stream_allow(), and seqpacket_allow()
> >  only if net_mode == VSOCK_NET_MODE_GLOBAL (Stefano)
> > - document bind(VMADDR_CID_ANY) case in af_vsock.c (Stefano)
> > - change order of stream_allow() call in vmci so we can pass vsk
> >  to it
> > 
> > Changes in v10:
> > - add file-level comment about what happens to sockets/devices
> >  when the namespace mode changes (Stefano)
> > - change the 'if (write)' boolean in vsock_net_mode_string() to
> >  if (!write), this simplifies a later patch which adds "goto"
> >  for mutex unlocking on function exit.
> > 
> > Changes in v9:
> > - remove virtio_vsock_alloc_rx_skb() (Stefano)
> > - remove vsock_global_dummy_net, not needed as net=NULL +
> >  net_mode=VSOCK_NET_MODE_GLOBAL achieves identical result
> > 
> > Changes in v7:
> > - hv_sock: fix hyperv build error
> > - explain why vhost does not use the dummy
> > - explain usage of __vsock_global_dummy_net
> > - explain why VSOCK_NET_MODE_STR_MAX is 8 characters
> > - use switch-case in vsock_net_mode_string()
> > - avoid changing transports as much as possible
> > - add vsock_find_{bound,connected}_socket_net()
> > - rename `vsock_hdr` to `sysctl_hdr`
> > - add virtio_vsock_alloc_linear_skb() wrapper for setting dummy net and
> >  global mode for virtio-vsock, move skb->cb zero-ing into wrapper
> > - explain seqpacket_allow() change
> > - move net setting to __vsock_create() instead of vsock_create() so
> >  that child sockets also have their net assigned upon accept()
> > 
> > Changes in v6:
> > - unregister sysctl ops in vsock_exit()
> > - af_vsock: clarify description of CID behavior
> > - af_vsock: fix buf vs buffer naming, and length checking
> > - af_vsock: fix length checking w/ correct ctl_table->maxlen
> > 
> > Changes in v5:
> > - vsock_global_net() -> vsock_global_dummy_net()
> > - update comments for new uAPI
> > - use /proc/sys/net/vsock/ns_mode instead of /proc/net/vsock_ns_mode
> > - add prototype changes so patch remains compilable
> > ---
> > drivers/vhost/vsock.c                   |   9 +-
> > include/linux/virtio_vsock.h            |   4 +-
> > include/net/af_vsock.h                  |  13 +-
> > net/vmw_vsock/af_vsock.c                | 272 +++++++++++++++++++++++++++++---
> > net/vmw_vsock/hyperv_transport.c        |   7 +-
> > net/vmw_vsock/virtio_transport.c        |   9 +-
> > net/vmw_vsock/virtio_transport_common.c |   6 +-
> > net/vmw_vsock/vmci_transport.c          |  26 ++-
> > net/vmw_vsock/vsock_loopback.c          |   8 +-
> > 9 files changed, 310 insertions(+), 44 deletions(-)
> > 
> > diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> > index ae01457ea2cd..83937e1d63fa 100644
> > --- a/drivers/vhost/vsock.c
> > +++ b/drivers/vhost/vsock.c
> > @@ -404,7 +404,8 @@ static bool vhost_transport_msgzerocopy_allow(void)
> >     return true;
> > }
> > 
> > -static bool vhost_transport_seqpacket_allow(u32 remote_cid);
> > +static bool vhost_transport_seqpacket_allow(struct vsock_sock *vsk,
> > +                                       u32 remote_cid);
> > 
> > static struct virtio_transport vhost_transport = {
> >     .transport = {
> > @@ -460,11 +461,15 @@ static struct virtio_transport vhost_transport = {
> >     .send_pkt = vhost_transport_send_pkt,
> > };
> > 
> > -static bool vhost_transport_seqpacket_allow(u32 remote_cid)
> > +static bool vhost_transport_seqpacket_allow(struct vsock_sock *vsk,
> > +                                       u32 remote_cid)
> > {
> >     struct vhost_vsock *vsock;
> >     bool seqpacket_allow = false;
> > 
> > +   if (vsk->net_mode != VSOCK_NET_MODE_GLOBAL)
> > +           return false;
> > +
> >     rcu_read_lock();
> >     vsock = vhost_vsock_get(remote_cid);
> > 
> > diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
> > index 0c67543a45c8..1845e8d4f78d 100644
> > --- a/include/linux/virtio_vsock.h
> > +++ b/include/linux/virtio_vsock.h
> > @@ -256,10 +256,10 @@ void virtio_transport_notify_buffer_size(struct vsock_sock *vsk, u64 *val);
> > 
> > u64 virtio_transport_stream_rcvhiwat(struct vsock_sock *vsk);
> > bool virtio_transport_stream_is_active(struct vsock_sock *vsk);
> > -bool virtio_transport_stream_allow(u32 cid, u32 port);
> > +bool virtio_transport_stream_allow(struct vsock_sock *vsk, u32 cid, u32 port);
> > int virtio_transport_dgram_bind(struct vsock_sock *vsk,
> >                             struct sockaddr_vm *addr);
> > -bool virtio_transport_dgram_allow(u32 cid, u32 port);
> > +bool virtio_transport_dgram_allow(struct vsock_sock *vsk, u32 cid, u32 port);
> > 
> > int virtio_transport_connect(struct vsock_sock *vsk);
> > 
> > diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
> > index 9b5bdd083b6f..d10e73cd7413 100644
> > --- a/include/net/af_vsock.h
> > +++ b/include/net/af_vsock.h
> > @@ -126,7 +126,7 @@ struct vsock_transport {
> >                          size_t len, int flags);
> >     int (*dgram_enqueue)(struct vsock_sock *, struct sockaddr_vm *,
> >                          struct msghdr *, size_t len);
> > -   bool (*dgram_allow)(u32 cid, u32 port);
> > +   bool (*dgram_allow)(struct vsock_sock *vsk, u32 cid, u32 port);
> > 
> >     /* STREAM. */
> >     /* TODO: stream_bind() */
> > @@ -138,14 +138,14 @@ struct vsock_transport {
> >     s64 (*stream_has_space)(struct vsock_sock *);
> >     u64 (*stream_rcvhiwat)(struct vsock_sock *);
> >     bool (*stream_is_active)(struct vsock_sock *);
> > -   bool (*stream_allow)(u32 cid, u32 port);
> > +   bool (*stream_allow)(struct vsock_sock *vsk, u32 cid, u32 port);
> > 
> >     /* SEQ_PACKET. */
> >     ssize_t (*seqpacket_dequeue)(struct vsock_sock *vsk, struct msghdr *msg,
> >                                  int flags);
> >     int (*seqpacket_enqueue)(struct vsock_sock *vsk, struct msghdr *msg,
> >                              size_t len);
> > -   bool (*seqpacket_allow)(u32 remote_cid);
> > +   bool (*seqpacket_allow)(struct vsock_sock *vsk, u32 remote_cid);
> >     u32 (*seqpacket_has_data)(struct vsock_sock *vsk);
> > 
> >     /* Notification. */
> > @@ -218,6 +218,13 @@ void vsock_remove_connected(struct vsock_sock *vsk);
> > struct sock *vsock_find_bound_socket(struct sockaddr_vm *addr);
> > struct sock *vsock_find_connected_socket(struct sockaddr_vm *src,
> >                                      struct sockaddr_vm *dst);
> > +struct sock *vsock_find_bound_socket_net(struct sockaddr_vm *addr,
> > +                                    struct net *net,
> > +                                    enum vsock_net_mode net_mode);
> > +struct sock *vsock_find_connected_socket_net(struct sockaddr_vm *src,
> > +                                        struct sockaddr_vm *dst,
> > +                                        struct net *net,
> > +                                        enum vsock_net_mode net_mode);
> > void vsock_remove_sock(struct vsock_sock *vsk);
> > void vsock_for_each_connected_socket(struct vsock_transport *transport,
> >                                  void (*fn)(struct sock *sk));
> > diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> > index adcba1b7bf74..6113c22db8dc 100644
> > --- a/net/vmw_vsock/af_vsock.c
> > +++ b/net/vmw_vsock/af_vsock.c
> > @@ -83,6 +83,46 @@
> >  *   TCP_ESTABLISHED - connected
> >  *   TCP_CLOSING - disconnecting
> >  *   TCP_LISTEN - listening
> > + *
> > + * - Namespaces in vsock support two different modes configured
> > + *   through /proc/sys/net/vsock/ns_mode. The modes are "local" and "global".
> > + *   Each mode defines how the namespace interacts with CIDs.
> > + *   /proc/sys/net/vsock/ns_mode is write-once, so that it may be configured
> > + *   and locked down by a namespace manager. The default is "global". The mode
> > + *   is set per-namespace.
> > + *
> > + *   The modes affect the allocation and accessibility of CIDs as follows:
> > + *
> > + *   - global - access and allocation are all system-wide
> 
> nit: maybe we should mention that this mode exists primarily for backward
> compatibility, since it is how vsock worked before netns support.
> 
> (We can fix this later with a follow-up patch.)
> 
> > + *      - all CID allocation from global namespaces draw from the same
> > + *        system-wide pool.
> > + *      - if one global namespace has already allocated some CID, another
> > + *        global namespace will not be able to allocate the same CID.
> > + *      - global mode AF_VSOCK sockets can reach any VM or socket in any global
> > + *        namespace, they are not contained to only their own namespace.
> > + *      - AF_VSOCK sockets in a global mode namespace cannot reach VMs or
> > + *        sockets in any local mode namespace.
> > + *   - local - access and allocation are contained within the namespace
> > + *     - CID allocation draws only from a private pool local only to the
> > + *       namespace, and does not affect the CIDs available for allocation in any
> > + *       other namespace (global or local).
> > + *     - VMs in a local namespace do not collide with CIDs in any other local
> > + *       namespace or any global namespace. For example, if a VM in a local mode
> > + *       namespace is given CID 10, then CID 10 is still available for
> > + *       allocation in any other namespace, but not in the same namespace.
> > + *     - AF_VSOCK sockets in a local mode namespace can connect only to VMs or
> > + *       other sockets within their own namespace.
> > + *     - sockets bound to VMADDR_CID_ANY in local namespaces will never resolve
> > + *       to any transport that is not compatible with local mode. There is no
> > + *       error that propagates to the user (as there is for connection attempts)
> > + *       because it is possible for some packet to reach this socket from
> > + *       a different transport that *does* support local mode. For
> > + *       example, virtio-vsock may not support local mode, but the socket
> > + *       may still accept a connection from vhost-vsock which does.
> > + *
> > + *   - when a socket or device is initialized in a namespace with mode
> > + *     global, it will stay in global mode even if the namespace later
> > + *     changes to local.
> >  */
> > 
> > #include <linux/compat.h>
> > @@ -100,6 +140,7 @@
> > #include <linux/module.h>
> > #include <linux/mutex.h>
> > #include <linux/net.h>
> > +#include <linux/proc_fs.h>
> > #include <linux/poll.h>
> > #include <linux/random.h>
> > #include <linux/skbuff.h>
> > @@ -111,9 +152,18 @@
> > #include <linux/workqueue.h>
> > #include <net/sock.h>
> > #include <net/af_vsock.h>
> > +#include <net/netns/vsock.h>
> > #include <uapi/linux/vm_sockets.h>
> > #include <uapi/asm-generic/ioctls.h>
> > 
> > +#define VSOCK_NET_MODE_STR_GLOBAL "global"
> > +#define VSOCK_NET_MODE_STR_LOCAL "local"
> > +
> > +/* 6 chars for "global", 1 for null-terminator, and 1 more for '\n'.
> > + * The newline is added by proc_dostring() for read operations.
> > + */
> > +#define VSOCK_NET_MODE_STR_MAX 8
> > +
> > static int __vsock_bind(struct sock *sk, struct sockaddr_vm *addr);
> > static void vsock_sk_destruct(struct sock *sk);
> > static int vsock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb);
> > @@ -235,33 +285,47 @@ static void __vsock_remove_connected(struct vsock_sock *vsk)
> >     sock_put(&vsk->sk);
> > }
> > 
> > -static struct sock *__vsock_find_bound_socket(struct sockaddr_vm *addr)
> > +static struct sock *__vsock_find_bound_socket_net(struct sockaddr_vm *addr,
> > +                                             struct net *net,
> > +                                             enum vsock_net_mode net_mode)
> > {
> >     struct vsock_sock *vsk;
> > 
> >     list_for_each_entry(vsk, vsock_bound_sockets(addr), bound_table) {
> > -           if (vsock_addr_equals_addr(addr, &vsk->local_addr))
> > -                   return sk_vsock(vsk);
> > +           struct sock *sk = sk_vsock(vsk);
> > +
> > +           if (vsock_addr_equals_addr(addr, &vsk->local_addr) &&
> > +               vsock_net_check_mode(sock_net(sk), vsk->net_mode, net,
> > +                                    net_mode))
> > +                   return sk;
> > 
> >             if (addr->svm_port == vsk->local_addr.svm_port &&
> >                 (vsk->local_addr.svm_cid == VMADDR_CID_ANY ||
> > -                addr->svm_cid == VMADDR_CID_ANY))
> > -                   return sk_vsock(vsk);
> > +                addr->svm_cid == VMADDR_CID_ANY) &&
> > +                vsock_net_check_mode(sock_net(sk), vsk->net_mode, net,
> > +                                     net_mode))
> > +                   return sk;
> >     }
> > 
> >     return NULL;
> > }
> > 
> > -static struct sock *__vsock_find_connected_socket(struct sockaddr_vm *src,
> > -                                             struct sockaddr_vm *dst)
> > +static struct sock *
> > +__vsock_find_connected_socket_net(struct sockaddr_vm *src,
> > +                             struct sockaddr_vm *dst, struct net *net,
> > +                             enum vsock_net_mode net_mode)
> > {
> >     struct vsock_sock *vsk;
> > 
> >     list_for_each_entry(vsk, vsock_connected_sockets(src, dst),
> >                         connected_table) {
> > +           struct sock *sk = sk_vsock(vsk);
> > +
> >             if (vsock_addr_equals_addr(src, &vsk->remote_addr) &&
> > -               dst->svm_port == vsk->local_addr.svm_port) {
> > -                   return sk_vsock(vsk);
> > +               dst->svm_port == vsk->local_addr.svm_port &&
> > +               vsock_net_check_mode(sock_net(sk), vsk->net_mode, net,
> > +                                    net_mode)) {
> > +                   return sk;
> >             }
> >     }
> > 
> > @@ -304,12 +368,14 @@ void vsock_remove_connected(struct vsock_sock *vsk)
> > }
> > EXPORT_SYMBOL_GPL(vsock_remove_connected);
> > 
> > -struct sock *vsock_find_bound_socket(struct sockaddr_vm *addr)
> > +struct sock *vsock_find_bound_socket_net(struct sockaddr_vm *addr,
> > +                                    struct net *net,
> > +                                    enum vsock_net_mode net_mode)
> > {
> >     struct sock *sk;
> > 
> >     spin_lock_bh(&vsock_table_lock);
> > -   sk = __vsock_find_bound_socket(addr);
> > +   sk = __vsock_find_bound_socket_net(addr, net, net_mode);
> >     if (sk)
> >             sock_hold(sk);
> > 
> > @@ -317,15 +383,23 @@ struct sock *vsock_find_bound_socket(struct sockaddr_vm *addr)
> > 
> >     return sk;
> > }
> > +EXPORT_SYMBOL_GPL(vsock_find_bound_socket_net);
> > +
> > +struct sock *vsock_find_bound_socket(struct sockaddr_vm *addr)
> > +{
> > +   return vsock_find_bound_socket_net(addr, NULL, VSOCK_NET_MODE_GLOBAL);
> 
> The patch LGTM. My last doubt is whether here (and in
> vsock_find_connected_socket()) we should use `init_net`.
> 
> In practice, this is the namespace (NULL) and mode (GLOBAL) used by
> transports that do not support namespaces.
> 
> So here we are making them belong to no namespace, so they can only reach
> global ones. When any namespace, including `init_net`, switches to local, it
> can no longer be reached by transports that do not support local namespaces,
> because in practice we still do not have a way to associate a device (in the
> case of drivers) with a specific namespace.  Right?

Right.
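
To spell out the reachability rule being confirmed here, a minimal user-space
sketch follows. It is not the kernel code: `struct model_net`,
`model_check_mode()` and `model_legacy_can_reach()` are hypothetical names, and
the matching rule is an assumption derived from the file-level comment in this
patch (same namespace, or both sides in global mode).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net; only pointer identity matters. */
struct model_net {
	int id;
};

/* Hypothetical mirror of the two modes described in the patch. */
enum model_net_mode {
	MODEL_NET_MODE_GLOBAL,
	MODEL_NET_MODE_LOCAL,
};

/*
 * Assumed rule (from the file-level comment, not from the kernel source):
 * two endpoints match if they share a namespace, or if both are global.
 */
static bool model_check_mode(const struct model_net *n1,
			     enum model_net_mode m1,
			     const struct model_net *n2,
			     enum model_net_mode m2)
{
	return n1 == n2 ||
	       (m1 == MODEL_NET_MODE_GLOBAL && m2 == MODEL_NET_MODE_GLOBAL);
}

/*
 * A transport without namespace support looks sockets up with no namespace
 * (NULL) and global mode, so only global-mode sockets can match.
 */
static bool model_legacy_can_reach(const struct model_net *sk_net,
				   enum model_net_mode sk_mode)
{
	return model_check_mode(sk_net, sk_mode, NULL, MODEL_NET_MODE_GLOBAL);
}
```

Under that assumption, a socket in any namespace that has switched to local
mode, `init_net` included, is invisible to a lookup done with NULL and GLOBAL,
which is the behavior described above.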

> 
> If I got it right, this makes sense, but I'd like an ack from the net
> maintainers to be sure we are doing the right thing.
> 
> Also, I think we should have a comment on top of this function to make it
> clear that it should be used only by transports that don't support
> namespaces, and to explain why we use NULL and GLOBAL. Plus a comment at the
> top of this file (near where we describe local vs global) to clarify the
> status of this.
> 
> That said, if net-next closes next week, I think we can send a follow-up
> patch just for those comments, so:

Sounds good, I'll wait for further feedback before sending anything!

> 
> Reviewed-by: Stefano Garzarella <[email protected]>
> 

> > +}
> > EXPORT_SYMBOL_GPL(vsock_find_bound_socket);
> > 
> > -struct sock *vsock_find_connected_socket(struct sockaddr_vm *src,
> > -                                    struct sockaddr_vm *dst)
> > +struct sock *vsock_find_connected_socket_net(struct sockaddr_vm *src,
> > +                                        struct sockaddr_vm *dst,
> > +                                        struct net *net,
> > +                                        enum vsock_net_mode net_mode)
> > {
> >     struct sock *sk;
> > 
> >     spin_lock_bh(&vsock_table_lock);
> > -   sk = __vsock_find_connected_socket(src, dst);
> > +   sk = __vsock_find_connected_socket_net(src, dst, net, net_mode);
> >     if (sk)
> >             sock_hold(sk);
> > 
> > @@ -333,6 +407,14 @@ struct sock *vsock_find_connected_socket(struct sockaddr_vm *src,
> > 
> >     return sk;
> > }
> > +EXPORT_SYMBOL_GPL(vsock_find_connected_socket_net);
> > +
> > +struct sock *vsock_find_connected_socket(struct sockaddr_vm *src,
> > +                                    struct sockaddr_vm *dst)
> > +{
> > +   return vsock_find_connected_socket_net(src, dst,
> > +                                          NULL, VSOCK_NET_MODE_GLOBAL);
> > +}
> > EXPORT_SYMBOL_GPL(vsock_find_connected_socket);
> 
