On Fri, Nov 05, 2021 at 08:31:07AM +0100, Martin Pieuchot wrote:
> On 26/10/21(Tue) 14:12, Vitaliy Makkoveev wrote:
> > Another step to make UNIX sockets locking fine grained.
> > 
> > The listening socket has the references from file descriptors layer and
> > from the vnode(9) layer. This means when we close(2)'ing such socket it
> > still referenced by concurrent thread through connect(2) path.
> > 
> > When we bind(2) UNIX socket we link it to vnode(9) by assigning
> > `v_socket'. When we connect(2)'ing socket to the socket we previously
> > bind(2)'ed we finding it by namei(9) and obtain it's reference through
> > `v_socket'. This socket has no extra reference in file descriptor
> > layer and could be closed by concurrent thread.
> > 
> > This time we have `unp_lock' rwlock(9) which protects the whole layer
> > and the dereference of `v_socket' is safe. But with the fine grained
> > locking the `v_socket' will not be protected by global lock. When we
> > obtain the vnode(9) by namei(9) in connect(9) or bind(9) paths it is
> > already exclusively locked by vlode(9) lock. But in unp_detach() which
> > is called on the close(2)'ing socket we assume `unp_lock' protects
> > `v_socket'.
> > 
> > I propose to use exclusive vnode(9) lock to protect `v_socket'. With the
> > fine grained locking, the `v_socket' dereference in unp_bind() or
> > unp_connect() threads will be safe because unp_detach() thread will wait
> > the vnode(9) lock release. The vnode referenced by `unp_vnod' has
> > reference counter bumped so it's dereference is also safe without
> > `unp_lock' held.
> 
> This makes sense to me.  Using the vnode lock here seems the simplest
> approach.
> 
> > The `i_lock' should be take before `unp_lock' and unp_detach() should
> > release solock(). To prevent connections on this socket the
> > 'SO_ACCEPTCONN' bit cleared in soclose().
> 
> This is done to prevent races when solock() is released inside soabort(),
> right?  Is it the only one or some more care is needed?
> Will this stay with per-socket locks or is this only necessary because of
> the global `unp_lock'?
> 

The only "binded" sockets has the associated vnode(9). We link them
when we perform unp_bind(). I used the "binded" instead of "listening"
because datagram sockets could also be binded but they are not
connection oriented.

soabort() kills the not accept(2)ed peers of connection oriented sockets
when we close "listening" socket. I used "listening" because the socket
was bind(2)ed and then marked as "assepting connections" by listen(2).

Since the unaccepted peers we kills with soabort() has no associated
vnode(9) we don't release `unp_lock' because we don't need to call
vrele(9).

This diff solves lock dances in unp_connect(). Our thread own `so' we
passed to unp_connect(). It's file descriptor has refcouter bumped and
the `so' can't be killed by concurrent thread. But we get `so2' the
"binded" socket from vnode(9).

This time the global `unp_lock' protects both sockets, but with the
per-socket locking the only `so' will be protected. So the `so2'
dereference will be unsafe because concurrent thread could kill it.

Also we could release `unp_lock' within unp_attach(). So unp_attach()
called by sonewconn() within unp_connect() could release `unp_lock' too.
The upcoming diff which moves unp_gc() stuff from `unp_lock' will
release `unp_lock' in unp_detach() to enforce lock order `unp_gc_lock'
-> `unp_lock'.

We keep vnode(9) locked when we link the socket with vnode(9) in
unp_bind(). We also keep vnode(9) locked when we connect sockets in
unp_connect(). So unp_detach() should also lock the vnode(9) because
vnode(9) lock protects `v_socket'.

This will be actual with the upcoming fine grained locks implementation.
This time this diff mostly makes `v_socket' protection consistent.

Could I commit this diff?

> > Index: sys/kern/uipc_socket.c
> > ===================================================================
> > RCS file: /cvs/src/sys/kern/uipc_socket.c,v
> > retrieving revision 1.265
> > diff -u -p -r1.265 uipc_socket.c
> > --- sys/kern/uipc_socket.c  14 Oct 2021 23:05:10 -0000      1.265
> > +++ sys/kern/uipc_socket.c  26 Oct 2021 11:05:59 -0000
> > @@ -315,6 +315,8 @@ soclose(struct socket *so, int flags)
> >     /* Revoke async IO early. There is a final revocation in sofree(). */
> >     sigio_free(&so->so_sigio);
> >     if (so->so_options & SO_ACCEPTCONN) {
> > +           so->so_options &= ~SO_ACCEPTCONN;
> > +
> >             while ((so2 = TAILQ_FIRST(&so->so_q0)) != NULL) {
> >                     (void) soqremque(so2, 0);
> >                     (void) soabort(so2);
> > Index: sys/kern/uipc_usrreq.c
> > ===================================================================
> > RCS file: /cvs/src/sys/kern/uipc_usrreq.c,v
> > retrieving revision 1.150
> > diff -u -p -r1.150 uipc_usrreq.c
> > --- sys/kern/uipc_usrreq.c  21 Oct 2021 22:11:07 -0000      1.150
> > +++ sys/kern/uipc_usrreq.c  26 Oct 2021 11:05:59 -0000
> > @@ -474,20 +474,30 @@ void
> >  unp_detach(struct unpcb *unp)
> >  {
> >     struct socket *so = unp->unp_socket;
> > -   struct vnode *vp = NULL;
> > +   struct vnode *vp = unp->unp_vnode;
> >  
> >     rw_assert_wrlock(&unp_lock);
> >  
> >     LIST_REMOVE(unp, unp_link);
> > -   if (unp->unp_vnode) {
> > +
> > +   if (vp) {
> > +           unp->unp_vnode = NULL;
> > +
> >             /*
> > -            * `v_socket' is only read in unp_connect and
> > -            * unplock prevents concurrent access.
> > +            * Enforce `i_lock' -> `unp_lock' because fifo
> > +            * subsystem requires it.
> >              */
> >  
> > -           unp->unp_vnode->v_socket = NULL;
> > -           vp = unp->unp_vnode;
> > -           unp->unp_vnode = NULL;
> > +           sounlock(so, SL_LOCKED);
> > +
> > +           VOP_LOCK(vp, LK_EXCLUSIVE);
> > +           vp->v_socket = NULL;
> > +
> > +           KERNEL_LOCK();
> > +           vput(vp);
> > +           KERNEL_UNLOCK();
> > +
> > +           solock(so);
> >     }
> >  
> >     if (unp->unp_conn)
> > @@ -500,21 +510,6 @@ unp_detach(struct unpcb *unp)
> >     pool_put(&unpcb_pool, unp);
> >     if (unp_rights)
> >             task_add(systqmp, &unp_gc_task);
> > -
> > -   if (vp != NULL) {
> > -           /*
> > -            * Enforce `i_lock' -> `unplock' because fifo subsystem
> > -            * requires it. The socket can't be closed concurrently
> > -            * because the file descriptor reference is
> > -            * still hold.
> > -            */
> > -
> > -           sounlock(so, SL_LOCKED);
> > -           KERNEL_LOCK();
> > -           vrele(vp);
> > -           KERNEL_UNLOCK();
> > -           solock(so);
> > -   }
> >  }
> >  
> >  int
> > 
> 

Reply via email to