On Wed, Feb 15, 2023 at 08:08:42PM +0100, Alexander Bluhm wrote:
> On Mon, Jan 30, 2023 at 05:03:30PM +0300, Vitaliy Makkoveev wrote:
> > It makes sense to push solock() down to sosetopt() too. For a half cases
> > (*pr_ctloutput)() is just null op, so there is no reason to call it.
> > Also, a lot of things could be done without solock() held.
> 
> You do a bunch of things together
> - push solock()
> - move code around
> - return error instead of error =
> - set sb to &so->so_snd or &so->so_rcv
> 
> Some of the refactorings make sense, others change the code just
> to be different in my eyes.
> 
> Can we discuss theses things in separate diffs?
> 

Sorry for big non obvious diff. I need this refactoring to keep required
lock order between solock() and the standalone sblock().

This time sblock() requires solock() to be held, moreover, the reason
for sblock() is only to exclude concurrent send()/receive()/sorflush()
while solock() released within sosend() and soreceive(). It doesn't
protect socket's buffer data, so we don't acquire it while we do socket
buffer modification within sosetopt().

But this will be changed for standalone sblock(). Also, the lock order
between sblock() and solock() will be changed too. Please note, the
`so_snd' send buffer and `so_rcv' receive buffer are different objects,
as the sblock(so_snd) and sblock(so_rcv) are different locks.

In the send path, the socket buffer lock sblock(so_snd) will always
be taken before solock(), but for receive path socket buffer lock
sblock(so_rcv) will be taken after solock(), because we do sbappend*()
within PCB layer.

So, sosetopt() needs some refactoring. We have socket buffer related
requests SO_SNDTIMEO, SO_RCVTIMEO, SO_SNDBUF, SO_RCVBUF, SO_SNDLOWAT,
SO_RCVLOWAT. Such requests modifies socket buffer data only, so only
sblock() should be used. Moreover, for `so_snd' related requests
solock() should not be acquired while sblock(so_snd) acquisition. So I
need to push solock() down to sosetopt(). But please note, the `so_snd'
and `so_rcv' are different buffers, so they require different locks, so
hypothetical sosetopt() socket buffer chunks could be like:

                case SO_SNDBUF:
                case SO_RCVBUF:
                case SO_SNDLOWAT:
                case SO_RCVLOWAT:
                    {
                        /* ... */
                        switch (optname) {
                        case SO_SNDBUF:
                                sblock(&so->so_snd);
                                /* ... */
                                sbunlock(&so->so_snd);
                                break;

                        case SO_RCVBUF:
                                sblock(&so->so_rcv);
                                /* ... */
                                sbunlock(&so->so_rcv);
                                break;

                        case SO_SNDLOWAT:
                                sblock(&so->so_snd);
                                /* ... */
                                sbunlock(&so->so_snd);
                                break;
                        case SO_RCVLOWAT:
                                sblock(&so->so_rcv);
                                /* ... */
                                sbunlock(&so->so_rcv);
                                break;
                        }
                        break;
                    }
                case SO_SNDTIMEO:
                case SO_RCVTIMEO:
                    {
                        /* ... */
                        switch (optname) {
                        case SO_SNDTIMEO:
                                sblock(&so->so_snd);
                                /* ... */
                                sbunlock(&so->so_snd);
                                break;
                        case SO_RCVTIMEO:
                                sblock(&so->so_rcv);
                                /* ... */
                                sbunlock(&so->so_rcv);
                                break;
                        }
                        break;
                    }

But the SO_SNDBUF and SO_RCVBUF, SO_SNDLOWAT and SO_RCVLOWAT,
SO_SNDTIMEO and SO_RCVTIMEO are the same, the only difference is the
socket buffer, so I simplified the code like I already did for
SO_SNDTIMEO and SO_RCVTIMEO requests in sogetopt().


                case SO_SNDBUF:
                case SO_RCVBUF:
                case SO_SNDLOWAT:
                case SO_RCVLOWAT:
                    {
                        struct sockbuf *sb = (optname == SO_SNDTIMEO ?
                            &so->so_snd : &so->so_rcv);
                        /* ... */
                        sblock(sb);
                        switch (optname) {
                        case SO_SNDBUF:
                        case SO_RCVBUF:
                                /* ... */
                                break;
                        case SO_SNDLOWAT:
                        case SO_RCVLOWAT:
                                /* ... */
                                break;
                        }
                        sbunlock(sb);
                        break;
                    }
                case SO_SNDTIMEO:
                case SO_RCVTIMEO:
                    {
                        struct sockbuf *sb = (optname == SO_SNDTIMEO ?
                            &so->so_snd : &so->so_rcv);
                        /* ... */
                        sblock(sb);
                        /* ... */
                        sbunlock(sb);
                        break;
                    }

Looks much better, right?

Also, for each successful command we always do (*pr_ctloutput)() call
which requires solock() to be taken. Hypothetically, we should not
release solock() between sosetopt() command and (*pr_ctloutput)() call
to keep consistency, so the socket buffer related cases breaks this
logic. But in fact, the most requests within (*pr_ctloutput)() handlers
are null ops, (*pr_ctloutput)() returns ENOPROTOOPT error which will be
ignored. So I decided to call (*pr_ctloutput)() only for cases which
require it. This divided the sosetopt() cases to 3 groups. The first one
requires sblock() to be held within cases, the second requires solock()
to be held within cases and doesn't require to call (*pr_ctloutput)()
and the third one requires solock() to be held outside cases and
requires to call (*pr_ctloutput)(). Th avoid solock()/sounlock()/goto
mess, I did the cases reordering.

                switch (optname) {
                case SO_SNDBUF:
                case SO_RCVBUF:
                case SO_SNDLOWAT:
                case SO_RCVLOWAT:
                    {
                        sblock(sb);
                        /* ... */
                        sbunlock(sb);
                        return (error);
                    }
                case SO_SNDTIMEO:
                case SO_RCVTIMEO:
                        sblock(sb);
                        /* ... */
                        sbunlock(sb);
                        return (0);
                case SO_RTABLE:
                        solock(so);
                        error = /* ... */
                        sounlock(so);
                        return (error);
                case SO_SPLICE:
                        solock(so);
                        error = /* ... */
                        sounlock(so);
                        return (error);
                }

                solock(so);
                switch (optname) {
                case SO_BINDANY:
                /* ... */
                case SO_ZEROIZE:
                        /* ... */
                        break;
                default:
                        error = ENOPROTOOPT;
                        break;
                }
                if (error == 0 && so->so_proto->pr_ctloutput) {
                        (*so->so_proto->pr_ctloutput)(/* ... */);
                }
                sounlock(so);

Please note, for this explanation I used sblock(), but right now it
can't be used for socket buffer protection, so in the posted diff
solock() used for buffer related cases.

I have no objections to split this diff, but don't know how to do this.
I see any modifications as non obvious. Can I propose to avoid null op
(*pr_ctloutput)() calls at first? This also requires to do cases
reordering to keep code clean.

Reply via email to