The branch main has been updated by glebius:

URL: 
https://cgit.FreeBSD.org/src/commit/?id=69c05f42871406b4b2b2dac00a268d1da0cacd3e

commit 69c05f42871406b4b2b2dac00a268d1da0cacd3e
Author:     Gleb Smirnoff <gleb...@freebsd.org>
AuthorDate: 2025-02-22 02:11:17 +0000
Commit:     Gleb Smirnoff <gleb...@freebsd.org>
CommitDate: 2025-02-22 02:11:17 +0000

    udp: make sendto(2) on unconnected UDP socket use public inpcb KPIs
    
    UDP allows to sendto(2) on unconnected socket.  The original BSD devise
    was that such action would create a temporary (for the duration of the
    syscall) connection between our inpcb and remote addr:port specified in
    sockaddr 'to' of the syscall.  This devise was broken in 2002 in
    90162a4e87f0.  For more motivation on the removal of the temporary
    connection see email [1].
    
    Since the removal of the true temporary connection the sendto(2) on
    unconnected socket has the following side effects:
    
    1) After first sendto(2) the "unconnected" socket will receive datagrams
       destined to the selected port.
    2) All subsequent sendto(2) calls will use the same source port.
    
    Effectively, such sendto(2) acts like a bind(2) to INADDR_ANY:0.  Indeed,
    if you do this:
    
            s1 = socket(PF_INET, SOCK_DGRAM, 0);
            s2 = socket(PF_INET, SOCK_DGRAM, 0);
            sendto(s1, ..., &somedestination, ...);
            bind(s2, &{ .sin_addr = INADDR_ANY, sin_port = 0 });
    
    And then look into kgdb at resulting inpcbs, you would find them equal in
    all means modulo bound to different anonymous ports.
    
    What is even more interesting is that Linux kernel had picked up same
    behavior, including that "unconnected" socket will receive datagrams.  So
    it seems that such behavior is now an undocumented standard, thus I
    covered it in recently added tests/sys/netinet/udp_bindings.
    
    Now, with the above knowledge at hand, why are we using
    in_pcbconnect_setup() and in_pcbinshash(), which are supposed to be
    private to in_pcb.c, to achieve the binding?  Let's use public KPI
    in_pcbbind() on the first sendto(2) and use in_pcbladdr() on all
    sendto(2)s.  Apart from finally hiding these two should be private
    functions, we no longer acquire global INP_HASH_WLOCK() for every
    sendto(2) on unconnected socket as well as remove a couple workarounds.
    
    [1] https://mail-archive.FreeBSD.org/cgi/mid.cgi?200210141935.aa83883
    
    Reviewed by:            markj
    Differential Revision:  https://reviews.freebsd.org/D49043
---
 sys/netinet/udp_usrreq.c | 88 ++++++++++++++++--------------------------------
 1 file changed, 29 insertions(+), 59 deletions(-)

diff --git a/sys/netinet/udp_usrreq.c b/sys/netinet/udp_usrreq.c
index 6c855e7a756b..131242ce9859 100644
--- a/sys/netinet/udp_usrreq.c
+++ b/sys/netinet/udp_usrreq.c
@@ -1135,10 +1135,9 @@ udp_send(struct socket *so, int flags, struct mbuf *m, 
struct sockaddr *addr,
        sin = (struct sockaddr_in *)addr;
 
        /*
-        * udp_send() may need to temporarily bind or connect the current
-        * inpcb.  As such, we don't know up front whether we will need the
-        * pcbinfo lock or not.  Do any work to decide what is needed up
-        * front before acquiring any locks.
+        * udp_send() may need to bind the current inpcb.  As such, we don't
+        * know up front whether we will need the pcbinfo lock or not.  Do any
+        * work to decide what is needed up front before acquiring any locks.
         *
         * We will need network epoch in either case, to safely lookup into
         * pcb hash.
@@ -1292,66 +1291,37 @@ udp_send(struct socket *so, int flags, struct mbuf *m, 
struct sockaddr *addr,
                error = prison_remote_ip4(td->td_ucred, &sin->sin_addr);
                if (error)
                        goto release;
-
                /*
-                * If a local address or port hasn't yet been selected, or if
-                * the destination address needs to be rewritten due to using
-                * a special INADDR_ constant, invoke in_pcbconnect_setup()
-                * to do the heavy lifting.  Once a port is selected, we
-                * commit the binding back to the socket; we also commit the
-                * binding of the address if in jail.
-                *
-                * If we already have a valid binding and we're not
-                * requesting a destination address rewrite, use a fast path.
+                * sendto(2) on unconnected UDP socket results in implicit
+                * binding to INADDR_ANY and anonymous port.  This has two
+                * side effects:
+                * 1) after first sendto(2) the socket will receive datagrams
+                *    destined to the selected port.
+                * 2) subsequent sendto(2) calls will use the same source port.
                 */
-               if (inp->inp_laddr.s_addr == INADDR_ANY ||
-                   inp->inp_lport == 0 ||
-                   sin->sin_addr.s_addr == INADDR_ANY ||
-                   sin->sin_addr.s_addr == INADDR_BROADCAST) {
-                       if ((flags & PRUS_IPV6) != 0) {
-                               vflagsav = inp->inp_vflag;
-                               inp->inp_vflag |= INP_IPV4;
-                               inp->inp_vflag &= ~INP_IPV6;
-                       }
+               if (inp->inp_lport == 0) {
+                       struct sockaddr_in wild = {
+                               .sin_family = AF_INET,
+                               .sin_len = sizeof(struct sockaddr_in),
+                       };
+
                        INP_HASH_WLOCK(pcbinfo);
-                       error = in_pcbconnect_setup(inp, sin, &laddr.s_addr,
-                           &lport, &faddr.s_addr, &fport, td->td_ucred);
-                       if ((flags & PRUS_IPV6) != 0)
-                               inp->inp_vflag = vflagsav;
-                       if (error) {
-                               INP_HASH_WUNLOCK(pcbinfo);
+                       error = in_pcbbind(inp, &wild, V_udp_bind_all_fibs ?
+                           0 : INPBIND_FIB, td->td_ucred);
+                       INP_HASH_WUNLOCK(pcbinfo);
+                       if (error)
+                               goto release;
+                       lport = inp->inp_lport;
+                       laddr = inp->inp_laddr;
+               }
+               if (laddr.s_addr == INADDR_ANY) {
+                       error = in_pcbladdr(inp, &sin->sin_addr, &laddr,
+                           td->td_ucred);
+                       if (error)
                                goto release;
-                       }
-
-                       /*
-                        * XXXRW: Why not commit the port if the address is
-                        * !INADDR_ANY?
-                        */
-                       /* Commit the local port if newly assigned. */
-                       if (inp->inp_laddr.s_addr == INADDR_ANY &&
-                           inp->inp_lport == 0) {
-                               INP_WLOCK_ASSERT(inp);
-                               /*
-                                * Remember addr if jailed, to prevent
-                                * rebinding.
-                                */
-                               if (prison_flag(td->td_ucred, PR_IP4))
-                                       inp->inp_laddr = laddr;
-                               inp->inp_lport = lport;
-                               error = in_pcbinshash(inp);
-                               INP_HASH_WUNLOCK(pcbinfo);
-                               if (error != 0) {
-                                       inp->inp_lport = 0;
-                                       error = EAGAIN;
-                                       goto release;
-                               }
-                               inp->inp_flags |= INP_ANONPORT;
-                       } else
-                               INP_HASH_WUNLOCK(pcbinfo);
-               } else {
-                       faddr = sin->sin_addr;
-                       fport = sin->sin_port;
                }
+               faddr = sin->sin_addr;
+               fport = sin->sin_port;
        } else {
                INP_LOCK_ASSERT(inp);
                faddr = inp->inp_faddr;

Reply via email to