The branch main has been updated by glebius: URL: https://cgit.FreeBSD.org/src/commit/?id=69c05f42871406b4b2b2dac00a268d1da0cacd3e
commit 69c05f42871406b4b2b2dac00a268d1da0cacd3e Author: Gleb Smirnoff <gleb...@freebsd.org> AuthorDate: 2025-02-22 02:11:17 +0000 Commit: Gleb Smirnoff <gleb...@freebsd.org> CommitDate: 2025-02-22 02:11:17 +0000 udp: make sendto(2) on unconnected UDP socket use public inpcb KPIs UDP allows to sendto(2) on unconnected socket. The original BSD devise was that such action would create a temporary (for the duration of the syscall) connection between our inpcb and remote addr:port specified in sockaddr 'to' of the syscall. This devise was broken in 2002 in 90162a4e87f0. For more motivation on the removal of the temporary connection see email [1]. Since the removal of the true temporary connection the sendto(2) on unconnected socket has the following side effects: 1) After first sendto(2) the "unconnected" socket will receive datagrams destined to the selected port. 2) All subsequent sendto(2) calls will use the same source port. Effectively, such sendto(2) acts like a bind(2) to INADDR_ANY:0. Indeed, if you do this: s1 = socket(PF_INET, SOCK_DGRAM, 0); s2 = socket(PF_INET, SOCK_DGRAM, 0); sendto(s1, ..., &somedestination, ...); bind(s2, &{ .sin_addr = INADDR_ANY, sin_port = 0 }); And then look into kgdb at resulting inpcbs, you would find them equal in all means modulo bound to different anonymous ports. What is even more interesting is that Linux kernel had picked up same behavior, including that "unconnected" socket will receive datagrams. So it seems that such behavior is now an undocumented standard, thus I covered it in recently added tests/sys/netinet/udp_bindings. Now, with the above knowledge at hand, why are we using in_pcbconnect_setup() and in_pcbinshash(), which are supposed to be private to in_pcb.c, to achieve the binding? Let's use public KPI in_pcbbind() on the first sendto(2) and use in_pcbladdr() on all sendto(2)s. Apart from finally hiding these two should be private functions, we no longer acquire global INP_HASH_WLOCK() for every sendto(2) on unconnected socket as well as remove a couple workarounds. [1] https://mail-archive.FreeBSD.org/cgi/mid.cgi?200210141935.aa83883 Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D49043 --- sys/netinet/udp_usrreq.c | 88 ++++++++++++++++-------------------------------- 1 file changed, 29 insertions(+), 59 deletions(-) diff --git a/sys/netinet/udp_usrreq.c b/sys/netinet/udp_usrreq.c index 6c855e7a756b..131242ce9859 100644 --- a/sys/netinet/udp_usrreq.c +++ b/sys/netinet/udp_usrreq.c @@ -1135,10 +1135,9 @@ udp_send(struct socket *so, int flags, struct mbuf *m, struct sockaddr *addr, sin = (struct sockaddr_in *)addr; /* - * udp_send() may need to temporarily bind or connect the current - * inpcb. As such, we don't know up front whether we will need the - * pcbinfo lock or not. Do any work to decide what is needed up - * front before acquiring any locks. + * udp_send() may need to bind the current inpcb. As such, we don't + * know up front whether we will need the pcbinfo lock or not. Do any + * work to decide what is needed up front before acquiring any locks. * * We will need network epoch in either case, to safely lookup into * pcb hash. @@ -1292,66 +1291,37 @@ udp_send(struct socket *so, int flags, struct mbuf *m, struct sockaddr *addr, error = prison_remote_ip4(td->td_ucred, &sin->sin_addr); if (error) goto release; - /* - * If a local address or port hasn't yet been selected, or if - * the destination address needs to be rewritten due to using - * a special INADDR_ constant, invoke in_pcbconnect_setup() - * to do the heavy lifting. Once a port is selected, we - * commit the binding back to the socket; we also commit the - * binding of the address if in jail. - * - * If we already have a valid binding and we're not - * requesting a destination address rewrite, use a fast path. + * sendto(2) on unconnected UDP socket results in implicit + * binding to INADDR_ANY and anonymous port. This has two + * side effects: + * 1) after first sendto(2) the socket will receive datagrams + * destined to the selected port. + * 2) subsequent sendto(2) calls will use the same source port. */ - if (inp->inp_laddr.s_addr == INADDR_ANY || - inp->inp_lport == 0 || - sin->sin_addr.s_addr == INADDR_ANY || - sin->sin_addr.s_addr == INADDR_BROADCAST) { - if ((flags & PRUS_IPV6) != 0) { - vflagsav = inp->inp_vflag; - inp->inp_vflag |= INP_IPV4; - inp->inp_vflag &= ~INP_IPV6; - } + if (inp->inp_lport == 0) { + struct sockaddr_in wild = { + .sin_family = AF_INET, + .sin_len = sizeof(struct sockaddr_in), + }; + INP_HASH_WLOCK(pcbinfo); - error = in_pcbconnect_setup(inp, sin, &laddr.s_addr, - &lport, &faddr.s_addr, &fport, td->td_ucred); - if ((flags & PRUS_IPV6) != 0) - inp->inp_vflag = vflagsav; - if (error) { - INP_HASH_WUNLOCK(pcbinfo); + error = in_pcbbind(inp, &wild, V_udp_bind_all_fibs ? + 0 : INPBIND_FIB, td->td_ucred); + INP_HASH_WUNLOCK(pcbinfo); + if (error) + goto release; + lport = inp->inp_lport; + laddr = inp->inp_laddr; + } + if (laddr.s_addr == INADDR_ANY) { + error = in_pcbladdr(inp, &sin->sin_addr, &laddr, + td->td_ucred); + if (error) goto release; - } - - /* - * XXXRW: Why not commit the port if the address is - * !INADDR_ANY? - */ - /* Commit the local port if newly assigned. */ - if (inp->inp_laddr.s_addr == INADDR_ANY && - inp->inp_lport == 0) { - INP_WLOCK_ASSERT(inp); - /* - * Remember addr if jailed, to prevent - * rebinding. - */ - if (prison_flag(td->td_ucred, PR_IP4)) - inp->inp_laddr = laddr; - inp->inp_lport = lport; - error = in_pcbinshash(inp); - INP_HASH_WUNLOCK(pcbinfo); - if (error != 0) { - inp->inp_lport = 0; - error = EAGAIN; - goto release; - } - inp->inp_flags |= INP_ANONPORT; - } else - INP_HASH_WUNLOCK(pcbinfo); - } else { - faddr = sin->sin_addr; - fport = sin->sin_port; } + faddr = sin->sin_addr; + fport = sin->sin_port; } else { INP_LOCK_ASSERT(inp); faddr = inp->inp_faddr;