On Fri, Dec 27, 2024 at 08:48:48AM +0000, Paul Vixie wrote: > On Tuesday, December 24, 2024 3:34:45 AM UTC Santiago Martinez wrote: > > Hi, > > here’s another user of fibs. Each of our servers have multiple fibs and > > jails with fibs. I like the proposed. > > Santi > > Cool. Read on. > > On Tuesday, December 24, 2024 5:06:32 AM UTC Jamie Landeg-Jones wrote: > > Paul Vixie <p...@redbarn.org> wrote: > > > ... > > I like that. I isolate 5 seperate networks by assigning a fib to each > > interface, and was initially surprised that I had to jump through ipfw > > hoops to get it to work properly, in fact at the end of my ipfw rules for > > these interfaces, just to guarantee no leaking, ... > > > > So, yes, I agree that it's crocky, and your proposal is how I originally > > expected it to work, and indeed, I can so no reason for it not to work that > > way, but am prepared to be enlightened if anyone else has an opinion on > > this. > > > > Jamie > > Groovy. See attached patch. This is just for TCP since I have no way to test > SCTP and I > think UDP will have to be handled at the application layer. There are two one > line changes > here. > > First, save the FIB number from the SYN in the syncache. This FIB number was > in the > incoming m_pkthdr so I didn't need to change any function signatures. Note > that if the > listener socket has a non-zero FIB number it will be used instead of the > interface FIB > number -- it's more specific and likely to be right. > > Second, when the initial ACK arrives and it's time for the connection to exit > from the > syncache and to become a socket, restore the original FIB number and apply it > to the > cloned socket, which will already have inherited its FIB number from the > listener socket. > > This works here. The diff is for a 14.2 kernel but is likely > backward-portable. I'd very much > like to hear anybody's experience with this patch, or commentary on its > approach and/or > advisability.
I think the patch is probably a good idea, and the trick of only inheriting the packet's FIB if the socket's is non-zero would avoid breaking compatibility for most cases. One side effect of the patch is that a service listening in FIB 0 that has no route to the source address of a connection attempt from a different FIB would previously not accept such a connection, but now it will. Perhaps that's drastic enough to warrant a sysctl and/or sockopt to control this behaviour. > diff --git a/sys/netinet/tcp_input.c b/sys/netinet/tcp_input.c > index 83f85a50e..0e030f24f 100644 > --- a/sys/netinet/tcp_input.c > +++ b/sys/netinet/tcp_input.c > @@ -1057,7 +1057,7 @@ tcp_input_with_port(struct mbuf **mp, int *offp, int > proto, uint16_t port) > } > inc.inc_fport = th->th_sport; > inc.inc_lport = th->th_dport; > - inc.inc_fibnum = so->so_fibnum; > + inc.inc_fibnum = so->so_fibnum || m->m_pkthdr.fibnum; > > /* > * Check for an existing connection attempt in syncache if > diff --git a/sys/netinet/tcp_syncache.c b/sys/netinet/tcp_syncache.c > index 15244a61d..a50648fa5 100644 > --- a/sys/netinet/tcp_syncache.c > +++ b/sys/netinet/tcp_syncache.c > @@ -805,6 +805,7 @@ syncache_socket(struct syncache *sc, struct socket *lso, > struct mbuf *m) > */ > if ((so = solisten_clone(lso)) == NULL) > goto allocfail; > + so->so_fibnum = sc->sc_inc.inc_fibnum; It would be better to pass the fibnum to solisten_clone() and assign it there. Otherwise, the value of so_fibnum will be wrong for a short window during which the socket is passed to MAC and other hooks, which might have some surprising effects. > #ifdef MAC > mac_socketpeer_set_from_mbuf(m, so); > #endif