On Fri, Aug 23, 2024 at 03:34:09PM +0200, Aperence wrote: (...) > MPTCP is both supported for the frontend and backend sides.
Great! > Also added an example of configuration using mptcp along with a backend > allowing to experiment with it. Thanks for thinking about testing ;-) > Note that this is a re-implementation of Björn's work from 3 years ago > [4], when haproxy's internals were probably less ready to deal with > this, causing his work to be left pending for a while. Thanks for mentioning his initial work! > diff --git a/include/haproxy/compat.h b/include/haproxy/compat.h > index 3829060b7..2347d62cb 100644 > --- a/include/haproxy/compat.h > +++ b/include/haproxy/compat.h > @@ -317,6 +317,11 @@ typedef struct { } empty_t; > #define queue _queue > #endif > > +/* only Linux defines IPPROTO_MPTCP */ > +#ifndef IPPROTO_MPTCP > +#define IPPROTO_MPTCP 262 > +#endif Stupid open question: do we really want to define it for other OSes ? I really don't care, to be honest. > diff --git a/src/backend.c b/src/backend.c > index 6956d9bfe..6b865768e 100644 > --- a/src/backend.c > +++ b/src/backend.c > @@ -1690,8 +1690,15 @@ int connect_server(struct stream *s) > > if (!srv_conn->xprt) { > /* set the correct protocol on the output stream connector */ > + int mptcp = 0; > + > + /* cli_conn can be NULL when the origin of the stream isn't a > + * connection, there's no reason to use MPTCP in this case */ > + if (cli_conn && cli_conn->ctrl) > + mptcp = cli_conn->ctrl->sock_prot == IPPROTO_MPTCP; I don't understand this part, it seems to imply a relation between the client and the server. I'm not seeing any case where this could be valid since the config of the client and the server are two totally independent things. I'm suspecting that the difficulty here is to know that the server's address was configured as MPTCP and that since you couldn't find where to store that piece of info, instead it's inherited from the client. But that cannot be right as it's really a proxy and there's no relation between the protocols used on each side. For example it's valid and normal to have QUIC on the front and HTTP/2 over TCP (or MPTCP soon) on the back. If that's the issue, I suspect that we'll have to store the info in the struct server itself that we want to use mptcp. BTW, based on the extension you brought to protocol_lookup(), maybe we can have a more generic field like "alt_proto" or something like this that is passed directly to protocol_lookup(). This info needs to be retrieved from the address parser called by _srv_parse_init(). I'm seeing nowhere to store the "alt" value, and that might be an indication that str2sa_range() needs to be extended to take yet another arg to return that one, which you're already setting to !!mptcp during the lookup. As such I would suggest the following: - a preliminary patch extending str2sa_range() to add an int *alt before the last "opts" parameter, and that is filled with the value passed to protocol_lookup() during the function if the pointer is not NULL. You can put in that patch the change of definition of protocol_lookup() to use alt instead of dgram BTW. - a second (small) patch adding a field alt_proto to the "server" struct (probably close to net_ns, xprt etc). This field would then be filled in _srv_parse_init() from the "alt" argument returned by str2sa_range(). - then this current patch becomes the third and makes use of srv->alt instead of !!mptcp below, for the protocol lookup: > if (srv) { > - if (conn_prepare(srv_conn, > protocol_lookup(srv_conn->dst->ss_family, PROTO_TYPE_STREAM, 0), srv->xprt)) { > + if (conn_prepare(srv_conn, > protocol_lookup(srv_conn->dst->ss_family, PROTO_TYPE_STREAM, !!mptcp), > srv->xprt)) { (...) > diff --git a/src/proto_tcp.c b/src/proto_tcp.c > index d6552b2f1..9cabae11b 100644 > --- a/src/proto_tcp.c > +++ b/src/proto_tcp.c > @@ -149,6 +149,102 @@ struct protocol proto_tcpv6 = { > > INITCALL1(STG_REGISTER, protocol_register, &proto_tcpv6); > > +#ifdef __linux__ > +/* Most fields are copied from proto_tcpv4 */ > +struct protocol proto_mptcpv4 = { > + .name = "mptcpv4", (...) > + .receivers = LIST_HEAD_INIT(proto_mptcpv4.receivers), > + .nb_receivers = 0, Thanks to the very latest changes in 3.1-dev6 that you rebased on, you can drop these two lines, they're now initialized by protocol_register(). I figured I got trapped 3 times in less than one hour, it was too much! > diff --git a/src/sock.c b/src/sock.c > index df82c6ea7..e32573b7e 100644 > --- a/src/sock.c > +++ b/src/sock.c > @@ -278,7 +278,7 @@ int sock_create_server_socket(struct connection *conn, > struct proxy *be, int *st > ns = __objt_server(conn->target)->netns; > } > #endif > - sock_fd = my_socketat(ns, conn->dst->ss_family, SOCK_STREAM, 0); > + sock_fd = my_socketat(ns, conn->dst->ss_family, SOCK_STREAM, > conn->ctrl->sock_prot); I would like this change alone in one of the first patches, please. If we find that it happens to break existing setups, it's important that we can quickly spot it during a bisect. It's a change on its own that's separate from the first 3 I mentioned above. You can insert it between the 2nd and the 3rd above and mention that we'll need to pass the correct IPPROTO_* to the socket() call, which we didn't do till now, only passing zero. > /* at first, handle common to all proto families system limits and > permission related errors */ > if (sock_fd == -1) { > @@ -303,7 +303,7 @@ int sock_create_server_socket(struct connection *conn, > struct proxy *be, int *st > } > > if (fd_set_nonblock(sock_fd) == -1 || > - ((conn->ctrl->sock_prot == IPPROTO_TCP) && (setsockopt(sock_fd, > IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)) == -1))) { > + ((conn->ctrl->sock_prot == IPPROTO_TCP || conn->ctrl->sock_prot > == IPPROTO_MPTCP) && (setsockopt(sock_fd, IPPROTO_TCP, TCP_NODELAY, &one, > sizeof(one)) == -1))) { Please split that exceptionally large line after &&, it wraps here in my window :-) > diff --git a/src/tools.c b/src/tools.c > index 15756c880..e0e871745 100644 > --- a/src/tools.c > +++ b/src/tools.c > @@ -977,6 +977,7 @@ struct sockaddr_storage *str2sa_range(const char *str, > int *port, int *low, int > int new_fd = -1; > enum proto_type proto_type = 0; // to shut gcc warning > int ctrl_type = 0; // to shut gcc warning > + int mptcp = 0; I suspect that you can now call it "alt_proto" and pass it as-is. Better, do that as part of the 1st patch above, and set it to 1 every time we set ctrl_type = SOCK_DGRAM. That way you can pass it as-is to protocol_lookup() at the end and return it into *alt when non-null. This way callers will automatically benefit form it and will have the proper value to pass to protocol_lookup() for subsequent calls. And the pure mptcp patch will only need to set alt_proto to 1 where you currently set mptcp to 1. > + else if (strncmp(str2, "mptcp6@", 7) == 0) { > + str2 += 7; > + ss.ss_family = AF_INET; Be careful, AF_INET6 here! Please let me know if any of my comments are unclear or if you're missing some info about how things work. I really want to make sure we don't make the server side randomly rely on the client side but instead correcly use the configured address. Thanks! Willy