Hi folks,
Morten Brørup, Nov 14, 2024 at 08:43:
Medvedkin, Vladimir:
I think control plane API should work with prefix addresses in CPU
byte order. At least our RTE_IPV4 macro works this way. Also, prefix
is an address + prefix length (not the mask), so it is more natural
if address is in cpu byte order.
This may get into a religion debate, but in my opinion, an IPv4 address
is *not* an integer. It should be treated as an opaque value. RTE_IPV4
is only useful to define addresses in unit tests.
I do not know of any IPv4 stack implementation that deals with
*host order* addresses. Here are a couple of examples where all
addresses are stored in network order in the control plane:
https://elixir.bootlin.com/linux/v6.11.6/source/net/ipv4/fib_frontend.c#L1069
https://github.com/freebsd/freebsd-src/blob/release/14.1.0/sys/net/route/route_ctl.c#L692
https://git.fd.io/vpp/tree/src/vnet/fib/fib_table.c?h=v24.10#n237
Also, I think byte swap should be done on the interface where byte
order changes, and this boundary lies outside the FIB library.
However, I've added this feature not only because it was asked, but
also trying to improve performance in some cases, such as using
AVX512 byte swap in vector path for users who don't want to bother
about manually do byteswap on the fast path.
Why do you think this would discourage users?
Joining the discussing with a couple of comments.
1. When I saw the byte order flag the first time, it was not clear to
me either that it only affected lookups - I too thought it covered
the entire API of the library. This needs to be emphasized in the
description of the flag. And the flag's name should contain LOOKUP,
e.g.:
/** If set, FIB lookup is expecting IPv4 address in network byte order. Note:
Only lookup! */
#define RTE_FIB_F_LOOKUP_NETWORK_ORDER 1
2. Control plane API should use CPU byte order. I consider inet_pton()
irrelevant in this context. Adding network byte order lookup for
fast path optimization makes good sense, and adding it to the RIB
library would be nice too.
If it was an address table (not longest prefix table, but a hash or
similar), the learn()/update() function could be fast path, and
thus support network byte order too; but it's not, so add() is
control plane.
Why would control plane use a different representation of addresses
compared to data plane? Also for consistency with IPv6, I really think
that *all* addresses should be dealt in their network form.
Cheers.