rte_fib network order bug
Robin Jarry
rjarry at redhat.com
Thu Nov 14 11:18:34 CET 2024
Hi folks,
Morten Brørup, Nov 14, 2024 at 08:43:
> Medvedkin, Vladimir:
>> I think control plane API should work with prefix addresses in CPU
>> byte order. At least our RTE_IPV4 macro works this way. Also, prefix
>> is an address + prefix length (not the mask), so it is more natural
>> if address is in cpu byte order.
This may get into a religion debate, but in my opinion, an IPv4 address
is *not* an integer. It should be treated as an opaque value. RTE_IPV4
is only useful to define addresses in unit tests.
I do not know of any IPv4 stack implementation that deals with
*host order* addresses. Here are a couple of examples where all
addresses are stored in network order in the control plane:
https://elixir.bootlin.com/linux/v6.11.6/source/net/ipv4/fib_frontend.c#L1069
https://github.com/freebsd/freebsd-src/blob/release/14.1.0/sys/net/route/route_ctl.c#L692
https://git.fd.io/vpp/tree/src/vnet/fib/fib_table.c?h=v24.10#n237
>>
>> Also, I think byte swap should be done on the interface where byte
>> order changes, and this boundary lies outside the FIB library.
>> However, I've added this feature not only because it was asked, but
>> also trying to improve performance in some cases, such as using
>> AVX512 byte swap in vector path for users who don't want to bother
>> about manually do byteswap on the fast path.
>>
>> Why do you think this would discourage users?
>
> Joining the discussing with a couple of comments.
>
> 1. When I saw the byte order flag the first time, it was not clear to
> me either that it only affected lookups - I too thought it covered
> the entire API of the library. This needs to be emphasized in the
> description of the flag. And the flag's name should contain LOOKUP,
> e.g.:
>
> /** If set, FIB lookup is expecting IPv4 address in network byte order. Note: Only lookup! */
> #define RTE_FIB_F_LOOKUP_NETWORK_ORDER 1
>
> 2. Control plane API should use CPU byte order. I consider inet_pton()
> irrelevant in this context. Adding network byte order lookup for
> fast path optimization makes good sense, and adding it to the RIB
> library would be nice too.
>
> If it was an address table (not longest prefix table, but a hash or
> similar), the learn()/update() function could be fast path, and
> thus support network byte order too; but it's not, so add() is
> control plane.
Why would control plane use a different representation of addresses
compared to data plane? Also for consistency with IPv6, I really think
that *all* addresses should be dealt in their network form.
Cheers.
More information about the dev
mailing list