> From: Robin Jarry [mailto:rja...@redhat.com]
> 
> Morten Brørup, Jul 19, 2024 at 12:46:
> > When passing an IPv4 address as a value, alignment does matter; it
> > must be 4 byte aligned.
> 
> I was expecting the compiler to do what is necessary to copy the data to
> an aligned register before jumping to the function.

Yes, and hereby you have achieved 4-byte alignment of the parameter.

What I meant was: If the parameter's type makes the parameter explicitly 
unaligned, e.g. an unaligned array of 4 bytes or an unaligned_uint32_t, the 
code inside the function must also treat the parameter as unaligned, and cannot 
assume it has magically become 4-byte aligned.

Our functions taking an IPv4 address parameter (by value) passes the value as 
aligned.
Functions taking an IPv6 address parameter (by value) should behave exactly the 
same way: The compiler should do what is necessary to copy the data to an 
aligned register *before* jumping to the function. (Note: In 64 bit 
architectures, 128 bits requires two 64 bit registers.) The point remains: If 
conversion from unaligned to aligned is required, it is the responsibility of 
the code calling the function, not the function itself.

> 
> > On a CPU with 128 bit registers, I would probably also pass an IPv6
> > address as a value. With such a CPU, the parameter type should be
> > uint128_t or rte_be128_t, depending on byte order.
> 
> I don't think there is a portable/standard uint128_t yet. Everything
> I could find is either GCC or linux specific.

Agree. I am using uint128_t conceptually.

> 
> > There's a 3rd option:
> > Have an IPv6 type that is simply an array of 16 bytes with no
> explicitly specified alignment:
> >
> > struct rte_ipv6_addr {
> >     unsigned char addr_bytes[RTE_IPV6_ADDR_LEN];
> > };
> >
> > Or:
> >
> > typedef struct rte_ipv6_addr {
> >     unsigned char addr_bytes[RTE_IPV6_ADDR_LEN];
> > } rte_ipv6_addr_t;
> >
> > If used as is, it will be unaligned.
> >
> > And if alignment offers improved performance for some use cases,
> > explicit alignment attributes can be added to the type in those use
> > cases.
> >
> > Not using an uint128_t type (or a union of other types than unsigned
> > char) will also avoid byte order issues.
> >
> > I guess Stephen was right to begin with. :-)
> 
> Having the type as a union (as is the POSIX type) makes casting to
> integers a lot less tedious and makes the structure overall more
> flexible.

Maybe (probably?). However, if you explicitly make the type unaligned, how can 
the same type be used in an optimized way where the developer knows that it is 
16 byte aligned?

NB: There's something in the C standard about type casting from char (and 
unsigned char) being less restricted than typecasting from uint8_t, so perhaps 
using unsigned char instead of uint8_t could solve the recasting issue your 
union is trying to solve. (Unfortunately, I cannot remember the source of this 
information.)

Generally I don't think that we should introduce complex 
types/structures/unions only to simplify type casting, if it is at the expense 
of performance or code readability.

> 
> We could completely add an unaligned be128 member to the union by the
> way. I don't see what is wrong with having sub union members.

(Not that I agree to using a union, but..)
Agree. If it's a union, and alignment is explicitly set, adding 64 bit and 128 
bit sub union members should be perfectly acceptable, as it does not modify the 
alignment or anything else.

> 
> About your concern with byte order, since the union members have
> explicit rte_be*_t types, I don't think confusion can happen. I have
> also renamed the members, replacing the "u" prefix with "a" so that it
> does not indicate that it should be used as a host integer.
> 
>         struct __rte_aligned(1) rte_ipv6_addr {
>                 union {
>                         unsigned char a[16];
>                         unaligned_be16_t a16[8];
>                         unaligned_be32_t a32[4];
>                         unaligned_be64_t a64[2];
>                         unaligned_be128_t a128[1];
>                 };
>         } __rte_packed;

(Again, not that I'm accepting the structure, but...)
Yes, this would solve the byte order concern.

How do you write efficient code with this forcefully unaligned type?
Let's say an application has some structure with an IPv6 address field, which 
the developer has designed to be 16 byte aligned in the structure.
The compiler would need to always access this 16 byte aligned field as 
unaligned, because the rte_ipv6_addr type makes it explicitly unaligned.

Reply via email to