On 2022-06-16 08:32, Emil Berg wrote:
I've been sketching an efficient solution to this. What about something
along the lines of the code below? I've run it with these combinations:
even buf, even len
even buf, odd len
odd buf, even len
odd buf, odd len
and it seems to give the same results as the older version of __rte_raw_cksum,
from before 21.03. I ran it without optimizations to ensure the compiler
didn't insert vector instructions, so that the results were comparable.
...but you *want* the compiler to vectorize this code. There's much to
gain, and it can likely be done also in the non-aligned case. What you
don't want is for the compiler to assume the data is 16-bit aligned (and
output SIMD load/store instructions which require alignment).
I don't see why you just can't take the current implementation, and
replace the direct assignment ("*u16_buf") with a temporary variable,
and a memcpy(). This also eliminates the need for the may_alias
attribute (at least on the u16_buf pointer).
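A minimal sketch of the memcpy() approach described above. The function name raw_cksum_memcpy is hypothetical and this is not the actual DPDK implementation; it only illustrates loading each 16-bit word through a temporary so that no alignment is assumed. The result is the unfolded sum in host byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; not part of the DPDK API. */
static inline uint32_t
raw_cksum_memcpy(const void *buf, size_t len, uint32_t sum)
{
    const uint8_t *p = buf;
    const uint8_t *end = p + (len & ~(size_t)1);

    /* Sum whole 16-bit words in host byte order. */
    for (; p < end; p += 2) {
        uint16_t v;

        /* memcpy() makes no alignment assumption about p. */
        memcpy(&v, p, sizeof(v));
        sum += v;
    }

    /* Fold in a trailing odd byte, padded with a zero byte. */
    if (len & 1) {
        uint16_t v = 0;

        memcpy(&v, p, 1);
        sum += v;
    }

    return sum;
}
```

Compilers typically turn a fixed-size memcpy() like this into a plain load, so the loop should remain a candidate for vectorization, but that is worth checking in the generated code for the targets you care about.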
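static inline uint32_t
__rte_raw_cksum_newest(const void *buf, size_t len, uint32_t sum)
{
    /* Use a byte pointer: arithmetic on void * is a GNU extension. */
    const uint8_t *start = buf;
    const uint8_t *end = start + len;

    /* Bytes at odd offsets: the high byte of a little-endian 16-bit load. */
    uint32_t sum_even = 0;
    for (const uint8_t *p = start + 1; p < end; p += 2) {
        sum_even += *p;
    }
    sum += sum_even << 8;

    /* Bytes at even offsets: the low byte of a little-endian 16-bit load. */
    uint32_t sum_odd = 0;
    for (const uint8_t *p = start; p < end; p += 2) {
        sum_odd += *p;
    }
    sum += sum_odd;

    return sum;
}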
/Emil
-----Original Message-----
From: Emil Berg
Sent: 16 June 2022 07:45
To: Morten Brørup <m...@smartsharesystems.com>; bugzi...@dpdk.org
Cc: dev@dpdk.org
Subject: RE: [Bug 1035] __rte_raw_cksum() crash with misaligned pointer
Hi!
We want the B option, i.e. the 6-byte option. Perhaps adding alignment
detection to __rte_raw_cksum() is a good idea.
A minor comment: I think buf & 1 won't work since buf isn't of an integral
type, but something along those lines would.
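One way the alignment test could be written, as a sketch: convert the pointer to uintptr_t before masking the low bit. The helper name is_16bit_aligned is made up for illustration and is not taken from DPDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper; the name is not taken from DPDK. */
static inline bool
is_16bit_aligned(const void *buf)
{
    /* A pointer cannot be used with '&' directly, so convert it first. */
    return ((uintptr_t)buf & 1) == 0;
}
```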
I'm starting to think about an efficient way to do this.
Thank you!
-----Original Message-----
From: Morten Brørup <m...@smartsharesystems.com>
Sent: 15 June 2022 16:41
To: Emil Berg <emil.b...@ericsson.com>; bugzi...@dpdk.org
Cc: dev@dpdk.org
Subject: RE: [Bug 1035] __rte_raw_cksum() crash with misaligned pointer
From: bugzi...@dpdk.org [mailto:bugzi...@dpdk.org]
Sent: Wednesday, 15 June 2022 09.16
https://bugs.dpdk.org/show_bug.cgi?id=1035
Bug ID: 1035
Summary: __rte_raw_cksum() crash with misaligned pointer
Product: DPDK
Version: 21.11
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: Normal
Component: ethdev
Assignee: dev@dpdk.org
Reporter: emil.b...@ericsson.com
Target Milestone: ---
See rte_raw_cksum() in rte_ip.h, which is part of the public API. See
also the subfunction __rte_raw_cksum().
__rte_raw_cksum() assumes that the buffer over which the checksum is
calculated starts at an even address (divisible by two). See for example
this Stack Overflow post:
https://stackoverflow.com/questions/46790550/c-undefined-behavior-strict-aliasing-rule-or-incorrect-alignment
The post explains that there is undefined behavior in C11 when
"conversion between two pointer types produces a result that is
incorrectly aligned". When the buf argument starts on an odd address
we thus have undefined behavior, since a pointer is cast from void* to
uint16_t*.
In most cases (at least on x86) that isn't a problem, but with higher
optimization levels it may break due to vector instructions. The new
version of the function seems to be easier for the compiler to optimize,
resulting in a crash when the buf argument is odd. Please note that the
undefined behavior is present in earlier versions of DPDK as well.
Now you're probably thinking: "Just align your buffers". The problem
is that we have a packet buffer which is aligned. The checksum is
calculated on a subset of that aligned packet buffer, and that subset
sometimes starts at an odd address.
The question remains whether this is an issue with DPDK or not.
I can imagine other systems doing what you describe too. So it needs to be
addressed.
Off the top of my head, an easy fix would be updating __rte_raw_cksum() like
this:
static inline uint32_t
__rte_raw_cksum(const void *buf, size_t len, uint32_t sum) {
    if (likely((buf & 1) == 0)) {
        /* The buffer is 16 bit aligned. */
        Keep the existing, optimized implementation here.
    } else {
        /* The buffer is not 16 bit aligned. */
        Add a new odd-buf tolerant implementation here.
    }
}
However, I'm not sure that it covers your scenario!
The checksum is 16 bit wide, so if you calculate the checksum of e.g. 4 bytes
of memory starting at offset 1 in a 6 byte packet buffer, the memory block can
be treated as either 4 or 6 bytes relative to the data covered by the checksum,
i.e.:
A: XX [01 02] [03 04] XX --> cksum = [04 06]
B: [XX 01] [02 03] [04 XX] --> cksum = [06 04]
Which one do you need?
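To make the two interpretations concrete, here is a small sketch that reproduces the numbers above. It assumes the XX bytes contribute zero and that words are read in network (big-endian) order as drawn; the function names are made up for illustration. The values are unfolded, uncomplemented sums.

```c
#include <stddef.h>
#include <stdint.h>

/* Option A: the data starts on a word boundary of the checksummed stream. */
static uint32_t
cksum_option_a(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    /* Even offsets are the high byte of a word, odd offsets the low byte. */
    for (i = 0; i < len; i++)
        sum += (i & 1) ? data[i] : (uint32_t)data[i] << 8;
    return sum;
}

/* Option B: the data starts in the low byte of a word (one byte "late"). */
static uint32_t
cksum_option_b(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    /* The roles are swapped: even offsets are low bytes, odd are high. */
    for (i = 0; i < len; i++)
        sum += (i & 1) ? (uint32_t)data[i] << 8 : data[i];
    return sum;
}
```

For the bytes 01 02 03 04, option A gives 0x0406 and option B gives 0x0604. In this example one result is the byte-swapped form of the other.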
Perhaps an additional function is required to support your use case, and the
documentation for rte_raw_cksum() and __rte_raw_cksum() needs to reflect that
the buffer must be 16 bit aligned.
Or the rte_raw_cksum() function can be modified to support an odd buffer
pointer as outlined above, with documentation added about alignment of the
running checksum.