On 1/7/2021 3:50 PM, Stephen Hemminger wrote:
On Wed, 6 Jan 2021 23:39:39 -0600
George Prekas <preka...@amazon.com> wrote:
On 1/6/2021 12:02 PM, Ferruh Yigit wrote:
On 12/5/2020 5:42 AM, George Prekas wrote:
Strict-aliasing rules are violated by cast to uint16_t* in flowgen.c
and the calculated IP checksum is wrong on GCC 9 and GCC 10.
Signed-off-by: George Prekas <preka...@amazon.com>
---
v2:
* Instead of a compiler barrier, use a compiler flag.
---
app/test-pmd/meson.build | 1 +
1 file changed, 1 insertion(+)
diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build
index 7e9c7bdd6..5d24e807f 100644
--- a/app/test-pmd/meson.build
+++ b/app/test-pmd/meson.build
@@ -4,6 +4,7 @@
 # override default name to drop the hyphen
 name = 'testpmd'
 cflags += '-Wno-deprecated-declarations'
+cflags += '-fno-strict-aliasing'
 sources = files('5tswap.c',
 	'cmdline.c',
 	'cmdline_flow.c',
Hi George,
I am trying to understand this. The relevant code is below:
ip_hdr->hdr_checksum = ip_sum((unaligned_uint16_t *)ip_hdr, sizeof(*ip_hdr));
You suspect a strict-aliasing rule violation. In more detail:
The concern is that "struct rte_ipv4_hdr *ip_hdr;" is aliased to "const
unaligned_uint16_t *hdr", and the compiler can optimize out the calculations that
use the data pointed to by 'hdr', since the 'hdr' pointer is not used to alter
the data and the compiler may assume the data has not changed at all.
1) But the pointer "hdr" is assigned in the loop, from another pointer whose
content is changing. Why does this not help the compiler figure out that the
data 'hdr' points to has changed?
2) I tried to debug this, but I am not able to reproduce the issue: 'ip_sum()'
is called each time and the checksum is calculated correctly, using gcc
10.2.1-9. Are you able to confirm the case by debugging, or from the
assembly/object file?
And if the issue is a strict-aliasing rule violation as you said, a compiler
flag is an option, but I am not sure how much it reduces the benefit of compiler
optimization. I guess the other options are not so good either: memcpy adds too
much work at runtime, and a union requires a bigger change and makes the code
complex.
I wonder if making 'ip_sum()' a non-inline function can help. Can you please
give it a try, since you can reproduce it?
Hi Ferruh,
Thanks for looking into it.
I am copy-pasting at the end of this email a minimal reproduction. It
calculates a checksum and prints it. The correct value is f8d9. If you compile
it with -O0 or -O3 -fno-strict-aliasing, you will get the correct value. If you
compile it with gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 and -O3, you will get
f8e8. You can also try it on https://godbolt.org/ and see how different
versions behave.
My understanding is that the code violates the C standard
(https://stackoverflow.com/a/99010).
--- cut here ---
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct rte_ipv4_hdr {
	uint8_t version_ihl;
	uint8_t type_of_service;
	uint16_t total_length;
	uint16_t packet_id;
	uint16_t fragment_offset;
	uint8_t time_to_live;
	uint8_t next_proto_id;
	uint16_t hdr_checksum;
	uint32_t src_addr;
	uint32_t dst_addr;
};

static inline uint16_t ip_sum(const uint16_t *hdr, int hdr_len)
{
	uint32_t sum = 0;

	while (hdr_len > 1)
	{
		sum += *hdr++;
		if (sum & 0x80000000)
			sum = (sum & 0xFFFF) + (sum >> 16);
		hdr_len -= 2;
	}

	while (sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);

	return ~sum;
}

static void pkt_burst_flow_gen(void)
{
	struct rte_ipv4_hdr *ip_hdr = (struct rte_ipv4_hdr *) malloc(4096);

	memset(ip_hdr, 0, sizeof(*ip_hdr));
	ip_hdr->version_ihl = 1;
	ip_hdr->type_of_service = 2;
	ip_hdr->fragment_offset = 3;
	ip_hdr->time_to_live = 4;
	ip_hdr->next_proto_id = 5;
	ip_hdr->packet_id = 6;
	ip_hdr->src_addr = 7;
	ip_hdr->dst_addr = 8;
	ip_hdr->total_length = 9;
	ip_hdr->hdr_checksum = ip_sum((uint16_t *)ip_hdr, sizeof(*ip_hdr));
	printf("%x\n", ip_hdr->hdr_checksum);
}

int main(void)
{
	pkt_burst_flow_gen();
	return 0;
}
If I change your code like this to use a union, GCC 10 is still broken.
This worked fine for me: https://godbolt.org/z/vdsxh9
It is a compiler bug. It may be because the optimizer is not smart enough
to know that memset has cleared the header.
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct rte_ipv4_hdr {
	uint8_t version_ihl;
	uint8_t type_of_service;
	uint16_t total_length;
	uint16_t packet_id;
	uint16_t fragment_offset;
	uint8_t time_to_live;
	uint8_t next_proto_id;
	uint16_t hdr_checksum;
	uint32_t src_addr;
	uint32_t dst_addr;
};

static inline uint16_t ip_sum(const uint16_t *hdr, int hdr_len)
{
	uint32_t sum = 0;

	while (hdr_len > 1)
	{
		sum += *hdr++;
		if (sum & 0x80000000)
			sum = (sum & 0xFFFF) + (sum >> 16);
		hdr_len -= 2;
	}

	while (sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);

	return ~sum;
}

static void pkt_burst_flow_gen(void)
{
	union {
		struct rte_ipv4_hdr ip;
		uint16_t data[10];
	} *hdr;

	hdr = malloc(sizeof(*hdr));
	memset(hdr, 0, sizeof(*hdr));
	hdr->ip.version_ihl = 1;
	hdr->ip.type_of_service = 2;
	hdr->ip.fragment_offset = 3;
	hdr->ip.time_to_live = 4;
	hdr->ip.next_proto_id = 5;
	hdr->ip.packet_id = 6;
	hdr->ip.src_addr = 7;
	hdr->ip.dst_addr = 8;
	hdr->ip.total_length = 9;
	hdr->ip.hdr_checksum = ip_sum(hdr->data, sizeof(*hdr));
	printf("%x\n", hdr->ip.hdr_checksum);
}

int main(void)
{
	pkt_burst_flow_gen();
	return 0;
}