On Tue, 11 Jul 2017 18:54:55 +0200 Maxime Ripard <maxime.rip...@free-electrons.com> wrote:
> Hi,
>
> I recently got a gcc 7.1 based toolchain, and it seems like it
> generates unaligned code, specifically in the net_set_ip_header
> function in my case.
>
> Whenever some packet is sent, this data abort is triggered:
>
> => setenv ipaddr 10.42.0.1; ping 10.42.0.254
> using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in
> MAC de:ad:be:ef:00:01
> HOST MAC de:ad:be:af:00:00
> RNDIS ready
> musb-hdrc: peripheral reset irq lost!
> high speed config #2: 2 mA, Ethernet Gadget, using RNDIS
> USB RNDIS network up!
> Using usb_ether device
> data abort
> pc : [<7ff9db10>] lr : [<7ff9f00c>]
> reloc pc : [<4a043b10>] lr : [<4a04500c>]
> sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c
> r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8
> r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e
> r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e
> Flags: nZCv IRQs off FIQs off Mode SVC_32
> Resetting CPU ...
>
>
> Running objdump on it gives us this:
>
> 4a043b04 <net_set_ip_header>:
>
> 	/*
> 	 *	Construct an IP header.
> 	 */
> 	/* IP_HDR_SIZE / 4 (not including UDP) */
> 	ip->ip_hl_v = 0x45;
> 4a043b04: e59f3074 ldr r3, [pc, #116] ; 4a043b80
> <net_set_ip_header+0x7c>
> {
> 4a043b08: e92d4013 push {r0, r1, r4, lr}
> 4a043b0c: e1a04000 mov r4, r0
> 	ip->ip_hl_v = 0x45;
> 4a043b10: e5803000 str r3, [r0] <---- Abort
> 	ip->ip_tos = 0;
> 	ip->ip_len = htons(IP_HDR_SIZE);
> 	ip->ip_id = htons(net_ip_id++);
> 4a043b14: e59f3068 ldr r3, [pc, #104] ; 4a043b84
> <net_set_ip_header+0x80>

void net_set_ip_header(uchar *pkt, struct in_addr dest, struct in_addr source)
{
	struct ip_udp_hdr *ip = (struct ip_udp_hdr *)pkt;

	/*
	 *	Construct an IP header.
	 */
	/* IP_HDR_SIZE / 4 (not including UDP) */
	ip->ip_hl_v = 0x45;
	ip->ip_tos = 0;
	ip->ip_len = htons(IP_HDR_SIZE);
	ip->ip_id = htons(net_ip_id++);
	ip->ip_off = htons(IP_FLAGS_DFRAG);	/* Don't fragment */
	ip->ip_ttl = 255;
	ip->ip_sum = 0;
	/* already in network byte order */
	net_copy_ip((void *)&ip->ip_src, &source);
	/* already in network byte order */
	net_copy_ip((void *)&ip->ip_dst, &dest);
}

This looks like a real bug in the U-Boot code. When we cast from
"uchar *pkt" to "struct ip_udp_hdr *ip", the pointer has to satisfy
the alignment requirement of the struct.

For the size and alignment requirements used by the ARM EABI, we need
to refer to the AAPCS document, which says the following:

    4.3.1 Aggregates

    * The alignment of an aggregate shall be the alignment of its
      most-aligned component.

    * The size of an aggregate shall be the smallest multiple of its
      alignment that is sufficient to hold all of its members when
      they are laid out according to these rules.

According to these rules, the alignment of "ip_udp_hdr" must be
4 bytes because of the "in_addr" struct in it:

/* IPv4 addresses are always 32 bits in size */
struct in_addr {
	__be32 s_addr;
};

The __be32 typedef is somewhat tricky, because it has a "bitwise"
attribute, but this attribute has no real meaning in GCC:

    https://gcc.gnu.org/ml/gcc-help/2008-03/msg00265.html

> It seems like r0 is indeed set to an unaligned address (0x7ffed30e)
> for some reason.

Yes, if the caller passes an improperly aligned pointer, then
undefined behaviour may happen. And 2-byte alignment is not good
enough for the "ip_udp_hdr" struct.

Apparently GCC 7 tries to optimize the code by initializing multiple
u8/u16 fields with a single 32-bit write, because it rightfully
expects proper 32-bit alignment for the structure. The 32-bit fields
"ip_src" and "ip_dst" are not assigned directly in this function
(they are filled in via net_copy_ip()), so they did not trigger the
data abort before.

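To make the alignment argument concrete, here is a minimal sketch in C. The names `in_addr_sketch`, `ip_hdr_sketch` and `misalignment()` are hypothetical stand-ins, not the U-Boot definitions, and the layout only mimics the IP part of the header. It checks that a struct containing a 32-bit member inherits that member's alignment, and computes how far the r0 value from the register dump is from a properly aligned address.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct in_addr (a single 32-bit member). */
struct in_addr_sketch {
	uint32_t s_addr;
};

/* Hypothetical, simplified stand-in for the IP part of struct ip_udp_hdr. */
struct ip_hdr_sketch {
	uint8_t  ip_hl_v;
	uint8_t  ip_tos;
	uint16_t ip_len;
	uint16_t ip_id;
	uint16_t ip_off;
	uint8_t  ip_ttl;
	uint8_t  ip_p;
	uint16_t ip_sum;
	struct in_addr_sketch ip_src;
	struct in_addr_sketch ip_dst;
};

/* The aggregate takes the alignment of its most-aligned member. */
static_assert(alignof(struct ip_hdr_sketch) == alignof(uint32_t),
	      "struct alignment follows its 32-bit members");

/* Distance of addr from the previous multiple of align. */
size_t misalignment(uintptr_t addr, size_t align)
{
	return (size_t)(addr % align);
}
```

On a target where uint32_t is 4-byte aligned, misalignment(0x7ffed30e, alignof(struct ip_hdr_sketch)) evaluates to 2, which is the 2-byte alignment described above: enough for the u16 fields, but not for a 32-bit store.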
> Using a Linaro 6.3 toolchain works on the same commit with the same
> config, so it really seems to be a compiler-related issue.

Yes, GCC 6 was not smart enough to combine the initialization of
multiple u8/u16 fields into a single 32-bit write.

>
> It generates this code:
>
> 4a043ec4 <net_set_ip_header>:
>
> 	/*
> 	 *	Construct an IP header.
> 	 */
> 	/* IP_HDR_SIZE / 4 (not including UDP) */
> 	ip->ip_hl_v = 0x45;
> 4a043ec4: e3a03045 mov r3, #69 ; 0x45
> {
> 4a043ec8: e92d4013 push {r0, r1, r4, lr}
> 4a043ecc: e1a04000 mov r4, r0
> 	ip->ip_hl_v = 0x45;
> 4a043ed0: e5c03000 strb r3, [r0]
> 	ip->ip_tos = 0;
> 	ip->ip_len = htons(IP_HDR_SIZE);
> 4a043ed4: e3a03b05 mov r3, #5120 ; 0x1400
> 	ip->ip_tos = 0;
> 4a043ed8: e3a00000 mov r0, #0
> 	ip->ip_len = htons(IP_HDR_SIZE);
> 4a043edc: e1c430b2 strh r3, [r4, #2]
> 	ip->ip_id = htons(net_ip_id++);
> 4a043ee0: e59f3064 ldr r3, [pc, #100] ; 4a043f4c
> <net_set_ip_header+0x88>
>
> And it seems like it's using an strb instruction to avoid the
> unaligned access.
>
> As far as I know, we are passing --wno-unaligned-access, so the broken
> situation should not arise, and yet it does, so I'm a bit confused,
> and not really sure what to do from there.

The -mno-unaligned-access option does not help here. It only stops
the compiler from emitting unaligned accesses for data that it knows
may be unaligned (such as members of packed structs). In this case the
compiler assumes that the struct pointer is properly aligned.

This bug can be fixed by either:

1) Always ensuring proper alignment of the UDP packet header
   pointers that are passed to net_set_ip_header() and similar
   functions. Some further investigation is necessary.

or

2) Just adding a "packed" attribute to the ip_udp_hdr struct. In
   this case the compiler will always assume the minimum alignment
   of 1 byte. It may have some performance implications though.

-- 
Best regards,
Siarhei Siamashka
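
A minimal sketch of option 2 above, assuming a GCC-compatible compiler that supports `__attribute__((packed))`. The names `hdr_plain`, `hdr_packed`, `set_fields()` and `demo_write_at_offset_2()` are hypothetical and the layout is shortened for illustration; this is not the actual U-Boot struct or a proposed patch. It shows that the packed variant has an alignment of 1, so writes through a pointer to it are valid at any address.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, shortened header layout; not the real struct ip_udp_hdr. */
struct hdr_plain {
	uint8_t  ip_hl_v;
	uint8_t  ip_tos;
	uint16_t ip_len;
	uint32_t ip_src;
};

/* Same fields, but packed (GCC/Clang extension). */
struct hdr_packed {
	uint8_t  ip_hl_v;
	uint8_t  ip_tos;
	uint16_t ip_len;
	uint32_t ip_src;
} __attribute__((packed));

static_assert(alignof(struct hdr_plain) == alignof(uint32_t),
	      "plain struct follows its 32-bit member");
static_assert(alignof(struct hdr_packed) == 1,
	      "packed struct may sit at any address");
static_assert(sizeof(struct hdr_packed) == sizeof(struct hdr_plain),
	      "this layout has no padding, so packing keeps the size");

/* Fill the first header fields through a packed-struct pointer. */
void set_fields(void *pkt)
{
	struct hdr_packed *ip = pkt;

	ip->ip_hl_v = 0x45;
	ip->ip_tos = 0;
	ip->ip_len = 0x1400;
}

/* Write the header 2 bytes into a heap buffer, i.e. at 2-byte alignment. */
int demo_write_at_offset_2(void)
{
	unsigned char *buf = calloc(1, 64);
	int ok;

	if (!buf)
		return 0;
	set_fields(buf + 2);
	/* Read the first two fields back as raw bytes. */
	ok = buf[2] == 0x45 && buf[3] == 0;
	free(buf);
	return ok;
}
```

Because the compiler can no longer assume 4-byte alignment for `hdr_packed`, it has to pick access sequences that are safe at any address, which is where the possible performance cost mentioned above comes from.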