Re: a question about IP checksum helper for arm64

Bo Yan Mon, 09 Jul 2018 12:54:45 -0700

Hi Robin,

That UBSAN error prompted me to check the generated instructions. The error by itself doesn't make sense to me because there is no requirement for 128b alignment on ldp/stp.

With 4.18-rc3, when I build for the default "defconfig" in arch/arm64/configs/, I see the disassembled code shows two ldr instead of a ldp.

One example is in function "ip_rcv" (net/ipv4/ip_input.c). After disassembling vmlinux, I see the following:


        if (unlikely(ip_fast_csum((u8 *)iph, iph->ihl)))
ffff0000089bcdb8:       38776b05        ldrb    w5, [x24,x23]
static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
{
        __uint128_t tmp;
        u64 sum;

        tmp = *(const __uint128_t *)iph;
ffff0000089bcdbc:       f9400481        ldr     x1, [x4,#8]
ffff0000089bcdc0:       f8776b00        ldr     x0, [x24,x23]
ffff0000089bcdc4:       12000ca5        and     w5, w5, #0xf
ffff0000089bcdc8:       510014a3        sub     w3, w5, #0x5

This is done with "make ARCH=arm64 CROSS_COMPILE=... defconfig", so the default optimization level is -O2.

I tried the same test as you did: aarch64-linux-gnu-objdump -S -d *.o in net/ipv4. The result is inconsistent. In some instances I do see an ldp instruction being generated; in others, it's two ldr. For example, inet_gro_receive and ip_mc_check_igmp are compiled as I expected. ip_rcv is not.

So it looks like this is not very consistent, but it also looks like in the majority of cases it generates the ldp instructions.
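
For anyone who wants to repeat the comparison outside the kernel tree, a minimal user-space sketch of the same computation is below. It is not the kernel's code: the names load128_cast, load128_memcpy, fold64 and ip_csum_sketch are made up for illustration, the final fold is a generic one rather than the kernel's csum_fold(), and the extra 32-bit fold before the loop is a precaution added only in this sketch. The memcpy variant avoids the misaligned __uint128_t dereference that UBSAN reports. Whether a given compiler turns either variant into ldp or two ldr is exactly what has to be checked in the disassembly; the sketch does not guarantee either outcome.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 128-bit load through a cast, as in the quoted kernel
 * code. Only well-defined in C when p is suitably aligned for __uint128_t. */
__uint128_t load128_cast(const void *p)
{
    return *(const __uint128_t *)p;
}

/* Hypothetical helper: the same load via memcpy, which is valid for any
 * alignment and leaves the choice of instructions to the compiler. */
__uint128_t load128_memcpy(const void *p)
{
    __uint128_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

/* Generic one's-complement fold of a 64-bit sum down to 16 bits. */
uint16_t fold64(uint64_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Sketch of the checksum over an IPv4 header of ihl 32-bit words (ihl >= 5). */
uint16_t ip_csum_sketch(const void *iph, unsigned int ihl)
{
    const unsigned char *p = iph;
    __uint128_t tmp = load128_memcpy(p);
    uint64_t sum;
    uint32_t word;

    p += 16;
    ihl -= 4;
    /* Add the two 64-bit halves with end-around carry; result is in the top half. */
    tmp += (tmp >> 64) | (tmp << 64);
    sum = (uint64_t)(tmp >> 64);
    /* Precaution (not in the quoted code): fold so the adds below cannot wrap. */
    sum = (sum & 0xffffffffu) + (sum >> 32);
    do {
        memcpy(&word, p, sizeof(word));
        sum += word;
        p += 4;
    } while (--ihl);
    return fold64(sum);
}
```

Assuming the file is saved as harness.c (a hypothetical name), something like "aarch64-linux-gnu-gcc -O2 -c harness.c" followed by "aarch64-linux-gnu-objdump -d harness.o" shows which loads each helper compiles to. A header with a correct checksum field should make ip_csum_sketch() return 0.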




On 07/09/2018 04:54 AM, Robin Murphy wrote:

> Hi Bo,
>
> On 06/07/18 17:27, Bo Yan wrote:
>> Hi Robin, Luke,
>>
>> Recently I bumped into an error when running GCC undefined behavior sanitizer:
>>
>> UBSAN: Undefined behaviour in kernel-4.9/arch/arm64/include/asm/checksum.h:34:6 load of misaligned address ffffffc198c8b254 for type 'const __int128 unsigned'
>>         which requires 16 byte alignment
>
> What's your config and reproducer here? I've had UBSan enabled a few times since that patch went in and never noticed anything. I've just tried it with 4.18-rc3 and indeed don't see anything from just booting the machine and making some network traffic. It does indeed fire if I also turn on CONFIG_UBSAN_ALIGNMENT, but then it's almost lost among a million other warnings for all manner of types - that's to be expected since, as the help text says, "Enabling this option on architectures that support unaligned accesses may produce a lot of false positives."
>
>> The relevant code:
>>
>>          tmp = *(const __uint128_t *)iph;
>>          iph += 16;
>>          ihl -= 4;
>>          tmp += ((tmp >> 64) | (tmp << 64));
>>          sum = tmp >> 64;
>>          do {
>>                  sum += *(const u32 *)iph;
>>                  iph += 4;
>>          } while (--ihl);
>>
>> But, I checked the generated disassembly, it doesn't look like anything special is generated taking advantage of that.
>>
>> I'm using Linaro GCC 6.4-2017.08, expecting ldp instructions to be emitted, but don't see it.
>
> My regular toolchain is currently Linaro 7.2.1-2017.11, but I also tried the last GCC 6 I had installed (6.3.1-2017.05), and for both at -O2 I see LDP emitted as expected for most of the identifiable int128 accesses (both in a standalone test harness and a quick survey of kernel code via 'aarch64-linux-gnu-objdump -S net/ipv4/*.o'). Of course, there may well be places where the compiler gets clever enough to elide all or part of that load where data is already held in registers - I've not audited *that* closely - but the whole point of having a pure C implementation is that it can be aggressively inlined more than inline asm ever could.
>
>> There were some prior discussions about GCC behavior, like this thread: https://patchwork.kernel.org/patch/9081911/ , in which you talked about the difference between GCC4 and GCC5.3. It looks to me this is regressed in Linaro GCC6.4 build.
>>
>> I have not checked newer GCC versions.
>>
>> Will it be more stable to just do this with inline assembly instead of relying on __uint128_t data type?
>>
>> GCC documentation says __int128 is supported for targets which have an integer mode wide enough to hold 128 bits. aarch64 doesn't have such an integer mode.
>
> Yet AArch64 GCC definitely does support __uint128_t, or this code wouldn't even build ;)
>
> Robin.

Thanks

Bo
