https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102014

            Bug ID: 102014
           Summary: [missed optimization] __uint128_t % uint64_t emits a
                    call to __umodti3 instead of div instruction
           Product: gcc
           Version: 11.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kamkaz at windowslive dot com
  Target Milestone: ---

The following code:

    #include <stdint.h>
    extern uint64_t safe_mul(uint64_t a, uint64_t b, uint64_t n) {
        return (((__uint128_t)a)*b)%n;
    }

compiled with -O2 for x86_64, generates the following assembly:

    safe_mul(unsigned long, unsigned long, unsigned long):
        mov     rax, rdi
        mov     r8, rdx
        sub     rsp, 8
        xor     ecx, ecx
        mul     rsi
        mov     rsi, rdx
        mov     rdi, rax
        mov     rdx, r8
        call    __umodti3
        add     rsp, 8
        ret

This calls __umodti3, while it could be compiled to:

    safe_mul(unsigned long, unsigned long, unsigned long):
        mov     rax, rdi
        mov     rcx, rdx
        mul     rsi
        div     rcx
        mov     rax, rdx
        ret
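
For illustration, the mul/div sequence can be approximated in C with GCC-style inline assembly. This is only a sketch: the helper name mulmod_div is hypothetical and not part of GCC or libgcc. One caveat it has to work around: the x86-64 div instruction raises a divide error when the quotient does not fit in 64 bits, which happens whenever the high half of the product is >= n. The sketch reduces a modulo n first so that the high half is always smaller than n; on other targets it falls back to the plain % operator.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from GCC or the report): (a * b) % n using a
   single 128-by-64-bit div on x86-64. */
static uint64_t mulmod_div(uint64_t a, uint64_t b, uint64_t n)
{
    assert(n != 0);               /* division by zero is undefined */

    /* With a < n, the product is below n * 2^64, so its high half is
       below n and the quotient computed by div fits in 64 bits. */
    a %= n;

    __uint128_t p = (__uint128_t)a * b;   /* GCC emits a single mul here */
    uint64_t lo = (uint64_t)p;
    uint64_t hi = (uint64_t)(p >> 64);
    uint64_t rem;

#if defined(__x86_64__)
    uint64_t quot;
    /* div: rdx:rax / operand -> quotient in rax, remainder in rdx */
    __asm__("divq %4"
            : "=a"(quot), "=d"(rem)
            : "0"(lo), "1"(hi), "rm"(n)
            : "cc");
    (void)quot;
#else
    (void)lo;
    (void)hi;
    rem = (uint64_t)(p % n);      /* portable fallback */
#endif
    return rem;
}
```

The extra a %= n costs a second div, so this is not the exact sequence proposed above; it only shows a case where using div in place of __umodti3 is safe for all inputs with n != 0.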

The same thing happens with the division __uint128_t / uint64_t: GCC emits
an unnecessary call to __udivti3 instead of a div instruction.
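
For the division case, a sketch of what a div-based fast path might look like is given below. The helper name muldiv_fast is hypothetical. Because the full quotient of a 128-bit by 64-bit division can exceed 64 bits, and div raises a divide error in that case, the sketch uses div only when the high half of the product is smaller than n and otherwise falls back to the generic 128-bit division (which is where __udivti3 comes in).

```c
#include <stdint.h>

/* Hypothetical helper (not from GCC or the report): (a * b) / n with a
   single-div fast path on x86-64. */
static __uint128_t muldiv_fast(uint64_t a, uint64_t b, uint64_t n)
{
    __uint128_t p = (__uint128_t)a * b;
    uint64_t lo = (uint64_t)p;
    uint64_t hi = (uint64_t)(p >> 64);

#if defined(__x86_64__)
    if (hi < n) {
        /* The quotient fits in 64 bits, so div cannot fault here. */
        uint64_t quot, rem;
        __asm__("divq %4"
                : "=a"(quot), "=d"(rem)
                : "0"(lo), "1"(hi), "rm"(n)
                : "cc");
        (void)rem;
        return quot;
    }
#else
    (void)lo;
    (void)hi;
#endif
    /* Generic path: the quotient may need more than 64 bits. */
    return p / n;
}
```

The hi < n test is the range check a compiler would also need to prove or emit before replacing the library call with a bare div.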
