https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102014
Bug ID: 102014
Summary: [missed optimization] __uint128_t % uint64_t emits a call to __umodti3 instead of div instruction
Product: gcc
Version: 11.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: kamkaz at windowslive dot com
Target Milestone: ---

The following code:

#include <stdint.h>

extern uint64_t safe_mul(uint64_t a, uint64_t b, uint64_t n) {
    return (((__uint128_t)a)*b)%n;
}

compiled with -O2 for the x86_64 architecture generates the following assembly:

safe_mul(unsigned long, unsigned long, unsigned long):
        mov     rax, rdi
        mov     r8, rdx
        sub     rsp, 8
        xor     ecx, ecx
        mul     rsi
        mov     rsi, rdx
        mov     rdi, rax
        mov     rdx, r8
        call    __umodti3
        add     rsp, 8
        ret

This contains a call to __umodti3, while it could be compiled to:

safe_mul(unsigned long, unsigned long, unsigned long):
        mov     rax, rdx
        mul     rcx
        div     r8
        mov     rax, rdx
        ret

The same thing happens with the division __uint128_t / uint64_t, which emits an
unnecessary call to __udivti3 instead of a div instruction.
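
For illustration, here is a minimal sketch of what a div-based lowering could look like when written by hand with GCC extended asm. It is not what GCC emits or a proposed patch; the names safe_mul_ref and safe_mul_div are hypothetical and exist only for this example. One caveat: the x86 div instruction raises a divide error when the quotient does not fit in 64 bits, which happens whenever the high half of the product is not smaller than n. The sketch therefore reduces the high half modulo n first, so it needs two 64-bit divisions rather than the single div shown above. It assumes n != 0, AT&T asm syntax (the GCC default), and falls back to the plain __uint128_t expression on other targets.

```c
#include <assert.h>
#include <stdint.h>

/* Reference version from the report: 128-bit product reduced modulo n. */
uint64_t safe_mul_ref(uint64_t a, uint64_t b, uint64_t n)
{
    assert(n != 0);
    return (uint64_t)(((__uint128_t)a * b) % n);
}

/* Hypothetical hand-written variant that uses the hardware div instruction. */
uint64_t safe_mul_div(uint64_t a, uint64_t b, uint64_t n)
{
    assert(n != 0);
#if defined(__x86_64__) && defined(__GNUC__)
    __uint128_t p = (__uint128_t)a * b;   /* compiles to a single mul */
    uint64_t lo = (uint64_t)p;
    uint64_t hi = (uint64_t)(p >> 64);

    /* div faults if the quotient overflows 64 bits, i.e. if hi >= n.
       Reducing hi first keeps the remainder unchanged and makes div safe. */
    hi %= n;

    uint64_t q, r;
    /* divq divides RDX:RAX by the operand: quotient -> RAX, remainder -> RDX. */
    __asm__("divq %4"
            : "=a"(q), "=d"(r)
            : "0"(lo), "1"(hi), "r"(n)
            : "cc");
    (void)q;   /* only the remainder is needed */
    return r;
#else
    /* Assumption: other targets simply use the compiler's 128-bit support. */
    return safe_mul_ref(a, b, n);
#endif
}
```

If the caller can guarantee that a < n or b < n, the high half of the product is already smaller than n and the extra hi %= n step could be dropped, which would match the single-div sequence shown in the report.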