https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102929

            Bug ID: 102929
           Summary: [missed optimization] two ways to
                    rounddown-to-next-multiple
           Product: gcc
           Version: 11.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jengelh at inai dot de
  Target Milestone: ---

Input
=====
unsigned long calc(unsigned long x, unsigned long y)
{
        return x/y*y;
}
unsigned long calc2(unsigned long x, unsigned long y)
{
        return x - x % y;
}

Observed
========
» g++ -O3 -c x.c; objdump -Mintel -d x.o
gcc version 11.2.1 20210816 [revision 056e324ce46a7924b5cf10f61010cf9dd2ca10e9]
(SUSE Linux) x86_64
0000000000000000 <_Z4calcmm>:
   0:   48 89 f8                mov    rax,rdi
   3:   31 d2                   xor    edx,edx
   5:   48 f7 f6                div    rsi
   8:   48 0f af c6             imul   rax,rsi
   c:   c3                      ret    
   d:   0f 1f 00                nop    DWORD PTR [rax]

0000000000000010 <_Z5calc2mm>:
  10:   48 89 f8                mov    rax,rdi
  13:   31 d2                   xor    edx,edx
  15:   48 f7 f6                div    rsi
  18:   48 89 f8                mov    rax,rdi
  1b:   48 29 d0                sub    rax,rdx
  1e:   c3                      ret    

Expected
========
I do not see any obvious differences in the outcome of the two C functions, so
I would expect that, ideally, both should lead to the same asm (either by
making calc use div-mov-sub, or by making calc2 use div-imul, whichever the
machine descriptions determine to be more beneficial).
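
The equivalence follows from the C rule that (x/y)*y + x%y == x whenever
x/y is representable, which always holds for unsigned operands with y != 0.
A minimal sketch of a brute-force cross-check over small inputs is below;
count_mismatches is a hypothetical helper added for illustration and is not
part of the original test case.

```c
/* The two functions from the report, unchanged. */
unsigned long calc(unsigned long x, unsigned long y)
{
        return x / y * y;
}

unsigned long calc2(unsigned long x, unsigned long y)
{
        return x - x % y;
}

/*
 * Hypothetical helper (not in the report): counts the input pairs with
 * x in [0, limit) and y in [1, limit] for which calc and calc2 disagree.
 * y starts at 1 because division by zero is undefined in both functions.
 */
unsigned long count_mismatches(unsigned long limit)
{
        unsigned long mismatches = 0;

        for (unsigned long x = 0; x < limit; x++)
                for (unsigned long y = 1; y <= limit; y++)
                        if (calc(x, y) != calc2(x, y))
                                mismatches++;
        return mismatches;
}
```

A call such as count_mismatches(200) should return 0. This only samples a
small part of the input space, so it illustrates the identity rather than
proving it.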
