http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46943

           Summary: Unnecessary ZERO_EXTEND
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: ada
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: ja...@gcc.gnu.org
            Target: x86_64-linux


In http://blog.regehr.org/archives/320 Example 4
unsigned long long v;
unsigned short
foo (signed char x, unsigned short y)
{
  v = (unsigned long long) y;
  return (unsigned short) ((int) y / 3);
}
we emit a redundant zero-extension:
        movzwl  %si, %eax
        movq    %rax, v(%rip)
        movzwl  %si, %eax
        imull   $43691, %eax, %eax
        shrl    $17, %eax
        ret
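
As a side note on the arithmetic, the imull/shrl pair is the usual multiply-and-shift replacement for the division by 3: 43691 * 3 == 2^17 + 1, and 65535 * 43691 still fits in 32 bits. A minimal sketch in C that checks this over the whole unsigned short range (the helper names are made up for illustration, they are not anything in GCC):

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the imull/shrl pair: multiply by 43691, then shift right by 17. */
static uint16_t div3_mul_shift(uint16_t y)
{
    /* 65535 * 43691 is below 2^32, so the 32-bit product cannot wrap. */
    uint32_t product = (uint32_t)y * 43691u;
    return (uint16_t)(product >> 17);
}

/* Counts the 16-bit inputs for which the shortcut disagrees with y / 3. */
static unsigned div3_mismatches(void)
{
    unsigned bad = 0;
    for (uint32_t y = 0; y <= 0xFFFFu; y++)
        if (div3_mul_shift((uint16_t)y) != (uint16_t)((int)y / 3))
            bad++;
    return bad;
}

/* Hypothetical entry point: call this from main() to run the check. */
static void div3_self_check(void)
{
    assert(div3_mismatches() == 0);
}
```

The check only relies on y being already zero-extended, which is what the movzwl in front of the imull provides.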
The second movzwl %si, %eax wasn't CSEd with the first one because the first
one is (set (reg:DI reg1) (zero_extend:DI (reg:HI reg2))) while the second one
is (set (reg:SI reg3) (zero_extend:SI (reg:HI reg2))).  The two extensions have
different modes, so CSE doesn't see them as the same expression.  I wonder if
we can teach CSE to optimize it (say that reg3 is actually
(subreg:SI (reg:DI reg1) 0)), or whether one of the zee/see passes
(implicit-zee e.g.) could handle such cases.  The combiner can't do anything
here, as there is no data dependency between the two insns, so try_combine
won't see them together.
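
To illustrate why the subreg replacement would be safe, here is a small C model of the three RTL operations. It is only a sketch: the function names are invented for this example, and the truncating cast stands in for (subreg:SI (reg:DI reg1) 0), which is the low part on a little-endian target such as x86_64.

```c
#include <assert.h>
#include <stdint.h>

/* Models (zero_extend:DI (reg:HI reg2)). */
static uint64_t zext_hi_to_di(uint16_t y) { return (uint64_t)y; }

/* Models (zero_extend:SI (reg:HI reg2)). */
static uint32_t zext_hi_to_si(uint16_t y) { return (uint32_t)y; }

/* Models (subreg:SI (reg:DI reg1) 0): the low 32 bits of the DImode value. */
static uint32_t lowpart_si(uint64_t wide) { return (uint32_t)wide; }

/* Counts the HImode inputs for which the SImode extension differs from
   the low part of the DImode extension. */
static unsigned lowpart_mismatches(void)
{
    unsigned bad = 0;
    for (uint32_t y = 0; y <= 0xFFFFu; y++)
        if (lowpart_si(zext_hi_to_di((uint16_t)y)) != zext_hi_to_si((uint16_t)y))
            bad++;
    return bad;
}

/* Hypothetical entry point: call this from main() to run the check. */
static void lowpart_self_check(void)
{
    assert(lowpart_mismatches() == 0);
}
```

Both extensions leave the upper bits zero, so the SImode result is always the low half of the DImode one and the second movzwl computes nothing new.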
