https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94860
            Bug ID: 94860
           Summary: Failure to recognize bzhi pattern
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

#include <stdint.h>

uint32_t bzhi32(uint32_t x, uint32_t y)
{
  return ((x << (32 - y)) >> (32 - y));
}

LLVM with -O3 -mbmi2 optimizes this to:

bzhi32(unsigned int, unsigned int): # @bzhi32(unsigned int, unsigned int)
  bzhi eax, edi, esi
  ret

GCC outputs this:

bzhi32(unsigned int, unsigned int):
  mov eax, 32
  sub eax, esi
  shlx edi, edi, eax
  shrx eax, edi, eax
  ret

GCC should recognize this shift pair and optimize it down to a single bzhi, as LLVM does.
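
As a rough illustration of why the rewrite is valid, the sketch below compares the shift pair against a portable model of BZHI. The name bzhi32_model is hypothetical, and the model assumes the documented behaviour of the instruction: the index is taken from the low 8 bits of the second operand, and the source is returned unchanged when the index is 32 or more. The C expression is only defined for 1 <= y <= 32 (other values shift by 32 or more, which is undefined), so that is the only range in which the two need to agree.

```c
#include <stdint.h>

/* The shift-pair idiom from this report; only defined for 1 <= y <= 32. */
static uint32_t bzhi32_shifts(uint32_t x, uint32_t y)
{
  return (x << (32 - y)) >> (32 - y);
}

/* Hypothetical portable model of BZHI (an assumption for illustration):
   clear every bit of x at position n and above, where n is the low
   8 bits of y; x is returned unchanged when n >= 32. */
static uint32_t bzhi32_model(uint32_t x, uint32_t y)
{
  uint32_t n = y & 0xff;
  if (n >= 32)
    return x;
  return x & ((UINT32_C(1) << n) - 1);
}
```

Calling both functions with the same x for every y from 1 to 32 should give identical results, which is the property the requested pattern match would rely on.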

This optimization can be applied to:
- x86-64 (with bzhi)
- i686 (with bzhi)
- AMDGCN (with v_bfe_u32)
