https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94860
Bug ID: 94860
Summary: Failure to recognize bzhi pattern
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---

uint32_t bzhi32(uint32_t x, uint32_t y)
{
    return ((x << (32 - y)) >> (32 - y));
}

LLVM with -O3 -mbmi2 optimizes this to:

bzhi32(unsigned int, unsigned int): # @bzhi32(unsigned int, unsigned int)
  bzhi eax, edi, esi
  ret

GCC outputs this:

bzhi32(unsigned int, unsigned int):
  mov eax, 32
  sub eax, esi
  shlx edi, edi, eax
  shrx eax, edi, eax
  ret

It should be optimized down to bzhi.

This optimization can be applied to:
- x86-64 (with bzhi)
- i686 (with bzhi)
- AMDGCN (with v_bfe_u32)
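
For reference, here is a rough sketch of why the shift pair and bzhi should agree. It is not taken from either compiler. bzhi32_model is a hypothetical portable model of the 32-bit instruction as I understand it (index taken from the low 8 bits, source passed through unchanged when the index is 32 or more), and self_check is a hypothetical helper. The comparison only covers y in 1..32, on the assumption that y == 0 makes the C source shift by 32, which is undefined behavior, so the compiler may pick any result there.

```c
#include <assert.h>
#include <stdint.h>

/* The shift-pair form from the report. Only defined for 1 <= y <= 32,
   because y == 0 would shift a 32-bit value by 32. */
static uint32_t bzhi32_shifts(uint32_t x, uint32_t y)
{
    return (x << (32 - y)) >> (32 - y);
}

/* Hypothetical model of 32-bit bzhi (an assumption, not compiler code):
   keep the low `index` bits of x, or all of x when index >= 32. */
static uint32_t bzhi32_model(uint32_t x, uint32_t y)
{
    uint32_t index = y & 0xFFu;   /* only the low 8 bits are used as index */
    if (index >= 32)
        return x;
    return x & ((UINT32_C(1) << index) - 1u);
}

/* Hypothetical helper: checks that both forms agree on a few sample
   values for every y where the C source is well defined. */
static void self_check(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 0x80000000u, 0xDEADBEEFu, 0xFFFFFFFFu
    };
    unsigned i;
    uint32_t y;

    for (i = 0; i < sizeof samples / sizeof samples[0]; i++)
        for (y = 1; y <= 32; y++)
            assert(bzhi32_shifts(samples[i], y) == bzhi32_model(samples[i], y));
}
```

Calling self_check() from a test driver exercises 5 sample values across all 32 well-defined values of y. This is a spot check, not a proof, but it illustrates that a single bzhi would be a valid replacement for the mov/sub/shlx/shrx sequence on that range.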