https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65871
            Bug ID: 65871
           Summary: bzhi builtin/intrinsic wrongly assumes bzhi instruction
                    doesn't set the ZF flag
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jamrial at gmail dot com

unsigned foo(void);

int main(void)
{
    if (__builtin_ia32_bzhi_si(foo(), foo()))
        return 1;
    return 0;
}

Compiled with -mbmi2 -O3

0000000000000000 <main>:
   0:   53                      push   rbx
   1:   e8 00 00 00 00          call   6 <main+0x6>
   6:   89 c3                   mov    ebx,eax
   8:   e8 00 00 00 00          call   d <main+0xd>
   d:   c4 e2 60 f5 c0          bzhi   eax,eax,ebx
  12:   85 c0                   test   eax,eax
  14:   0f 95 c0                setne  al
  17:   0f b6 c0                movzx  eax,al
  1a:   5b                      pop    rbx
  1b:   c3                      ret

GCC generates a redundant test instruction here. According to
http://www.felixcloutier.com/x86/BZHI.html, bzhi already sets the ZF flag
on its own, so the setne could consume the flags from bzhi directly.

The same happens when using inline assembly instead of the builtin to
generate the bzhi instruction.

In all cases this is reproducible with GCC 4.9.2 and GCC 5.1.0. I didn't
test the 4.8 branch or trunk.

This aside, it would be nice if gcc could generate a bzhi instruction on its
own when it detects "X & ((1 << Y) - 1)" where Y is not a constant, the same
as it does for several other bmi and tbm instructions, instead of needing the
builtin (which is only available when targeting bmi2). I can open a new bug
report for that if needed.
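For reference, the relationship between the "X & ((1 << Y) - 1)" idiom and
bzhi can be sketched in portable C. This is only an illustration and not
GCC code: bzhi_model, mask_idiom and idiom_matches_model are hypothetical
helper names, and the model follows the documented 32-bit behaviour as I
understand it (index taken from the low 8 bits of the second operand, source
passed through unchanged when the index is 32 or more). The idiom itself is
only defined in C for Y < 32, so that is the range compared.

```c
#include <stdint.h>

/* Hypothetical software model of 32-bit BZHI: the index comes from the
   low 8 bits of the second operand; bits at positions >= index are
   cleared, and the source is returned unchanged when index >= 32. */
static uint32_t bzhi_model(uint32_t src, uint32_t index)
{
    uint32_t n = index & 0xFFu;
    if (n >= 32)
        return src;
    return src & ((UINT32_C(1) << n) - 1u);
}

/* The mask idiom from the report. Shifting by 32 or more is undefined
   in C, so callers must keep y < 32. */
static uint32_t mask_idiom(uint32_t x, uint32_t y)
{
    return x & ((UINT32_C(1) << y) - 1u);
}

/* Returns 1 if the idiom and the model agree for every y in 0..31. */
static int idiom_matches_model(uint32_t x)
{
    for (uint32_t y = 0; y < 32; ++y)
        if (mask_idiom(x, y) != bzhi_model(x, y))
            return 0;
    return 1;
}
```

A result of zero from either helper corresponds to the case where the
instruction would leave ZF set, which is the condition the extra test
instruction recomputes in the disassembly above.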