https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91400
--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <h...@gcc.gnu.org>:

https://gcc.gnu.org/g:a0b4e09ab0102d9c0a5e6d603a080cb78600c40d

commit r12-524-ga0b4e09ab0102d9c0a5e6d603a080cb78600c40d
Author: Ivan Sorokin <vanya...@gmail.com>
Date:   Mon May 3 11:39:48 2021 +0300

    x86: Build only one __cpu_model/__cpu_features2 variables

    GCC -O2 generated quite bad code for this function:

    bool
    f (void)
    {
      return __builtin_cpu_supports("popcnt")
             && __builtin_cpu_supports("ssse3");
    }

    f:
            movl    __cpu_model+12(%rip), %edx
            movl    %edx, %eax
            shrl    $6, %eax
            andl    $1, %eax
            andl    $4, %edx
            movl    $0, %edx
            cmove   %edx, %eax
            ret

    The problem was caused by the fact that internally every invocation of
    __builtin_cpu_supports built a new variable __cpu_model and a new type
    __processor_model. Because of this, GIMPLE level optimizers weren't able
    to CSE the loads of __cpu_model and optimize bit-operations properly.

    Improve GCC -O2 code generation by caching __cpu_model and
    __cpu_features2 variables as well as their types:

    f:
            movl    __cpu_model+12(%rip), %eax
            andl    $68, %eax
            cmpl    $68, %eax
            sete    %al
            ret

    2021-05-05  Ivan Sorokin  <vanya...@gmail.com>
                H.J. Lu  <hjl.to...@gmail.com>

    gcc/
            PR target/91400
            * config/i386/i386-builtins.c (ix86_cpu_model_type_node): New.
            (ix86_cpu_model_var): Likewise.
            (ix86_cpu_features2_type_node): Likewise.
            (ix86_cpu_features2_var): Likewise.
            (fold_builtin_cpu): Cache __cpu_model and __cpu_features2 with
            their types.

    gcc/testsuite/
            PR target/91400
            * gcc.target/i386/pr91400-1.c: New test.
            * gcc.target/i386/pr91400-2.c: Likewise.
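
For readers wondering where the constant 68 in the improved assembly comes from, here is a minimal sketch of the bit arithmetic. It is not code from the patch. The bit positions (2 for popcnt, 6 for ssse3) are inferred from the `andl $4` and `shrl $6` instructions quoted in the commit message, and the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, read off the assembly in the commit message
   (andl $4 tests bit 2, shrl $6 / andl $1 tests bit 6).  They are
   illustrative and not copied from GCC's headers.  */
#define SKETCH_BIT_POPCNT 2u
#define SKETCH_BIT_SSSE3  6u

/* Shape of the old code: two independent tests of the feature word,
   as produced when the two loads could not be recognized as the same.  */
bool supports_both_separate(uint32_t features)
{
    bool has_popcnt = (features & (1u << SKETCH_BIT_POPCNT)) != 0;
    bool has_ssse3 = ((features >> SKETCH_BIT_SSSE3) & 1u) != 0;
    return has_popcnt && has_ssse3;
}

/* Shape of the new code: one load, one combined mask.
   (1 << 2) | (1 << 6) == 4 + 64 == 68.  */
bool supports_both_combined(uint32_t features)
{
    const uint32_t mask = (1u << SKETCH_BIT_POPCNT) | (1u << SKETCH_BIT_SSSE3);
    return (features & mask) == mask;
}

/* Checks that both forms agree for every value of the low eight bits,
   which covers all combinations of the two bits involved.  */
void self_check(void)
{
    for (uint32_t w = 0; w < 256u; ++w)
        assert(supports_both_separate(w) == supports_both_combined(w));
}
```

The two functions compute the same result. The combined form only becomes available to the optimizer once both tests are known to read the same variable, which is what caching the `__cpu_model` declaration provides.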