On Tue, Aug 15, 2017 at 4:59 PM, Richard Biener <richard.guent...@gmail.com> wrote:
> So I'd try the "easy" way of expanding if (__builtin_cpu_supports ("sse4.1"))
> as the sse4.1 sequence is just a single instruction.  The interesting part
> of the story will be to make sure we can emit that even if ! TARGET_ROUND ...
>
> Uros, any idea how to accomplish this?  Or is the idea of a "local" ifunc
> better?  Note the ABI boundary will be expensive but I guess the conditional
> sequence as well (and it will disturb RA even if predicted to have SSE 4.1).

TARGET_ROUND is just:

/* SSE4.1 defines round instructions */
#define OPTION_MASK_ISA_ROUND OPTION_MASK_ISA_SSE4_1
#define TARGET_ISA_ROUND ((ix86_isa_flags & OPTION_MASK_ISA_ROUND) != 0)

I don't remember the history behind the #define. It probably made sense
once upon a time, but nowadays it looks like it can simply be replaced
with TARGET_SSE4_1.

Uros.
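
For reference, a minimal user-level sketch of the conditional sequence discussed above: test for SSE4.1 at run time and use the single round instruction only when it is available. This is not the compiler-internal expansion being proposed. The function names floor_sse41, floor_generic and floor_dispatch are hypothetical, floor is used only as an example of a rounding operation, and the fallback is deliberately simplified (it assumes |x| < 2^63).

```c
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_DISPATCH 1
#include <smmintrin.h>

/* Built with SSE4.1 enabled for this one function only, so it can be
   compiled even when the translation unit is not built with -msse4.1. */
__attribute__((target("sse4.1")))
static double floor_sse41(double x)
{
    __m128d v = _mm_set_sd(x);
    /* roundsd with round-toward-negative-infinity, exceptions suppressed */
    v = _mm_round_sd(v, v, _MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC);
    return _mm_cvtsd_f64(v);
}
#endif

/* Simplified fallback: only valid for finite x with |x| < 2^63. */
static double floor_generic(double x)
{
    double t = (double)(long long)x;   /* truncates toward zero */
    return (t > x) ? t - 1.0 : t;      /* adjust for negative non-integers */
}

double floor_dispatch(double x)
{
#ifdef HAVE_X86_DISPATCH
    /* Run-time CPU check; the branch is the "conditional sequence". */
    if (__builtin_cpu_supports("sse4.1"))
        return floor_sse41(x);
#endif
    return floor_generic(x);
}
```

A call such as floor_dispatch(-2.5) takes the roundsd path on CPUs that report SSE4.1 and the generic path otherwise. The branch and the out-of-line call are where the cost mentioned in the quoted text would show up.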