https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87608
Alexander Monakov <amonakov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #1 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
Note the compiler can evaluate the initialization loop and then also evaluate
the effect of static_sort1 call, so the testcase might give misleading results.
To avoid that, pass the address of 'a' to rdtsc, or introduce a compiler
barrier with an asm:

    asm volatile ("" :: "r"(a) : "memory");

Furthermore, note that the CPU executes the rdtsc instruction without waiting
for all preceding computations to complete. Using lfence just before rdtsc will
ensure that rdtsc reads the cycle counter only after all preceding computations
are done.

On this testcase I think LLVM introduces ternary select operations in the IR
fairly early and then works with straight-line code; in contrast, in GCC we
scalarize the array and have a soup of BBs and phi nodes throughout gimple
passes, which would be very hard to properly clean up on rtl.
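For reference, the two measurement suggestions above (the compiler barrier and lfence before rdtsc) could be combined roughly as follows. This is only a sketch under assumptions, not the testcase from the report: it assumes GCC or Clang on x86-64, and the names fenced_rdtsc, compiler_barrier, sort_under_test and time_sort are hypothetical. In particular, sort_under_test is a plain insertion sort standing in for static_sort1, whose actual definition is not shown here.

```c
#include <assert.h>
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc and _mm_lfence (GCC/Clang, x86) */

/* Read the cycle counter only after preceding instructions have completed. */
static inline uint64_t fenced_rdtsc(void)
{
    _mm_lfence();
    return __rdtsc();
}

/* Compiler barrier: the empty asm takes the pointer as an input and clobbers
   memory, so the compiler has to assume the array may be read or modified
   here and cannot fold the surrounding computation at compile time. */
static inline void compiler_barrier(void *p)
{
    __asm__ volatile ("" :: "r"(p) : "memory");
}

/* Hypothetical stand-in for static_sort1: a plain insertion sort. */
static void sort_under_test(int *a, int n)
{
    for (int i = 1; i < n; i++) {
        int key = a[i];
        int j = i - 1;
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;
    }
}

/* Returns the difference between two fenced TSC reads around the sort. */
static uint64_t time_sort(int *a, int n)
{
    assert(a != 0 && n >= 0);

    compiler_barrier(a);          /* input is opaque to the optimizer */
    uint64_t t0 = fenced_rdtsc();
    sort_under_test(a, n);
    compiler_barrier(a);          /* result is treated as observed */
    uint64_t t1 = fenced_rdtsc();
    return t1 - t0;
}
```

Calling time_sort(a, n) on an initialized array sorts it in place and returns the elapsed TSC ticks. The barrier before the first read keeps the initialization from being constant-folded into the sort, and the barrier after the sort keeps the sort from being dropped as dead code.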