On Thu, Apr 13, 2017 at 11:33:12AM +0000, Wilco Dijkstra wrote:
> Jakub Jelinek wrote:
>
> > No.  Some constants sometimes even 7 instructions (e.g. sparc64; not talking
> > in particular about 1ULL << 63 constant), or have one instruction
> > that is more expensive than normal small constant load.  Compare say x86_64
> > movl/movq vs. movabsq, I think the latter has 3 times longer latency on many
> > CPUs.  So no, I think it isn't an unconditional win.
>
> We're specifically only talking about the constants (1L << 63), (1 << 31) and
> (1 << 15).
> On all targets these need at most 2 simple instructions. That makes it an
> unconditional win.

It is not a win on at least Haswell-E:

__attribute__((noinline, noclone)) unsigned long long int
foo (int x)
{
  asm volatile ("" : : : "memory");
  return 1ULL << (63 - x);
}

__attribute__((noinline, noclone)) unsigned long long int
bar (int x)
{
  asm volatile ("" : : : "memory");
  return (1ULL << 63) >> x;
}

int
main (int argc, const char **argv)
{
  int i;
  if (argc == 1)
    for (i = 0; i < 1000000000; i++)
      asm volatile ("" : : "r" (foo (13)));
  else
    for (i = 0; i < 1000000000; i++)
      asm volatile ("" : : "r" (bar (13)));
  return 0;
}

$ time /tmp/test

real	0m1.290s
user	0m1.288s
sys	0m0.002s

$ time /tmp/test 1

real	0m1.542s
user	0m1.540s
sys	0m0.002s

As I said, movabsq is expensive.

	Jakub
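
P.S. For completeness, a minimal sketch (separate from the benchmark) that checks the two forms really compute the same value for every shift count 0..63, so the two functions differ only in the instruction sequence the compiler emits for them. The names shl_form, shr_form and count_mismatches are made up for this illustration.

```c
/* Hypothetical helper names, chosen for illustration only. */

/* Form used by foo: small constant 1, variable shift count 63 - x. */
static unsigned long long shl_form(int x)
{
    return 1ULL << (63 - x);
}

/* Form used by bar: large constant 1ULL << 63, shifted right by x. */
static unsigned long long shr_form(int x)
{
    return (1ULL << 63) >> x;
}

/* Count the shift counts in 0..63 for which the two forms disagree. */
static int count_mismatches(void)
{
    int x, bad = 0;

    for (x = 0; x < 64; x++)
        if (shl_form(x) != shr_form(x))
            bad++;
    return bad;
}
```

Calling count_mismatches () from a small main should report no disagreement; only x in 0..63 is checked, since outside that range the shifts are undefined in C.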