https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
--- Comment #3 from Petar Ivanov <dartdart26 at gmail dot com> ---
Thank you for pointing out the output on x86! Following that, I checked O2 and O3 on ARM64 and I see differences, though I cannot say what their actual impact is:

O2: https://godbolt.org/z/P9Garznef
O3: https://godbolt.org/z/Yb1q33YP3

As for x86, I ran the benchmark in Quick Bench (I assume it runs on x86, as that is what the disassembly shows) and the results are similar to my findings on ARM64, with move being slower:
https://quick-bench.com/q/vK9eSYngutKGo4QSPcdra9gUOI0

The benchmark code seems correct to me, but I might be missing something: I might be misusing DoNotOptimize(), or there might be some side effects.

I am sure this is not a big deal. I was just wondering whether adding an if statement is doable; if so, it seems like a quick and easy win.