https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111143

--- Comment #5 from Paul Eggert <eggert at cs dot ucla.edu> ---
(In reply to Alexander Monakov from comment #4)

> To evaluate scheduling aspect, keep 'mov eax, 1' while changing 'add rbx,
> rax' to 'add rbx, 1'.

Adding the (unnecessary) 'mov eax, 1' doesn't affect the timing much, which is
what I would expect on a newer processor.

When I reran the benchmark on the same laptop (Intel i5-1335U), I got 3.289s
for GCC-generated code, 2.256s for the "38% faster" code (now it's 46% faster;
I don't know why), and 2.260s for the faster code with the unnecessary
'mov eax, 1' inserted.
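
For anyone checking the arithmetic, here is a minimal sketch that recomputes the percentages from the three timings above. It assumes "N% faster" means the ratio of run times minus one; the variable and function names are made up for illustration and are not part of the benchmark.

```python
# Timings in seconds, copied from the comment above.
gcc_time = 3.289       # GCC-generated code
fast_time = 2.256      # the "38% faster" code
fast_mov_time = 2.260  # faster code plus the unnecessary 'mov eax, 1'


def percent_faster(baseline, candidate):
    """Speed gain of candidate over baseline, in percent.

    Assumption: "faster" is measured as baseline/candidate - 1,
    i.e. a ratio of run times, not a reduction in run time.
    """
    return (baseline / candidate - 1.0) * 100.0


# Gain of the hand-modified code over the GCC-generated code.
speedup = percent_faster(gcc_time, fast_time)

# Slowdown caused by the extra 'mov eax, 1', relative to the faster code.
mov_cost = (fast_mov_time / fast_time - 1.0) * 100.0

print(f"faster code vs GCC: {speedup:.1f}% faster")
print(f"extra mov eax, 1:   {mov_cost:.2f}% slower")
```

Under that definition the gain rounds to 46%, and the extra 'mov' costs well under 1%, which is likely within run-to-run noise for a single measurement.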
