https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106323

Wilco <wilco at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wilco at gcc dot gnu.org

--- Comment #3 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #1)
> GCC might be better if the first bytes are in cache but the next bytes are
> not and then branch is predictable (which it might be).
> 
> So this is much more complex than just changing this really.

Neither sequence is efficient. Caches are not really relevant here; it's more
about giving a wide OoO core lots of useful parallel work to do, which means
avoiding unnecessary instructions and branches that just slow you down. Hence
4 loads and CMP+CCMP is best.
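
As a rough illustration of the shape argued for above, here is a minimal C
sketch. It assumes the case under discussion is an equality test of two
16-byte blocks; the test case is not quoted in this comment, so that is an
assumption. The helper name equal16 is hypothetical. Whether a given compiler
actually emits four loads followed by CMP+CCMP for it depends on the target
and options.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not taken from the bug report: equality test of
   two 16-byte blocks (assumed to be the case under discussion). */
static int equal16(const void *a, const void *b)
{
    uint64_t a0, a1, b0, b1;

    /* Four independent 8-byte loads. memcpy keeps this well-defined for
       unaligned pointers and avoids aliasing problems. */
    memcpy(&a0, a, 8);
    memcpy(&a1, (const char *)a + 8, 8);
    memcpy(&b0, b, 8);
    memcpy(&b1, (const char *)b + 8, 8);

    /* Non-short-circuit '&' instead of '&&': both comparisons can be
       combined without an intermediate branch, which is the pattern a
       compare plus conditional compare is meant for. */
    return (a0 == b0) & (a1 == b1);
}
```

The point of the sketch is that no load depends on the result of an earlier
comparison, so a wide out-of-order core can issue all four loads at once.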
