https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107905
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Alexander Monakov from comment #3)
> LLVM does a better job at code layout, and massively wins on the amount of
> executed branches (in particular unconditional jumps). With
> -fdisable-rtl-bbro gcc achieves a similar performance.

-freorder-blocks-algorithm=simple seems to improve it there too.
This does sound like it is all accidental, really. Throwing some [[unlikely]]
around seems to get "better" layout.

I suspect it all depends on the inputs, and there might not actually be a
performance difference if the inputs are better "weighted" ....
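
For illustration only, here is a minimal sketch of what "throwing some
[[unlikely]] around" could look like. This is not the test case from this
report: the function name sum_valid and its parameters are hypothetical, and
the assumption that negative entries are rare is made up for the example.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical example, not the code from the bug report.
// Sums the non-negative entries of data[0..n) and counts the negative
// ones in `rejected`. The negative case is assumed to be rare.
std::int64_t sum_valid(const int* data, std::size_t n, std::size_t& rejected) {
    std::int64_t total = 0;
    rejected = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // C++20 attribute: hints that this branch is rarely taken, which
        // the compiler may use when deciding how to lay out basic blocks.
        if (data[i] < 0) [[unlikely]] {
            ++rejected;
            continue;
        }
        total += data[i];
    }
    return total;
}
```

The attribute is only a hint and does not change the result of the function.
Whether it helps depends on whether the real inputs match the assumed
weighting; comparing the generated assembly with and without it (or with
-freorder-blocks-algorithm=simple) is one way to see what the layout does.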
