https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59429
--- Comment #16 from Mathias Stearn <redbeard0531 at gmail dot com> ---
Trunk still generates different code for all cases (in some cases subtly so)
for both aarch64 and x86_64: https://www.godbolt.org/z/1s6sxrMWq.

For both arches, it seems like LE and LG generate the best code.

On aarch64, they probably all have the same throughput, but EL and EG have a
size penalty of one extra instruction.

On x86_64, there is much more variety. EL and EG both end up with a branch
rather than being branchless, which is probably bad since comparison functions
are often called in ways where the branches on the result are unpredictable.

GE and GL appear to have regressed since this ticket was created. They now do
the comparison twice rather than reusing the flags from the first comparison:

comGL(int, int):
        xor     eax, eax
        cmp     edi, esi
        mov     edx, 1
        setl    al
        neg     eax
        cmp     edi, esi
        cmovg   eax, edx
        ret
comGE(int, int):
        xor     eax, eax
        cmp     edi, esi
        mov     edx, 1
        setne   al
        neg     eax
        cmp     edi, esi
        cmovg   eax, edx
        ret
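
For readers without the Godbolt link open, here is a rough sketch of what the six variants presumably look like. This is a reconstruction, not the actual test source: it assumes the two letters in each name give the order in which the conditions are tested (L = less, E = equal, G = greater), with the third outcome as the fallthrough. That reading matches the comGL and comGE listings (setl or setne followed by neg for the -1 case, then cmovg for the 1 case), but the names comLE, comLG, comEL and comEG, and the use of ternaries rather than if/return chains, are guesses.

```cpp
// Hypothetical reconstruction of the six three-way comparison variants.
// Assumption: the name suffix lists the order of the tests
// (L: a < b, E: a == b, G: a > b); the untested outcome is the fallthrough.
// All six are semantically identical and return -1, 0 or 1.

int comLE(int a, int b) { return a < b ? -1 : (a == b ? 0 : 1); }
int comLG(int a, int b) { return a < b ? -1 : (a > b ? 1 : 0); }
int comEL(int a, int b) { return a == b ? 0 : (a < b ? -1 : 1); }
int comEG(int a, int b) { return a == b ? 0 : (a > b ? 1 : -1); }
int comGL(int a, int b) { return a > b ? 1 : (a < b ? -1 : 0); }
int comGE(int a, int b) { return a > b ? 1 : (a == b ? 0 : -1); }
```

Since all six compute the same function, ideally they would all compile to the same instruction sequence; the differences described above come only from the order in which the source spells out the tests.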