Hi! When compiling: void foo (void); void bar (float a, float b) { if (__builtin_expect (a != b, 1)) foo (); } void baz (float a, float b) { if (__builtin_expect (a == b, 1)) foo (); } void qux (float a, float b) { if (__builtin_expect (a != b, 0)) foo (); } void corge (float a, float b) { if (__builtin_expect (a == b, 0)) foo (); } on x86_64, we get (unimportant cruft removed): bar: ucomiss %xmm1, %xmm0 jp .L4 je .L1 .L4: jmp foo .L1: ret baz: ucomiss %xmm1, %xmm0 jp .L6 jne .L6 jmp foo .L6: ret qux: ucomiss %xmm1, %xmm0 jp .L13 jne .L13 ret .L13: jmp foo corge: ucomiss %xmm1, %xmm0 jnp .L18 .L14: ret .L18: jne .L14 jmp foo (note for bar and qux that changed with a patch I've posted earlier today). This is all reasonable, except the last function, the overall jump to the tail call is predicted unlikely (10%), so it is good jmp foo isn't on the straight line path, but NaNs are (or should be) considered very unlikely in the programs, so IMHO the right code (and one emitted with the following patch) is: corge: ucomiss %xmm1, %xmm0 jp .L14 je .L18 .L14: ret .L18: jmp foo
When splitting conditions, we have previous original prob as probability of jumping to true label and then use a cprob set to 99% for ORDERED and 1% for UNORDERED. For !and_them, we end up with splitting if (x) goto true; // prob goto false; into: if (y) goto true; // prob * cprob (== first_prob) if (z) goto true; // adjusted prob goto false; with first_prob = prob.split (cprob); being the computation. For and_them, we instead split: if (x) goto true; // prob goto false; into: if (y) goto false; // 1 - first_prob if (z) goto true; // adjusted prob goto false; and first_prob being computed as: prob = prob.invert (); first_prob = prob.split (cprob).invert (); prob = prob.invert (); This sort of works if the initial probability is likely (larger than even), but with original prob being e.g. 10%, we compute first_prob as 10.9% - 1 - ((1 - 0.1) * 0.99) - as EQ splits into ORDERED && something, but we use first_prob on recursive call where we pass ORDERED and the false label and NULL true label. That means we predict the case where none of the arguments is NaN is only 10.9%, which is very unlikely. The following patch uses a different computation, taking into account that for the and_them case, if first_code is ORDERED that first_prob.invert () should be the probability of the inverted comparison and UNORDERED should be considered unlikely, so it uses first_prob = prob.split (cprob.invert ()).invert (); without the two prob = prob.invert (); around it and that seems to work fine for all the cases, but I'm not 100% sure if that is right. Or in the PR there is yet another variant that also seems to work. Bootstrapped/regtested on x86_64-linux and i686-linux. 2020-12-10 Jakub Jelinek <ja...@redhat.com> PR rtl-optimization/98212 * dojump.c (do_compare_rtx_and_jump): Change computation of first_prob for and_them and don't invert prob around it. * gcc.dg/predict-8.c: Adjust expected probability. --- gcc/dojump.c.jj 2020-12-09 15:11:17.042888002 +0100 +++ gcc/dojump.c 2020-12-09 20:05:59.535234206 +0100 @@ -1148,9 +1148,8 @@ do_compare_rtx_and_jump (rtx op0, rtx op if (and_them) { rtx_code_label *dest_label; - prob = prob.invert (); - profile_probability first_prob = prob.split (cprob).invert (); - prob = prob.invert (); + profile_probability first_prob + = prob.split (cprob.invert ()).invert (); /* If we only jump if true, just bypass the second jump. */ if (! if_false_label) { --- gcc/testsuite/gcc.dg/predict-8.c.jj 2020-01-12 11:54:37.506396918 +0100 +++ gcc/testsuite/gcc.dg/predict-8.c 2020-12-10 10:34:00.632123682 +0100 @@ -8,4 +8,4 @@ int foo(float a, float b) { return 2; } -/* { dg-final { scan-rtl-dump-times "65.\[34]. .guessed" 2 "expand"} } */ +/* { dg-final { scan-rtl-dump-times "99.\[345]. .guessed" 2 "expand"} } */ Jakub