In the general ccmp scenario, the tree sequence looks like:

_1 = (a < b)
_2 = (c < d)
_3 = _1 & _2

The current ccmp expansion tries both compare orders for _1 and _2,
compares the costs (cost1 and cost2) of expanding each order, and
returns the sequence with the lower cost.

x86 ccmp does not support an FP compare as a ccmp operand, but it does
support an FP comi + integer ccmp sequence.  With the current cost
comparison model, the FP comi + integer ccmp sequence can never be
generated: the model does not check whether expand_ccmp_next returned
a valid result, and the rtl cost of the empty ccmp sequence is always
smaller.

Check the expand_ccmp_next results ret and ret2, and return the valid
one before comparing costs.

gcc/ChangeLog:

        * ccmp.cc (expand_ccmp_expr_1): Check ret and ret2 of
        expand_ccmp_next, and return the valid one before
        comparing cost.
---
 gcc/ccmp.cc | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
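
Note for reviewers: the selection rule implemented by the hunk below
can be read as the following standalone predicate.  This is only a
sketch for illustration; the function name use_second_order and its
plain-pointer/unsigned parameters are hypothetical and do not appear
in ccmp.cc.

```cpp
// Hypothetical stand-in for the decision made in expand_ccmp_expr_1.
// ret and ret2 model the results of the two expand_ccmp_next calls
// (a null pointer means that compare order could not be expanded);
// cost1 and cost2 are the costs of the two candidate sequences.
// Returns true when the second (swapped) order should be used.
static bool
use_second_order (const void *ret, const void *ret2,
                  unsigned cost1, unsigned cost2)
{
  if (!ret && ret2)
    return true;          // only the second order is valid
  if (ret && !ret2)
    return false;         // only the first order is valid
  return cost2 < cost1;   // both valid or both null: compare cost
}
```

The three branches are equivalent to the single condition
(!ret && ret2) || (!(ret && !ret2) && cost2 < cost1) used in the patch.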

diff --git a/gcc/ccmp.cc b/gcc/ccmp.cc
index 7cb525addf4..4b424220068 100644
--- a/gcc/ccmp.cc
+++ b/gcc/ccmp.cc
@@ -247,7 +247,17 @@ expand_ccmp_expr_1 (gimple *g, rtx_insn **prep_seq, rtx_insn **gen_seq)
              cost2 = seq_cost (prep_seq_2, speed_p);
              cost2 += seq_cost (gen_seq_2, speed_p);
            }
-         if (cost2 < cost1)
+
+         /* On x86, ccmp does not support FP operands, but an fcomi
+            insn can set EFLAGS and be followed by an integer ccmp.
+            So if one of the operands is an FP compare, ret or ret2
+            can be NULL; the cost of the corresponding empty sequence
+            is then always smaller, and the NULL sequence would be
+            returned.  Check ret and ret2 and return the valid one
+            if the other is NULL.  */
+         if ((!ret && ret2)
+             || (!(ret && !ret2)
+                 && cost2 < cost1))
            {
              *prep_seq = prep_seq_2;
              *gen_seq = gen_seq_2;
-- 
2.31.1
