CC'd Richard for ccmp part as previously it is added only for aarch64. The original logic will not interrupted since if aarch64_gen_ccmp_first succeeded, aarch64_gen_ccmp_next will also success, the cmp/fcmp and ccmp/fccmp supports all GPI/GPF, and the prepare_operand will fixup the input that cmp supports but ccmp not, so ret/ret2 will all be valid when comparing cost. Thanks in advance.
Hongyu Wang <hongyu.w...@intel.com> 于2024年5月15日周三 16:22写道: > > For general ccmp scenario, the tree sequence is like > > _1 = (a < b) > _2 = (c < d) > _3 = _1 & _2 > > current ccmp expanding will try to swap compare order for _1 and _2, > compare the cost/cost2 between compare _1 and _2 first, then return the > sequence with lower cost. > > For x86 ccmp, we don't support FP compare as ccmp operand, but we > support fp comi + int ccmp sequence. With current cost comparison > model, the fp comi + int ccmp can never be generated since it doesn't > check whether expand_ccmp_next returns available result and the rtl > cost for the empty ccmp sequence is always smaller. > > Check the expand_ccmp_next result ret and ret2, returns the valid one > before cost comparison. > > gcc/ChangeLog: > > * ccmp.cc (expand_ccmp_expr_1): Check ret and ret2 of > expand_ccmp_next, returns the valid one first before > comparing cost. > --- > gcc/ccmp.cc | 12 +++++++++++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/gcc/ccmp.cc b/gcc/ccmp.cc > index 7cb525addf4..4b424220068 100644 > --- a/gcc/ccmp.cc > +++ b/gcc/ccmp.cc > @@ -247,7 +247,17 @@ expand_ccmp_expr_1 (gimple *g, rtx_insn **prep_seq, > rtx_insn **gen_seq) > cost2 = seq_cost (prep_seq_2, speed_p); > cost2 += seq_cost (gen_seq_2, speed_p); > } > - if (cost2 < cost1) > + > + /* For x86 target the ccmp does not support fp operands, but > + have fcomi insn that can produce eflags and then do int > + ccmp. So if one of the op is fp compare, ret1 or ret2 can > + fail, and the cost of the corresponding empty seq will > + always be smaller, then the NULL sequence will be returned. > + Add check for ret and ret2, returns the available one if > + the other is NULL. */ > + if ((!ret && ret2) > + || (!(ret && !ret2) > + && cost2 < cost1)) > { > *prep_seq = prep_seq_2; > *gen_seq = gen_seq_2; > -- > 2.31.1 >