s Hi!
On Thu, May 08, 2025 at 07:44:26PM -0400, Michael Meissner wrote: > In bug PR target/118541 on power9, power10, and power11 systems, for the > function: > > extern double __ieee754_acos (double); > > double > __acospi (double x) > { > double ret = __ieee754_acos (x) / 3.14; > return __builtin_isgreater (ret, 1.0) ? 1.0 : ret; > } > > GCC currently generates the following code: > > Power9 Power10 and Power11 > ====== =================== > bl __ieee754_acos bl __ieee754_acos@notoc > nop plfd 0,.LC0@pcrel > addis 9,2,.LC2@toc@ha xxspltidp 12,1065353216 > addi 1,1,32 addi 1,1,32 > lfd 0,.LC2@toc@l(9) ld 0,16(1) > addis 9,2,.LC0@toc@ha fdiv 0,1,0 > ld 0,16(1) mtlr 0 > lfd 12,.LC0@toc@l(9) xscmpgtdp 1,0,12 > fdiv 0,1,0 xxsel 1,0,12,1 > mtlr 0 blr > xscmpgtdp 1,0,12 > xxsel 1,0,12,1 > blr > > This is because ifcvt.c optimizes the conditional floating point move to use > the > XSCMPGTDP instruction. No. This is because we erroneously use the xscmpgtodp insn. It is not an optimisation; it is a bug. > However, the XSCMPGTDP instruction will generate an interrupt if one of the > arguments is a signalling NaN and signalling NaNs can generate an interrupt. Most FP insns signal if given a signaling NaN. That is not the problem here. This insn however raises the "invalid compare" exception whenever there is a NaN as input (quiet or signaling). That is not what isgreater is supposed to do. > The IEEE comparison functions (isgreater, etc.) require that the comparison > not > raise an interrupt. There may or may not be an interrupt anyway. Probably not, VE=0 typically. But there is an exception, yes. > The following patch changes the PowerPC back end so that ifcvt.c will not > change > the if/then test and move into a conditional move if the comparison is one of > the comparisons that do not raise an error with signalling NaNs and -Ofast is > not used. If a normal comparison is used or -Ofast is used, GCC will continue > to generate XSCMPGTDP and XXSEL. And that is still wrong. We do not want to generate code that does the wrong thing unless the -ffinite-math-only promise is miraculously upheld by the program and its input data, for absolutely no gain nonetheless. > with the following patch, GCC generates the following for power9, power10, and > power11: > > ordered_compare: > fcmpu 0,1,2 > fmr 1,4 > bnglr 0 > fmr 1,3 > blr > > normal_compare: > xscmpgtdp 1,1,2 > xxsel 1,4,3,1 > blr fcmpu is called "unordered compare" in the Power ISA btw :-) It probably is wise to avoid the "ordered" and "unordered" names, they mean different (sometime contradictory!) things in different contexts. Most often you talk about an fp comparison result to be "ordered", i.e. lt or gt; this is the naming GCC uses as well btw. > ;; comparisons that generate a 0/-1 mask (i.e. the inverse of > ;; fpmask_comparison_operator). > +;; > +;; invert_fpmask_comparison_operator is used to form floating point > conditional > +;; moves on power9. The instructions that would be generated (xscmpeqdp, > +;; xscmpgtdp, or xscmpgedp) will raise an error if one of the arguments is a > +;; signalling NaN. Don't allow the test to be inverted if NaNs are supported > +;; and the comparison is an ordered comparison. It will raise an exception (not an interrupt, not an "error" (whatever that may mean)), if any input is *any* NaN. > (define_predicate "invert_fpmask_comparison_operator" > - (match_code "ne,unlt,unle")) > + (ior (match_code "ne") > + (and (match_code "unlt,unle") > + (match_test "flag_finite_math_only")))) This is plain wrong. NAK. > +/* Whether we can reverse the sense of an ordered (UNLT, UNLE, UNGT, UNGE, > + UNEQ, or LTGT) comparison. If we are doing floating point conditional > moves > + on power9 and above, we cannot convert an ordered comparison to unordered, > + since the instructions (XSCMP{EQ,GT,GE}DP) that are used for conditional > + moves can trap if an argument is a signalling NaN. However for normal > jumps > + we can reverse a comparison since we only use unordered compare > instructions > + which do not trap on signalling NaNs. */ > + > +enum class rev_cond_ordered { > + ordered_ok, > + no_ordered > +}; As is this; the whole *concept* here makes no sense at all. > /* Can the condition code MODE be safely reversed? This is safe in > all cases on this port, because at present it doesn't use the > - trapping FP comparisons (fcmpo). */ > + trapping FP comparisons (fcmpo). > + > + However, this is not safe for ordered comparisons (i.e. for isgreater, > etc.) > + starting with the power9 because ifcvt.cc will want to create a fp cmove, > + and the x{s,v}cmp{eq,gt,ge}{dp,qp} instructions will trap if one of the > + arguments is a signalling NaN. */ > #define REVERSIBLE_CC_MODE(MODE) 1 Those insns *are* ordered comparison insns (in the Power ISA sense), and they are unwanted for all the same reasons as we do not want the simpler similar insns. So please make a patch that instead of unnecessarily and wrongly changing the fundamentals of how we handle FP comparisons in our backend, just fixes the one pattern where we already messed up? Segher