On Oct 27, 2020, Richard Biener <richard.guent...@gmail.com> wrote: > For trapping math SRC may be a constant? Better be safe > and guard against TREE_CODE (src) != SSA_NAME.
*nod* > You also want to guard against SSA_NAME_OCCURS_IN_ABNORMAL_PHI (src) > since you cannot generally propagate or move uses of those. What if I don't propagate or move them? In my first cut at this change, I figured it (looked like it) took half a brain to implement it, and shut down the other half, only making minor tweaks to a copy of execute_cse_sincos_1. Your response was a wake-up call to many issues I hadn't considered but should have, and that it looks like sincos doesn't either, so I ended up rewriting the cse_conv function to use and propagate a preexisting dominating def, instead of inserting one at a point where there wasn't one. That's because any such conv might trap, unless an earlier one didn't. > Any reason you are not replacing all uses via replace_uses_by > and removing the old conversion stmts? OK, removing might > wreck the upthread iterator so replacing with GIMPLE_NOP > is the usual trick then. Removing them would be fine, I was just thoughtlessly mirroring the cse_sincos_1 behavior that I'd based it on. Now, I put in code to leave conversion results alone if SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs), but I wonder if it would work to replace uses thereof, or to propagate uses thereof, as long as I turned such conversion results that would be deleted into copies. I.e., given: _2 = (double) _1; _4 = sin (_2); ... _3 = (double) _1; // occurs in abnormal phi below _5 = cos (_3); ... _6 = PHI <..., _3 (abnormal), ...>; Wouldn't this be ok to turn into: _2 = (double) _1; _4 = sin (_2); ... _3 = _2; _5 = cos (_2); ... _6 = PHI <..., _3 (abnormal), ...>; ? Anyway, here's a patch that does *not* assume it would work, and skips defs used in abnormal phis instead. sincos, however, may mess with them, and even introduce/move a trapping call to a place where there wasn't any, even if none of the preexisting calls were going to be executed. The only thing I fixed there was a plain return within a FOR_EACH_IMM_USE_STMT that I introduced recently. Though it's unlikely to ever hit, I understand it's wrong per the current API. BTW, any reason why we are not (yet?) using something like: #define FOR_EACH_IMM_USE_STMT(STMT, ITER, SSAVAR) \ for (auto_end_imm_use_stmt_traverse auto_end \ ((((STMT) = first_imm_use_stmt (&(ITER), (SSAVAR))), \ &(ITER))); \ !end_imm_use_stmt_p (&(ITER)); \ (void) ((STMT) = next_imm_use_stmt (&(ITER)))) Anyway, here's what I'm testing now. Bootstrap succeeded, regression testing underway. Ok to install if it succeeds? CSE conversions within sincos From: Alexandre Oliva <ol...@adacore.com> On platforms in which Aux_[Real_Type] involves non-NOP conversions (e.g., between single- and double-precision, or between short float and float), the conversions before the calls are CSEd too late for sincos to combine calls. This patch enables the sincos pass to CSE type casts used as arguments to eligible calls before looking for other calls using the same operand. for gcc/ChangeLog * tree-ssa-math-opts.c (sincos_stats): Add conv_removed. (execute_cse_conv_1): New. (execute_cse_sincos_1): Call it. Fix return within FOR_EACH_IMM_USE_STMT. (pass_cse_sincos::execute): Report conv_inserted. for gcc/testsuite/ChangeLog * gnat.dg/sin_cos.ads: New. * gnat.dg/sin_cos.adb: New. * gcc.dg/sin_cos.c: New. --- gcc/testsuite/gcc.dg/sin_cos.c | 41 ++++++++++++++ gcc/testsuite/gnat.dg/sin_cos.adb | 14 +++++ gcc/testsuite/gnat.dg/sin_cos.ads | 4 + gcc/tree-ssa-math-opts.c | 107 +++++++++++++++++++++++++++++++++++++ 4 files changed, 165 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.dg/sin_cos.c create mode 100644 gcc/testsuite/gnat.dg/sin_cos.adb create mode 100644 gcc/testsuite/gnat.dg/sin_cos.ads diff --git a/gcc/testsuite/gcc.dg/sin_cos.c b/gcc/testsuite/gcc.dg/sin_cos.c new file mode 100644 index 00000000..aa71dca --- /dev/null +++ b/gcc/testsuite/gcc.dg/sin_cos.c @@ -0,0 +1,41 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +/* This maps to essentially the same gimple that is generated for + gnat.dg/sin_cos.adb, on platforms that use the wraplf variant of + Ada.Numerics.Aux_Float. The value of EPSILON is not relevant to + the test, but the test must be there to keep the conversions in + different BBs long enough to trigger the problem that prevented the + sincos optimization, because the arguments passed to sin and cos + didn't get unified into a single SSA_NAME in time for sincos. */ + +#include <math.h> + +#define EPSILON 3.4526697709225118160247802734375e-4 + +static float my_sinf(float x) { + return (float) sin ((double) x); +} + +static float wrap_sinf(float x) { + if (fabs (x) < EPSILON) + return 0; + return my_sinf (x); +} + +static float my_cosf(float x) { + return (float) cos ((double) x); +} + +static float wrap_cosf(float x) { + if (fabs (x) < EPSILON) + return 1; + return my_cosf (x); +} + +float my_sin_cos(float x, float *s, float *c) { + *s = wrap_sinf (x); + *c = wrap_cosf (x); +} + +/* { dg-final { scan-assembler "sincos\|cexp" { target *-linux-gnu* *-w64-mingw* *-*-vxworks* } } } */ diff --git a/gcc/testsuite/gnat.dg/sin_cos.adb b/gcc/testsuite/gnat.dg/sin_cos.adb new file mode 100644 index 00000000..6e18df9 --- /dev/null +++ b/gcc/testsuite/gnat.dg/sin_cos.adb @@ -0,0 +1,14 @@ +-- { dg-do compile } +-- { dg-options "-O2 -gnatn" } + +with Ada.Numerics.Elementary_Functions; +use Ada.Numerics.Elementary_Functions; +package body Sin_Cos is + procedure Sin_Cos (Angle : T; SinA, CosA : out T) is + begin + SinA := Sin (Angle); + CosA := Cos (Angle); + end; +end Sin_Cos; + +-- { dg-final { scan-assembler "sincos\|cexp" { target *-linux-gnu* *-w64-mingw* *-*-vxworks* } } } diff --git a/gcc/testsuite/gnat.dg/sin_cos.ads b/gcc/testsuite/gnat.dg/sin_cos.ads new file mode 100644 index 00000000..a0eff3d --- /dev/null +++ b/gcc/testsuite/gnat.dg/sin_cos.ads @@ -0,0 +1,4 @@ +package Sin_Cos is + subtype T is Float; + procedure Sin_Cos (Angle : T; SinA, CosA : out T); +end Sin_Cos; diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c index 90dfb98..65c7b34 100644 --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -187,6 +187,10 @@ static struct { /* Number of cexpi calls inserted. */ int inserted; + + /* Number of conversions removed. */ + int conv_removed; + } sincos_stats; static struct @@ -1099,6 +1103,103 @@ make_pass_cse_reciprocals (gcc::context *ctxt) return new pass_cse_reciprocals (ctxt); } +/* If NAME is the result of a type conversion, look for other + equivalent dominating or dominated conversions, and replace all + uses with the earliest dominating name, removing the redundant + conversions. Return the prevailing name. */ + +static tree +execute_cse_conv_1 (tree name) +{ + if (SSA_NAME_IS_DEFAULT_DEF (name) + || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (name)) + return name; + + gimple *def_stmt = SSA_NAME_DEF_STMT (name); + + if (!gimple_assign_cast_p (def_stmt)) + return name; + + tree src = gimple_assign_rhs1 (def_stmt); + + if (TREE_CODE (src) != SSA_NAME) + return name; + + imm_use_iterator use_iter; + gimple *use_stmt; + + /* Find the earliest dominating def. */ + FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, src) + { + if (use_stmt == def_stmt + || !gimple_assign_cast_p (use_stmt)) + continue; + + tree lhs = gimple_assign_lhs (use_stmt); + + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs) + || (gimple_assign_rhs1 (use_stmt) + != gimple_assign_rhs1 (def_stmt)) + || !types_compatible_p (TREE_TYPE (name), TREE_TYPE (lhs))) + continue; + + bool use_dominates; + if (gimple_bb (def_stmt) == gimple_bb (use_stmt)) + { + gimple_stmt_iterator gsi = gsi_for_stmt (use_stmt); + while (!gsi_end_p (gsi) && gsi_stmt (gsi) != def_stmt) + gsi_next (&gsi); + use_dominates = !gsi_end_p (gsi); + } + else if (dominated_by_p (CDI_DOMINATORS, gimple_bb (use_stmt), + gimple_bb (def_stmt))) + use_dominates = false; + else if (dominated_by_p (CDI_DOMINATORS, gimple_bb (def_stmt), + gimple_bb (use_stmt))) + use_dominates = true; + else + continue; + + if (use_dominates) + { + std::swap (name, lhs); + std::swap (def_stmt, use_stmt); + } + } + + /* Now go through all uses of SRC again, replacing the equivalent + dominated conversions. We may replace defs that were not + dominated by the then-prevailing defs when we first visited + them. */ + FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, src) + { + if (use_stmt == def_stmt + || !gimple_assign_cast_p (use_stmt)) + continue; + + tree lhs = gimple_assign_lhs (use_stmt); + + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs) + || (gimple_assign_rhs1 (use_stmt) + != gimple_assign_rhs1 (def_stmt)) + || !types_compatible_p (TREE_TYPE (name), TREE_TYPE (lhs))) + continue; + + if (gimple_bb (def_stmt) == gimple_bb (use_stmt) + || dominated_by_p (CDI_DOMINATORS, gimple_bb (use_stmt), + gimple_bb (def_stmt))) + { + sincos_stats.conv_removed++; + + gimple_stmt_iterator gsi = gsi_for_stmt (use_stmt); + replace_uses_by (lhs, name); + gsi_remove (&gsi, true); + } + } + + return name; +} + /* Records an occurrence at statement USE_STMT in the vector of trees STMTS if it is dominated by *TOP_BB or dominates it or this basic block is not yet initialized. Returns true if the occurrence was pushed on @@ -1147,6 +1248,8 @@ execute_cse_sincos_1 (tree name) int i; bool cfg_changed = false; + name = execute_cse_conv_1 (name); + FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, name) { if (gimple_code (use_stmt) != GIMPLE_CALL @@ -1181,7 +1284,7 @@ execute_cse_sincos_1 (tree name) and, in subsequent rounds, that the built_in type is the same type, or a compatible type. */ if (type != t && !types_compatible_p (type, t)) - return false; + RETURN_FROM_IMM_USE_STMT (use_iter, false); } if (seen_cos + seen_sin + seen_cexpi <= 1) return false; @@ -2296,6 +2399,8 @@ pass_cse_sincos::execute (function *fun) statistics_counter_event (fun, "sincos statements inserted", sincos_stats.inserted); + statistics_counter_event (fun, "conv statements removed", + sincos_stats.conv_removed); return cfg_changed ? TODO_cleanup_cfg : 0; } -- Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/ Free Software Activist GNU Toolchain Engineer