Tested on aarch64-linux-gnu, armeb-eabi and x86_64-linux-gnu. OK for gcc-9?
Richard
>From d170fb108c194ce04713c1ae95f9a107169f27f1 Mon Sep 17 00:00:00 2001 From: Richard Sandiford <richard.sandif...@arm.com> Date: Fri, 9 Aug 2019 09:37:55 +0000 Subject: [PATCH 1/2] Reject tail calls that read from an escaped RESULT_DECL [PR90313] In this PR we have two return paths from a function "map". The common code sets <result> to the value returned by one path, while the other path does: <retval> = map (&<retval>, ...); We treated this call as tail recursion, losing the copy semantics on the value returned by the recursive call. We'd correctly reject the same thing for variables: local = map (&local, ...); The problem is that RESULT_DECLs didn't get the same treatment. 2020-02-18 Richard Sandiford <richard.sandif...@arm.com> gcc/ Backport from mainline 2019-08-09 Richard Sandiford <richard.sandif...@arm.com> PR middle-end/90313 * tree-tailcall.c (find_tail_calls): Reject calls that might read from an escaped RESULT_DECL. gcc/testsuite/ PR middle-end/90313 * g++.dg/torture/pr90313.cc: New test. --- gcc/testsuite/g++.dg/torture/pr90313.cc | 33 +++++++++++++++++++++++++ gcc/tree-tailcall.c | 29 ++++++++++++++++++++++ 2 files changed, 62 insertions(+) create mode 100644 gcc/testsuite/g++.dg/torture/pr90313.cc diff --git a/gcc/testsuite/g++.dg/torture/pr90313.cc b/gcc/testsuite/g++.dg/torture/pr90313.cc new file mode 100644 index 00000000000..d9f183a2939 --- /dev/null +++ b/gcc/testsuite/g++.dg/torture/pr90313.cc @@ -0,0 +1,33 @@ +// { dg-do run } + +#include <stddef.h> + +namespace std { + template<typename T, size_t N> struct array { + T elems[N]; + const T &operator[](size_t i) const { return elems[i]; } + }; +} + +using Coordinates = std::array<double, 3>; + +Coordinates map(const Coordinates &c, size_t level) +{ + Coordinates result{ c[1], c[2], c[0] }; + + if (level != 0) + result = map (result, level - 1); + + return result; +} + +int main() +{ + Coordinates vecOfCoordinates = { 1.0, 2.0, 3.0 }; + + auto result = map(vecOfCoordinates, 1); + if (result[0] != 3 || result[1] != 1 || result[2] != 2) + __builtin_abort (); + + return 0; +} diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c index e0265b22dd5..255575f1198 100644 --- a/gcc/tree-tailcall.c +++ b/gcc/tree-tailcall.c @@ -479,6 +479,35 @@ find_tail_calls (basic_block bb, struct tailcall **ret) && !stmt_can_throw_external (cfun, stmt)) return; + /* If the function returns a value, then at present, the tail call + must return the same type of value. There is conceptually a copy + between the object returned by the tail call candidate and the + object returned by CFUN itself. + + This means that if we have: + + lhs = f (&<retval>); // f reads from <retval> + // (lhs is usually also <retval>) + + there is a copy between the temporary object returned by f and lhs, + meaning that any use of <retval> in f occurs before the assignment + to lhs begins. Thus the <retval> that is live on entry to the call + to f is really an independent local variable V that happens to be + stored in the RESULT_DECL rather than a local VAR_DECL. + + Turning this into a tail call would remove the copy and make the + lifetimes of the return value and V overlap. The same applies to + tail recursion, since if f can read from <retval>, we have to assume + that CFUN might already have written to <retval> before the call. + + The problem doesn't apply when <retval> is passed by value, but that + isn't a case we handle anyway. */ + tree result_decl = DECL_RESULT (cfun->decl); + if (result_decl + && may_be_aliased (result_decl) + && ref_maybe_used_by_stmt_p (call, result_decl)) + return; + /* We found the call, check whether it is suitable. */ tail_recursion = false; func = gimple_call_fndecl (call); -- 2.17.1
>From 31c4d04fe12bbc92fb8491a3e9d2da402a0963dc Mon Sep 17 00:00:00 2001 From: Richard Sandiford <richard.sandif...@arm.com> Date: Thu, 5 Dec 2019 14:20:38 +0000 Subject: [PATCH 2/2] Check for bitwise identity when encoding VECTOR_CSTs [PR92768] This PR shows that we weren't checking for bitwise-identical values when trying to encode a VECTOR_CST, so -0.0 was treated the same as 0.0 for -fno-signed-zeros. The patch adds a new OEP flag to select that behaviour. 2020-02-18 Richard Sandiford <richard.sandif...@arm.com> gcc/ Backport from mainline 2019-12-05 Richard Sandiford <richard.sandif...@arm.com> PR middle-end/92768 * tree-core.h (OEP_BITWISE): New flag. * fold-const.c (operand_compare::operand_equal_p): Handle it. * tree-vector-builder.h (tree_vector_builder::equal_p): Pass it. gcc/testsuite/ PR middle-end/92768 * gcc.dg/pr92768.c: New test. --- gcc/fold-const.c | 17 ++++++++++++++--- gcc/testsuite/gcc.dg/pr92768.c | 7 +++++++ gcc/tree-core.h | 3 ++- gcc/tree-vector-builder.h | 2 +- 4 files changed, 24 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/pr92768.c diff --git a/gcc/fold-const.c b/gcc/fold-const.c index 9feaea9fde5..c717f24501e 100644 --- a/gcc/fold-const.c +++ b/gcc/fold-const.c @@ -2932,6 +2932,11 @@ combine_comparisons (location_t loc, If OEP_LEXICOGRAPHIC is set, then also handle expressions with side-effects such as MODIFY_EXPR, RETURN_EXPR, as well as STATEMENT_LISTs. + If OEP_BITWISE is set, then require the values to be bitwise identical + rather than simply numerically equal. Do not take advantage of things + like math-related flags or undefined behavior; only return true for + values that are provably bitwise identical in all circumstances. + Unless OEP_MATCH_SIDE_EFFECTS is set, the function returns false on any operand with side effect. This is unnecesarily conservative in the case we know that arg0 and arg1 are in disjoint code paths (such as in @@ -2978,6 +2983,11 @@ operand_equal_p (const_tree arg0, const_tree arg1, unsigned int flags) if (!TREE_TYPE (arg0) || !TREE_TYPE (arg1)) return 0; + /* Bitwise identity makes no sense if the values have different layouts. */ + if ((flags & OEP_BITWISE) + && !tree_nop_conversion_p (TREE_TYPE (arg0), TREE_TYPE (arg1))) + return 0; + /* We cannot consider pointers to different address space equal. */ if (POINTER_TYPE_P (TREE_TYPE (arg0)) && POINTER_TYPE_P (TREE_TYPE (arg1)) @@ -3110,8 +3120,7 @@ operand_equal_p (const_tree arg0, const_tree arg1, unsigned int flags) if (real_identical (&TREE_REAL_CST (arg0), &TREE_REAL_CST (arg1))) return 1; - - if (!HONOR_SIGNED_ZEROS (arg0)) + if (!(flags & OEP_BITWISE) && !HONOR_SIGNED_ZEROS (arg0)) { /* If we do not distinguish between signed and unsigned zero, consider them equal. */ @@ -3163,7 +3172,9 @@ operand_equal_p (const_tree arg0, const_tree arg1, unsigned int flags) break; } - if (flags & OEP_ONLY_CONST) + /* Don't handle more cases for OEP_BITWISE, since we can't guarantee that + two instances of undefined behavior will give identical results. */ + if (flags & (OEP_ONLY_CONST | OEP_BITWISE)) return 0; /* Define macros to test an operand from arg0 and arg1 for equality and a diff --git a/gcc/testsuite/gcc.dg/pr92768.c b/gcc/testsuite/gcc.dg/pr92768.c new file mode 100644 index 00000000000..e2a3f9ccacc --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr92768.c @@ -0,0 +1,7 @@ +/* PR tree-optimization/92768 */ +/* { dg-options "-O2 -fno-signed-zeros -fdump-tree-optimized -w -Wno-psabi" } */ + +typedef float v4sf __attribute__((vector_size(16))); +v4sf f () { return (v4sf) { 0.0, -0.0, 0.0, -0.0 }; } + +/* { dg-final { scan-tree-dump {{ 0\.0, -0\.0, 0\.0, -0\.0 }} "optimized" } } */ diff --git a/gcc/tree-core.h b/gcc/tree-core.h index fbed0c379b2..41d05294936 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -851,7 +851,8 @@ enum operand_equal_flag { /* Internal within inchash::add_expr: */ OEP_HASH_CHECK = 32, /* Makes operand_equal_p handle more expressions: */ - OEP_LEXICOGRAPHIC = 64 + OEP_LEXICOGRAPHIC = 64, + OEP_BITWISE = 128 }; /* Enum and arrays used for tree allocation stats. diff --git a/gcc/tree-vector-builder.h b/gcc/tree-vector-builder.h index 0e36cd17139..13af74ad834 100644 --- a/gcc/tree-vector-builder.h +++ b/gcc/tree-vector-builder.h @@ -82,7 +82,7 @@ tree_vector_builder::new_vector (tree type, unsigned int npatterns, inline bool tree_vector_builder::equal_p (const_tree elt1, const_tree elt2) const { - return operand_equal_p (elt1, elt2, 0); + return operand_equal_p (elt1, elt2, OEP_BITWISE); } /* Return true if a stepped representation is OK. We don't allow -- 2.17.1