create_output_operand coerces an output operand to the insn's predicates, using a suggested rtx location if convenient. But if that rtx location is actually required rather than optional, the builder of the insn has to emit a move afterwards.
(We could instead add a new interface that does this automatically, but that's future work.) This PR shows that we were failing to emit the move for some of the vector load internal functions. I think there are other routines in internal-fn.c that potentially have the same problem, but this patch is supposed to be a conservative subset suitable for backporting to GCC 10. Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf and x86_64-linux-gnu. OK for trunk and GCC 10 branch? It would be nice if we could have this for GCC 10.1, but I realise it wouldn't justify a new RC on its own, so I'll assume that "GCC 10 branch" is "GCC 10 branch after the release". Richard 2020-05-04 Richard Sandiford <richard.sandif...@arm.com> gcc/ PR middle-end/94941 * internal-fn.c (expand_load_lanes_optab_fn): Emit a move if the chosen lhs is different from the gcall lhs. (expand_mask_load_optab_fn): Likewise. (expand_gather_load_optab_fn): Likewise. gcc/testsuite/ PR middle-end/94941 * gcc.target/aarch64/sve/acle/general/unoptimized_1.c: New test. --- gcc/internal-fn.c | 6 ++++++ .../aarch64/sve/acle/general/unoptimized_1.c | 21 +++++++++++++++++++ 2 files changed, 27 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general/unoptimized_1.c diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c index 52d1638917a..5e9aa60721e 100644 --- a/gcc/internal-fn.c +++ b/gcc/internal-fn.c @@ -167,6 +167,8 @@ expand_load_lanes_optab_fn (internal_fn, gcall *stmt, convert_optab optab) create_output_operand (&ops[0], target, TYPE_MODE (type)); create_fixed_operand (&ops[1], mem); expand_insn (get_multi_vector_move (type, optab), 2, ops); + if (!rtx_equal_p (target, ops[0].value)) + emit_move_insn (target, ops[0].value); } /* Expand STORE_LANES call STMT using optab OPTAB. */ @@ -2507,6 +2509,8 @@ expand_mask_load_optab_fn (internal_fn, gcall *stmt, convert_optab optab) create_fixed_operand (&ops[1], mem); create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt))); expand_insn (icode, 3, ops); + if (!rtx_equal_p (target, ops[0].value)) + emit_move_insn (target, ops[0].value); } #define expand_mask_load_lanes_optab_fn expand_mask_load_optab_fn @@ -2827,6 +2831,8 @@ expand_gather_load_optab_fn (internal_fn, gcall *stmt, direct_optab optab) insn_code icode = convert_optab_handler (optab, TYPE_MODE (TREE_TYPE (lhs)), TYPE_MODE (TREE_TYPE (offset))); expand_insn (icode, i, ops); + if (!rtx_equal_p (lhs_rtx, ops[0].value)) + emit_move_insn (lhs_rtx, ops[0].value); } /* Expand DIVMOD() using: diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/unoptimized_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/unoptimized_1.c new file mode 100644 index 00000000000..18d73e21a83 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/unoptimized_1.c @@ -0,0 +1,21 @@ +/* { dg-do run { target aarch64_sve_hw } } */ + +#include <arm_sve.h> + +svfloat32_t +foo (float *ptr) +{ + svbool_t pg = svptrue_pat_b32 (SV_VL1); + svfloat32_t res = svld1 (pg, ptr); + return res; +} + +int +main (void) +{ + svbool_t pg = svptrue_pat_b32 (SV_VL1); + float x[1] = { 1 }; + if (svptest_any (pg, svcmpne (pg, foo (x), 1.0))) + __builtin_abort (); + return 0; +}