https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94941

            Bug ID: 94941
           Summary: Expansion of some internal fns can drop the lhs on the
                    floor
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rsandifo at gcc dot gnu.org
  Target Milestone: ---

internal-fn.c:expand_mask_load_optab_fn uses expand_insn to emit the
load instruction, but doesn't then test whether the coerced output
operand is the same as the target of the gcall.  It might not be,
for example, in unoptimised code, where the target of the gcall
expands to a MEM rtx and the load insn requires a REG destination.
We need the equivalent of this code from expand_while_optab_fn:

  if (!rtx_equal_p (lhs_rtx, ops[0].value))
    emit_move_insn (lhs_rtx, ops[0].value);
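
The failure mode can be modelled in a few lines of standalone C.  This
is only a toy sketch, not GCC code: struct loc, emit_load_model,
same_loc and expand_load_model are hypothetical names invented for the
illustration, and the value 42 stands in for the loaded data.  The
model assumes an instruction that can only write to a register, so a
memory destination is silently replaced by a scratch register, much as
the coerced output operand replaces the gcall target above.

```c
#include <stdbool.h>

/* Toy stand-ins for RTL operands.  None of these names exist in GCC.  */
enum loc_kind { LOC_REG, LOC_MEM };

struct loc
{
  enum loc_kind kind;
  int *storage;			/* Where the value lives.  */
};

/* Hypothetical scratch register used when coercion is needed.  */
static int scratch_reg;

/* Model an instruction whose output must be a register.  If OUT is not
   a register, redirect it to the scratch register, so that OUT plays
   the role of ops[0].value after coercion.  */
static void
emit_load_model (struct loc *out, int loaded_value)
{
  if (out->kind != LOC_REG)
    {
      out->kind = LOC_REG;
      out->storage = &scratch_reg;
    }
  *out->storage = loaded_value;
}

/* Rough analogue of rtx_equal_p for this toy representation.  */
static bool
same_loc (const struct loc *a, const struct loc *b)
{
  return a->kind == b->kind && a->storage == b->storage;
}

/* Expand a load whose target is a stack slot and return the slot's
   final contents.  COPY_BACK selects whether the missing move from the
   coerced operand to the real target is emitted.  */
int
expand_load_model (bool copy_back)
{
  int stack_slot = -1;		/* Stands in for uninitialised stack.  */
  struct loc lhs = { LOC_MEM, &stack_slot };
  struct loc op0 = lhs;		/* Requested output operand.  */

  emit_load_model (&op0, 42);
  if (copy_back && !same_loc (&lhs, &op0))
    *lhs.storage = *op0.storage;
  return stack_slot;
}
```

In this model expand_load_model (false) leaves the stack slot at its
initial value, which corresponds to the bug, while
expand_load_model (true) copies the loaded value into the slot.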

This can be seen for AArch64 with the following test,
compiled with -O0 -march=armv8.2-a+sve:

----------------------------------------------------------
#include <arm_sve.h>

svfloat32_t
foo (float *ptr)
{
  svbool_t pg = svptrue_pat_b32 (SV_VL1);
  svfloat32_t res = svld1 (pg, ptr);
  return res;
}

int
main (void)
{
  svbool_t pg = svptrue_pat_b32 (SV_VL1);
  float x[1] = { 1 };
  if (svptest_any (pg, svcmpne (pg, foo (x), 1.0)))
    __builtin_abort ();
  return 0;
}
----------------------------------------------------------

We emit:
;; res_5 = .MASK_LOAD (ptr_4(D), 4B, _2);

(insn 9 8 10 (set (reg/f:DI 96)
        (mem/f/c:DI (plus:DI (reg/f:DI 87 virtual-stack-vars)
                (const_poly_int:DI [-40, -32])) [3 ptr+0 S8 A64])) "/tmp/foo.c":7:21 -1
     (nil))

(insn 10 9 0 (set (reg:VNx4SF 97)
        (unspec:VNx4SF [
                (reg:VNx4BI 92 [ _2 ])
                (mem:VNx4SF (reg/f:DI 96) [0 MEM <svfloat32_t> [(float *)ptr_4(D)]+0 S[16, 16] A8])
            ] UNSPEC_LD1_SVE)) "/tmp/foo.c":7:21 -1
     (nil))

but don't store reg 97 to the stack slot for "res".  Then the return
statement loads from "res":

(insn 12 11 0 (set (reg:VNx4SF 93 [ _6 ])
        (unspec:VNx4SF [
                (subreg:VNx4BI (reg:VNx16BI 98) 0)
                (mem/c:VNx4SF (plus:DI (reg/f:DI 87 virtual-stack-vars)
                        (const_poly_int:DI [-32, -32])) [2 res+0 S[16, 16] A128])
            ] UNSPEC_PRED_X)) "/tmp/foo.c":8:10 -1
     (nil))

meaning we return uninitialised stack contents.

The same problem affects expand_load_lanes_optab_fn and
expand_gather_load_optab_fn.

I think this problem has existed since the mask load/store
functions were introduced, but it was probably latent until
GCC 10 because nothing would use them in unoptimised code.
