https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81366

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|rguenth at gcc dot gnu.org         |unassigned at gcc dot gnu.org
                 CC|                            |rguenth at gcc dot gnu.org
           Keywords|                            |openmp
             Status|ASSIGNED                    |NEW

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
So we vectorize this now, but we emit a runtime check that's always false
around the loop.  This is from vect_do_peeling, the condition to skip_vector,
generated as

_8 + 4294967295 <= 2

with _8 defined as _8 = .GOMP_SIMD_VF (simduid.3_7(D))

and when updating .GOMP_SIMD_VF, which appears in BB2 in this case, there is
no htab, thus no VF recorded and we use VF == 1, which chooses scalar code.
This might be because the loop in question has loop->simduid == NULL.

We do not vectorize the main computation loop (but only the init and the final
reduction loop), because

t.c:7:21: note:   ==> examining statement: pretmp_31 = .MASK_LOAD (_20, 64, _49, 0.0);
t.c:7:21: note:   vect_is_simple_use: operand _21 < _27, type of def: internal
t.c:7:21: missed:   unsupported masked emulated gather.
t.c:7:17: missed:   not vectorized: relevant stmt not supported: pretmp_31 = .MASK_LOAD (_20, 64, _49, 0.0);
t.c:7:21: note:   unsupported SLP instance starting from: D.36226[_15] = prephitmp_30;
t.c:7:21: missed:  unsupported SLP instances

it seems that gather handling isn't working well, or that we fail to
recognize SIMD loads from .MASK_LOADs.

I also see

t.c:7:21: note:   === vect_analyze_data_ref_dependences ===
(compute_affine_dependence
  ref_a: MEM[(const double &)_19], stmt_a: _21 = MEM[(const double &)_19];
  ref_b: D.36226[_15], stmt_b: D.36226[_15] = prephitmp_30;
) -> no dependence
(compute_affine_dependence
  ref_a: MEM <double[32]> [(const double &)&D.36226][_15], stmt_a: _27 = MEM <double[32]> [(const double &)&D.36226][_15];
  ref_b: D.36226[_15], stmt_b: D.36226[_15] = prephitmp_30;
) -> dependence analysis failed
(compute_affine_dependence
  ref_a: MEM[(const double &)_20], stmt_a: pretmp_31 = .MASK_LOAD (_20, 64, _49, 0.0);
  ref_b: D.36226[_15], stmt_b: D.36226[_15] = prephitmp_30;
) -> dependence analysis failed


Using -fno-tree-sink makes the loop vectorized.  With that option we're facing

  _15 = .GOMP_SIMD_LANE (simduid.3_7(D), 0);
  _16 = (long unsigned int) i_37;
  _17 = _16 * 8;
  _19 = x_18(D) + _17;
  _32 = (sizetype) _15;
  _41 = _32 * 8;
  _20 = &D.36226 + _41;
  _21 = MEM[(const double &)_19];
  _27 = MEM <double[32]> [(const double &)&D.36226][_15];
  _48 = _21 < _27;
  pretmp_31 = .MASK_LOAD (_20, 64, _48, 0.0);

then instead of

  _53 = (unsigned long) &D.36226;
...

  _15 = .GOMP_SIMD_LANE (simduid.3_7(D), 0);
  _16 = (long unsigned int) i_37;
  _17 = _16 * 8;
  _19 = x_18(D) + _17;
  _21 = MEM[(const double &)_19];
  _27 = MEM <double[32]> [(const double &)&D.36226][_15];
  _49 = _21 < _27;
  _32 = (sizetype) _15;
  _41 = _32 * 8;
  _54 = _53 + _41;
  _20 = (double *) _54;
  pretmp_31 = .MASK_LOAD (_20, 64, _49, 0.0);

that is, we sink

  _32 = (sizetype) _15;
  _41 = _32 * 8;
  _20 = &D.36226 + _41;

and ifcvt turns the conditional POINTER_PLUS_EXPR into a PLUS_EXPR
(because UB), which data reference analysis fails to handle.  I think this
is the code in vect_find_stmt_data_reference.  We fail to match

          if (integer_zerop (off)
              && TREE_CODE (base_address) == POINTER_PLUS_EXPR)
            {
              off = TREE_OPERAND (base_address, 1);
              base_address = TREE_OPERAND (base_address, 0);

for the base address (double *) ((sizetype) _15 * 8 + (sizetype) &D.36226).

The following is a crude hack lacking verification that fixes this,
vectorizing the main loop.  It would be more maintainable to move
this pattern matching to a match.pd match I think.  I'm not working on this.

diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index ae556d85c7f..c320223926e 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -5250,6 +5250,13 @@ vect_find_stmt_data_reference (loop_p loop, gimple *stmt,
              off = TREE_OPERAND (base_address, 1);
              base_address = TREE_OPERAND (base_address, 0);
            }
+         else if (integer_zerop (off)
+                  && CONVERT_EXPR_CODE_P (TREE_CODE (base_address))
+                  && TREE_CODE (TREE_OPERAND (base_address, 0)) == PLUS_EXPR)
+           {
+             off = TREE_OPERAND (TREE_OPERAND (base_address, 0), 0);
+             base_address = TREE_OPERAND (TREE_OPERAND (base_address, 0), 1);
+           }
          STRIP_NOPS (off);
          if (TREE_CODE (off) == MULT_EXPR
              && tree_fits_uhwi_p (TREE_OPERAND (off, 1)))
