https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116352

--- Comment #11 from Manolis Tsamis <tsamismanolis at gmail dot com> ---
Here's an outline of the involved BBs and the statement that causes the error:

;;   basic block 32, loop depth 0
;;    prev block 35, next block 5
;;    pred:       35 [89.4% (guessed)]
  ...
  c.1_182 = cD.2798;
  b.0_187 = bD.2797;
  ...

;;   basic block 38, loop depth 0
;;    prev block 37, next block 33
;;    pred:       37 [66.7% (guessed)]
  ...
  i_159 = b.0_187 * 2.0e+0;
  j_160 = c.1_182 * 2.0e+0;
  _119 = {j_160, _75, i_159, _73};
  _80 = VEC_PERM_EXPR <_119, _119, { 1, 1, 3, 3 }>;
  _150 = VEC_PERM_EXPR <_119, _119, { 0, 0, 2, 2 }>;
  vect__170.26_17 = _150 + _80;
  vect__164.25_16 = _150 - _80;
  ...

;;   basic block 29, loop depth 1, count 95563020 (estimated locally, freq 0.8091), maybe hot
;;    prev block 33, next block 30, flags: (NEW, VISITED)
;;    pred:       33 [always]  count:10511932 (estimated locally, freq 0.0890) (FALLTHRU,EXECUTABLE)
;;                30 [always]  count:85051088 (estimated locally, freq 0.7201) (FALLTHRU,DFS_BACK,EXECUTABLE)
  ...
  b.0_125 = bD.2797;
  c.1_126 = cD.2798;
  ...
  i_131 = b.0_125 * 2.0e+0;
  j_132 = c.1_126 * 2.0e+0;
  _21 = {j_132, _75, i_131, _73};
  _22 = VEC_PERM_EXPR <_21, _21, { 0, 0, 2, 2 }>;
  vect__142.30_1 = _22 + _80;
  vect__136.29_65 = _22 - _80;  # _80 is defined in <bb 38>, but <bb 38>
                                # doesn't dominate this block.
  ...

I looked at how _80 ends up in <bb 29>: it is due to caching through bst_map.
An SLP tree corresponding to LHS = VEC_PERM_EXPR <_21, _21, { 1, 1, 3, 3 }>;
would otherwise be emitted in <bb 29>, but because it has the same effective
scalar statements (_75, _75, _73, _73) as the node that produced _80 in
<bb 38>, it is replaced with _80.
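
To make the "same effective scalar statements" point concrete, here is a
minimal standalone C++ sketch. It is not GCC code: toy_vec_perm is a
hypothetical helper that models VEC_PERM_EXPR on lane names only, assuming both
inputs have the same nonzero length and taking selector elements modulo twice
that length.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model of VEC_PERM_EXPR <v0, v1, sel> operating on lane names.
// Lane i of the result is element sel[i] of the concatenation of v0 and v1.
// Assumption: v0 and v1 have the same nonzero length.
std::vector<std::string>
toy_vec_perm (const std::vector<std::string> &v0,
              const std::vector<std::string> &v1,
              const std::vector<std::size_t> &sel)
{
  const std::size_t n = v0.size ();
  std::vector<std::string> out;
  for (std::size_t idx : sel)
    {
      idx %= 2 * n;  // wrap the selector into the concatenated range
      out.push_back (idx < n ? v0[idx] : v1[idx - n]);
    }
  return out;
}
```

Applying { 1, 1, 3, 3 } to {j_160, _75, i_159, _73} and to
{j_132, _75, i_131, _73} yields (_75, _75, _73, _73) in both cases, which is
why the two permutes look identical to a lookup keyed on scalar statements.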

This happens during vect_optimize_slp, through vect_cse_slp_nodes:

  /* Apply CSE again to nodes after permute optimization.  */
  scalar_stmts_to_slp_tree_map_t *bst_map
    = new scalar_stmts_to_slp_tree_map_t ();

  for (auto inst : vinfo->slp_instances)
    vect_cse_slp_nodes (bst_map, SLP_INSTANCE_TREE (inst));

  release_scalar_stmts_to_slp_tree_map (bst_map);

In this case vinfo->slp_instances has multiple entries and they correspond to
multiple BBs (as seen above). The caching code uses a single bst_map instance
for all of them, and vect_cse_slp_nodes doesn't check which BB a cached node
belongs to. So this looks to me like a general caching bug that just happens
to trigger here after r15-2820-gab18785840d7b8 (unless of course I'm missing
something): we allow replacing nodes across BBs without checking whether the
cached definition dominates the new use.

I'm not sure whether this holds, but if a single slp_instance is guaranteed to
affect a single BB, we could use one bst_map per instance as a solution (in
this case that solves the issue, but I want to be sure we address the
underlying problem too).
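
As a rough illustration of the difference between one shared map and one map
per instance, here is a standalone toy model in C++. It is not the GCC
implementation: ToyNode, ToyMap, toy_cse, cse_shared_map and
cse_map_per_instance are all hypothetical names, and each "instance" is
reduced to a single node tagged with the BB that holds its definition.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an SLP node: the scalar statements it covers and the
// basic block its vector definition lives in.  Not GCC's slp_tree.
struct ToyNode
{
  std::vector<std::string> scalar_stmts;
  int bb;
};

// Cache keyed on scalar statements only, as in the behaviour described above.
using ToyMap = std::map<std::vector<std::string>, const ToyNode *>;

// Return the cached node with the same scalar statements if there is one,
// otherwise record NODE and return it.  The BB is never consulted.
const ToyNode *
toy_cse (ToyMap &map, const ToyNode *node)
{
  return map.emplace (node->scalar_stmts, node).first->second;
}

// One map shared by all instances: a later instance can be replaced by a
// node whose definition lives in a different BB.
std::vector<int>
cse_shared_map (const std::vector<ToyNode> &instances)
{
  ToyMap map;
  std::vector<int> def_bbs;
  for (const ToyNode &n : instances)
    def_bbs.push_back (toy_cse (map, &n)->bb);
  return def_bbs;
}

// A fresh map per instance: reuse never crosses instance boundaries.
std::vector<int>
cse_map_per_instance (const std::vector<ToyNode> &instances)
{
  std::vector<int> def_bbs;
  for (const ToyNode &n : instances)
    {
      ToyMap map;
      def_bbs.push_back (toy_cse (map, &n)->bb);
    }
  return def_bbs;
}
```

With two nodes that both cover (_75, _75, _73, _73), one defined in <bb 38> and
one in <bb 29>, the shared map hands the <bb 38> definition to both, while the
per-instance maps keep each node in its own block. Whether one instance really
maps to one BB in the vectorizer is exactly the open question above; the toy
model simply assumes it.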
