On Mon, 13 Aug 2012, Richard Guenther wrote:

On Sat, Aug 11, 2012 at 2:25 PM, Marc Glisse <marc.gli...@inria.fr> wrote:
On Fri, 10 Aug 2012, Marc Glisse wrote:

1) I am not sure we always want to combine permutations. Indeed, someone
(user? vectorizer?) may have written 2 permutations to help the backend
generate optimal code, whereas it will do a bad job on the complicated
combined permutation. At tree level, I am not sure what can be done except
possibly restrict the optimization to the case where the combined
permutation is the identity. At the RTL level, we could compare what insns
are generated, but that's going to be painful (doing anything with
permutations at the RTL level is painful).

I guess people will complain soon enough if this causes horrible performance regressions in vectorized code.

+/* Return true if VAR has no nondebug use but in stmt.  */
+static bool
+is_only_used_in (const_tree var, const_gimple stmt)
+{
+  const ssa_use_operand_t *const ptr0 = &(SSA_NAME_IMM_USE_NODE (var));
+  const ssa_use_operand_t *ptr = ptr0->next;
+
+  for (; ptr != ptr0; ptr = ptr->next)
+    {
+      const_gimple use_stmt = USE_STMT (ptr);
+      if (is_gimple_debug (use_stmt))
+       continue;
+      if (use_stmt != stmt)
+       return false;
+    }
+  return true;
+}

if (!single_imm_use (var, &use, &use_stmt) || use_stmt != stmt)

instead of the above.

I don't think that works with statements that use a variable twice.

+/* Combine two shuffles in a row.  Returns 1 if there were any changes
+   made, 2 if cfg-cleanup needs to run.  Else it returns 0.  */
+
+static int
+simplify_permutation (gimple_stmt_iterator *gsi)
+{
+  gimple stmt = gsi_stmt (*gsi);
+  gimple def_stmt;
+  tree op0, op1, op2, op3, mask;
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  enum tree_code code2;
+  location_t loc = gimple_location (stmt);
+
+  gcc_checking_assert (code == VEC_PERM_EXPR);
+
+  op0 = gimple_assign_rhs1 (stmt);
+  op1 = gimple_assign_rhs2 (stmt);
+  op2 = gimple_assign_rhs3 (stmt);
+
+  if (TREE_CODE (op0) != SSA_NAME)
+    return 0;
+
+  if (TREE_CODE (op2) != VECTOR_CST)
+    return 0;
+
+  def_stmt = SSA_NAME_DEF_STMT (op0);
+  if (!def_stmt || !is_gimple_assign (def_stmt)
+      || !can_propagate_from (def_stmt)
+      || !is_only_used_in (op0, stmt))

Or rather than the above, simply check

          || !has_single_use (op0)

here.

Then there was my previous (non-working) patch that used get_prop_source_stmt.

+    return 0;
+
+  /* Check that it is only used here. We cannot use has_single_use
+     since the expression is using it twice itself...  */

Ah ... so then

          || num_imm_uses (op0) != 2

Ah, ok, that's simpler indeed, but there were such dire warnings to never use that evil function unless absolutely necessary that I didn't dare use it... Thanks for the permission.

+  code2 = gimple_assign_rhs_code (def_stmt);
+
+  /* Two consecutive shuffles.  */
+  if (code2 == VEC_PERM_EXPR)
+    {
+      op3 = gimple_assign_rhs3 (def_stmt);
+      if (TREE_CODE (op3) != VECTOR_CST)
+       return 0;
+      mask = fold_build3_loc (loc, VEC_PERM_EXPR, TREE_TYPE (op3),
+                             op3, op3, op2);
+      gcc_assert (TREE_CODE (mask) == VECTOR_CST);

which means you can use

         mask = fold_ternary (loc, ...);

Ok with that changes.

Thanks a lot, I'll do that.

Next step should be either BIT_FIELD_REF(VEC_PERM_EXPR) or VEC_PERM_EXPR(CONSTRUCTOR). Is there a good way to determine whether this kind of combination should be done forwards or backwards (i.e. start from VEC_PERM_EXPR and look if its single use is in BIT_FIELD_REF, or start from BIT_FIELD_REF and look if its argument is a VEC_PERM_EXPR only used here?

--
Marc Glisse

Reply via email to