On Wed, 25 Jun 2014, Jakub Jelinek wrote:
> Since we've started folding (a << n) | (a >> (bitsize - n)) etc.
> into rotates even for vectors, we've regressed code quality on
> targets where we do have vector shifts, but don't have vector
> rotates.  The following patch attempts to fix it, by telling the
> veclower pass not to actually lower such rotates if we have vector
> shifts and bitwise or, and by fixing up the expansion of vector
> rotates.
Thanks for handling this. It looks good to me; I am just wondering about a
few unimportant details below.
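For reference, a minimal sketch of the idiom in question, using a GNU C
vector-extension type (my own illustration, not one of the patch's
testcases):

/* fold now turns the shift/or pair below into a single vector
   (L)ROTATE_EXPR, which the veclower pass then has to expand on
   targets that have vector shifts but no vector rotates.  */
typedef unsigned int v4si __attribute__ ((vector_size (16)));

v4si
rotl3 (v4si a)
{
  return (a << 3) | (a >> (32 - 3));
}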
>  static void
>  expand_vector_operations_1 (gimple_stmt_iterator *gsi)
>  {
>    gimple stmt = gsi_stmt (*gsi);
> -  tree lhs, rhs1, rhs2 = NULL, type, compute_type;
> +  tree lhs, rhs1, rhs2 = NULL, type, compute_type = NULL_TREE;
>    enum tree_code code;
> -  enum machine_mode compute_mode;
>    optab op = unknown_optab;
>    enum gimple_rhs_class rhs_class;
>    tree new_rhs;
> @@ -1461,6 +1502,41 @@ expand_vector_operations_1 (gimple_stmt_
>            && optab_handler (opv, TYPE_MODE (type)) != CODE_FOR_nothing)
>          return;
>      }
> +
> +  if (code == LROTATE_EXPR || code == RROTATE_EXPR)
> +    {
> +      compute_type = get_compute_type (code, op, type);
> +      if (compute_type == type)
> +        return;
> +      /* Before splitting vector rotates into scalar rotates,
> +         see if we can't use vector shifts and BIT_IOR_EXPR
> +         instead.  For vector by vector rotates we'd also
> +         need to check BIT_AND_EXPR and NEGATE_EXPR, punt there
> +         for now, fold doesn't seem to create such rotates anyway.  */
> +      if (compute_type == TREE_TYPE (type)
> +          && !VECTOR_INTEGER_TYPE_P (TREE_TYPE (rhs2)))
> +        {
> +          optab oplv, opl, oprv, opr, opo;
> +          oplv = optab_for_tree_code (LSHIFT_EXPR, type, optab_vector);
> +          oprv = optab_for_tree_code (RSHIFT_EXPR, type, optab_vector);
> +          opo = optab_for_tree_code (BIT_IOR_EXPR, type, optab_default);
> +          opl = optab_for_tree_code (LSHIFT_EXPR, type, optab_scalar);
> +          opr = optab_for_tree_code (RSHIFT_EXPR, type, optab_scalar);
How well are we separating lshiftrt from ashiftrt? Is ROTATE_EXPR always
used on an unsigned type, so it is fine? Or do we expand one in terms of
the other when it isn't available, so it doesn't matter?
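To illustrate the concern (my own sketch, not from the patch): a rotate
synthesized from shifts needs a *logical* right shift; with a signed
element type, an arithmetic shift would smear the sign bit into the bits
that the BIT_IOR_EXPR is supposed to fill:

unsigned int
rotl_ok (unsigned int a)
{
  return (a << 3) | (a >> (32 - 3));   /* logical shift: correct */
}

int
rotl_bad (int a)
{
  return (a << 3) | (a >> (32 - 3));   /* arithmetic shift: wrong when a < 0 */
}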
> +          if (optab_handler (oplv, TYPE_MODE (type)) != CODE_FOR_nothing
> +              && optab_handler (opl, TYPE_MODE (type)) == CODE_FOR_nothing
> +              && optab_handler (oprv, TYPE_MODE (type)) != CODE_FOR_nothing
> +              && optab_handler (opr, TYPE_MODE (type)) == CODE_FOR_nothing)
> +            {
> +              opl = oplv;
> +              opr = oprv;
> +            }
Maybe arrange the conditions in order (oplv != && oprv != && (opl == ||
opr ==)), or separate the replacement of opl and of opr into two separate
ifs?
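That is, something like this for the first variant (a sketch of the
reordering, untested):

          if (optab_handler (oplv, TYPE_MODE (type)) != CODE_FOR_nothing
              && optab_handler (oprv, TYPE_MODE (type)) != CODE_FOR_nothing
              && (optab_handler (opl, TYPE_MODE (type)) == CODE_FOR_nothing
                  || optab_handler (opr, TYPE_MODE (type)) == CODE_FOR_nothing))
            {
              opl = oplv;
              opr = oprv;
            }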
Somehow, it feels like those checks should be somewhere in
get_compute_type so we test both scalar and vector versions for each size,
or we could call get_compute_type for both and pick the best.
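Roughly (a hypothetical sketch of the "call it for both and pick the
best" idea, untested): query both optabs and keep whichever yields the
wider vector compute type:

          tree ts = get_compute_type (LSHIFT_EXPR, opl, type);
          tree tv = get_compute_type (LSHIFT_EXPR, oplv, type);
          if (TREE_CODE (tv) == VECTOR_TYPE
              && (TREE_CODE (ts) != VECTOR_TYPE
                  || TYPE_VECTOR_SUBPARTS (tv) > TYPE_VECTOR_SUBPARTS (ts)))
            {
              compute_type = tv;
              opl = oplv;
            }
          else
            compute_type = ts;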
> +          compute_type = get_compute_type (LSHIFT_EXPR, opl, type);
> +          if (compute_type == TREE_TYPE (type)
> +              || compute_type != get_compute_type (RSHIFT_EXPR, opr, type)
> +              || compute_type != get_compute_type (BIT_IOR_EXPR, opo, type))
> +            compute_type = TREE_TYPE (type);
Since we have determined compute_type from ashift (let's assume that's the
one least likely to exist), I would just check that the optab is OK with
using this mode for the other 2 ops. As written, if we have shifts in 128
bits and ior in both 128 and 256 bits, we will fail (I thought that might
be the case with AVX, but apparently not). Plus it is faster ;-)
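Something along these lines (a sketch of the alternative, untested):
settle compute_type from the left shift alone, then merely verify that
the right-shift and ior optabs can handle that same mode:

          compute_type = get_compute_type (LSHIFT_EXPR, opl, type);
          if (compute_type != TREE_TYPE (type)
              && (optab_handler (opr, TYPE_MODE (compute_type))
                  == CODE_FOR_nothing
                  || optab_handler (opo, TYPE_MODE (compute_type))
                     == CODE_FOR_nothing))
            compute_type = TREE_TYPE (type);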
When we move to having vector rotate recognition in forwprop or match.pd, I
guess we'll have to make it conditional on !PROP_gimple_lvec to be safe.
Does rotate hit PR 56873? (I noticed the -mno-xop and the lack of an -mxop
test.)
--
Marc Glisse