On Fri, Aug 12, 2011 at 4:03 AM, Artem Shinkarov
<artyom.shinkar...@gmail.com> wrote:
> Hi
>
> Here is a completed version of the vector comparison patch we
> discussed a long time ago here:
> http://gcc.gnu.org/ml/gcc-patches/2010-08/msg01184.html
>
> The patch implements vector comparison according to the OpenCL
> standard, when the result of the comparison of two vectors is vector
> of signed integers, where -1 represents true and 0 false.
>
> The patch implements vector conditional res = VCOND<V1 ? V2 : V3>
> which is expanded into:
> foreach (i in length (V1)) res[i] = V1 == 0 ? V3[i] : V2[i].

Some comments on the patch below.  First, in general I don't see
why you need a new target hook to specify whether to "vectorize"
a comparison.  Why are the existing hooks used by the vectorizer
not enough?

Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c    (revision 177665)
+++ gcc/fold-const.c    (working copy)
@@ -9073,34 +9073,61 @@ fold_comparison (location_t loc, enum tr
      floating-point, we can only do some of these simplifications.)  */
   if (operand_equal_p (arg0, arg1, 0))
     {
-      switch (code)
+      if (TREE_CODE (TREE_TYPE (arg0)) == VECTOR_TYPE)
        {
-       case EQ_EXPR:
-         if (! FLOAT_TYPE_P (TREE_TYPE (arg0))
-             || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (arg0))))
-           return constant_boolean_node (1, type);

I think this change should go in a separate patch for improved
constant folding.  It shouldn't be necessary for enabling vector compares, no?

+      if (TYPE_UNSIGNED (TREE_TYPE (TREE_TYPE (ifexp))))
+        {
+          error_at (colon_loc, "vector comparison must be of signed "
+                              "integer vector type");
+          return error_mark_node;
+        }

why that?

+      if (TYPE_VECTOR_SUBPARTS (type1) != TYPE_VECTOR_SUBPARTS (type2)
+          || TYPE_VECTOR_SUBPARTS (TREE_TYPE (ifexp))
+             != TYPE_VECTOR_SUBPARTS (type1))
+        {
+          error_at (colon_loc, "vectors of different length found in "
+                               "vector comparison");
+          return error_mark_node;
+        }

I miss verification that type1 and type2 are vector types, or is that done
elsewhere?  I think type1 and type2 are already verified to be
compatible (but you might double-check).  At least the above would be
redundant with

+      if (type1 != type2)
+        {
+          error_at (colon_loc, "vectors of different types involved in "
+                               "vector comparison");
+          return error_mark_node;
+        }

Joseph may have comments about the fully-fold stuff that follows.

+      /* Currently the expansion of VEC_COND_EXPR does not allow
+        expessions where the type of vectors you compare differs
+        form the type of vectors you select from. For the time
+        being we insert implicit conversions.  */
+      if ((COMPARISON_CLASS_P (ifexp)

Why only for comparison-class?

+          && TREE_TYPE (TREE_OPERAND (ifexp, 0)) != type1)
+         || TREE_TYPE (ifexp) != type1)
+       {
+         tree comp_type = COMPARISON_CLASS_P (ifexp)
+                          ? TREE_TYPE (TREE_OPERAND (ifexp, 0))
+                          : TREE_TYPE (ifexp);
+         tree vcond;
+
+         op1 = convert (comp_type, op1);
+         op2 = convert (comp_type, op2);
+         vcond = build3 (VEC_COND_EXPR, comp_type, ifexp, op1, op2);
+         vcond = convert (type1, vcond);
+         return vcond;
+       }
+      else
+       return build3 (VEC_COND_EXPR, type1, ifexp, op1, op2);

In the end we of course will try to fix the middle-end/backends to
allow mixed types here as the current restriction doesn't really make sense.

     case EQ_EXPR:
     case NE_EXPR:
+      if (code0 == VECTOR_TYPE && code1 == VECTOR_TYPE)
+        {
+          tree intt;
+          if (TREE_TYPE (type0) != TREE_TYPE (type1))
+            {
+              error_at (location, "comparing vectors with different "
+                                  "element types");
+              return error_mark_node;
+            }
+
+          if (TYPE_VECTOR_SUBPARTS (type0) != TYPE_VECTOR_SUBPARTS (type1))
+            {
+              error_at (location, "comparing vectors with different "
+                                  "number of elements");
+              return error_mark_node;
+            }

as above - compatibility should already be ensured, thus type0 == type1
here?

+/* Try a hardware hook for vector comparison or
+   extract comparison piecewise.  */
+static tree
+expand_vector_comparison (gimple_stmt_iterator *gsi, tree type, tree op0,
+                          tree op1, enum tree_code code)

comments should mention and describe all function arguments.

+/* Expand vector condition EXP which should have the form
+   VEC_COND_EXPR<cond, vec0, vec1> into the following
+   vector:
+     {cond[i] != 0 ? vec0[i] : vec1[i], ... }
+   i changes from 0 to TYPE_VECTOR_SUBPARTS (TREE_TYPE (vec0)).  */
+static tree
+earlyexpand_vec_cond_expr (gimple_stmt_iterator *gsi, tree exp)

that would be expand_vec_cond_expr_piecewise, no?

+  /* Ensure that we will be able to expand vector comparison
+     in case it is not supported by the architecture.  */
+  gcc_assert (COMPARISON_CLASS_P (cond));

that looks dangerous to me - did you try

 vec = v1 <= v2;
 vec2 = vec ? v1 : v2;

without optimization?

+  /* Check if we need to expand vector condition inside of
+     VEC_COND_EXPR.  */
+  var = create_tmp_reg (TREE_TYPE (cond), "cond");
+  new_rhs = expand_vector_comparison (gsi, TREE_TYPE (cond),
+                                      TREE_OPERAND (cond, 0),
+                                     TREE_OPERAND (cond, 1),
+                                      TREE_CODE (cond));

That unconditionally expands, so no need for "Check".

+  /* Expand VEC_COND_EXPR into a vector of scalar COND_EXPRs.  */
+  v = VEC_alloc(constructor_elt, gc, nunits);
+  for (i = 0; i < nunits;
+       i += 1, index = int_const_binop (PLUS_EXPR, index, part_width))
+    {
+      tree tcond = tree_vec_extract (gsi, inner_type, var, part_width, index);
+      tree a = tree_vec_extract (gsi, inner_type, vec0, part_width, index);
+      tree b = tree_vec_extract (gsi, inner_type, vec1, part_width, index);
+      tree rcond = gimplify_build2 (gsi, NE_EXPR, boolean_type_node, tcond,
+                                    build_int_cst (inner_type ,0));

I seriously doubt that when expanding this part piecewise expanding
the mask first in either way is going to be beneficial.  Instead I would
suggest to "inline" the comparison here.  Thus instead of

 mask =
         = { mask[0] != 0 ? ... }

do

          = { c0[0] < c1[0] ? ..., }

or even expand the ? : using mask operations if we efficiently can
create that mask.


+  /* Check if VEC_COND_EXPR is supported in hardware within the
+     given types.  */
+  if (code == VEC_COND_EXPR)
+    {
+      tree exp = gimple_assign_rhs1 (stmt);
+      tree cond = TREE_OPERAND (exp, 0);
+
+      /* If VEC_COND_EXPR is presented as A ? V0 : V1, we
+        change it to A != {0,0,...} ? V0 : V1  */
+      if (!COMPARISON_CLASS_P (cond))
+       TREE_OPERAND (exp, 0) =
+           build2 (NE_EXPR, TREE_TYPE (cond), cond,
+                   build_vector_from_val (TREE_TYPE (cond),
+                     build_int_cst (TREE_TYPE (TREE_TYPE (cond)), 0)));

That looks inefficient as well.  Iff we know that the mask is always
either {-1, -1 ..} or {0, 0 ...} then we can expand the ? : using
bitwise operations (see what the i?86 expander does, for example).

@@ -471,6 +603,7 @@ expand_vector_operations_1 (gimple_stmt_

   gcc_assert (code != CONVERT_EXPR);

+
   /* The signedness is determined from input argument.  */
   if (code == VEC_UNPACK_FLOAT_HI_EXPR
       || code == VEC_UNPACK_FLOAT_LO_EXPR)

spurious whitespace change.

Index: gcc/tree-cfg.c
===================================================================
--- gcc/tree-cfg.c      (revision 177665)
+++ gcc/tree-cfg.c      (working copy)
@@ -3191,6 +3191,38 @@ verify_gimple_comparison (tree type, tre
       return true;
     }

+  if (TREE_CODE (op0_type) == VECTOR_TYPE
+      && TREE_CODE (op1_type) == VECTOR_TYPE
+      && TREE_CODE (type) == VECTOR_TYPE)
+    {

this should check TREE_CODE (type) == VECTOR_TYPE only
and then verify the comparison operands are actually vectors.

+      if (TREE_TYPE (op0_type) != TREE_TYPE (op1_type))
+        {
+          error ("invalid vector comparison, vector element type mismatch");
+          debug_generic_expr (op0_type);
+          debug_generic_expr (op1_type);
+          return true;
+        }

this needs to use code similar to the scalar variant,

          !useless_type_conversion_p (op0_type, op1_type)
          && !useless_type_conversion_p (op1_type, op0_type)

which also makes the first TYPE_VECTOR_SUBPARTS redundant.

+      if (TYPE_VECTOR_SUBPARTS (type) != TYPE_VECTOR_SUBPARTS (op0_type)
+          && TYPE_PRECISION (TREE_TYPE (op0_type))
+             != TYPE_PRECISION (TREE_TYPE (type)))
+        {
+          error ("invalid vector comparison resulting type");
+          debug_generic_expr (type);
+          return true;
+        }

I think you can drop the TYPE_PRECISION check.  We might want to
assert that a vector element types precision always matches its
mode precision (in make_vector_type).

Index: gcc/c-parser.c
===================================================================
--- gcc/c-parser.c      (revision 177665)
+++ gcc/c-parser.c      (working copy)
@@ -5337,8 +5337,17 @@ c_parser_conditional_expression (c_parse
   if (c_parser_next_token_is (parser, CPP_COLON))
     {
       tree eptype = NULL_TREE;
-
+
       middle_loc = c_parser_peek_token (parser)->location;
+
+      if (TREE_CODE (TREE_TYPE (cond.value)) == VECTOR_TYPE)

watch out for whitespace changes - you add a trailing tab here.

+/* Find target specific sequence for vector comparison of
+   real-type vectors V0 and V1. Returns variable containing
+   result of the comparison or NULL_TREE in other case.  */
+static tree
+vector_fp_compare (gimple_stmt_iterator *gsi, tree rettype,
+                   enum machine_mode mode, tree v0, tree v1,
+                   enum tree_code code)
+{
+  enum ix86_builtins fcode;

is there a reason we need this and cannot simply provide expanders
for the named patterns?  We'd need to give them semantics of
producing all-ones / all-zero masks of course.  Richard, do you think
that's sensible?  That way we'd avoid the new target hook and could
simply do optab queries.

Thanks,
Richard.

> ChangeLog
>
> 2011-08-12 Artjoms Sinkarovs <artyom.shinkar...@gmail.com>
>
>       gcc/
>       * targhooks.c (default_builtin_vec_compare): New hook.
>       * targhooks.h (default_builtin_vec_compare): New definition.
>       * target.def (builtin_vec_compare): New hook.
>       * target.h: New include (gimple.h).
>       * fold-const.c
>       (fold_comparison): Adjust x <cmp> x vector operations.
>       * c-typeck.c (build_binary_op): Allow vector comparison.
>       (c_obj_common_truthvalue_conversion): Deny vector comparison
>       inside of if statement.
>       (build_conditional_expr): Adjust to build VEC_COND_EXPR.
>       * tree-vect-generic.c (do_compare): Helper function.
>       (expand_vector_comparison): Check if hardware comparison
>       is available, if not expand comparison piecewise.
>       (expand_vector_operation): Handle vector comparison
>       expressions separately.
>       (earlyexpand_vec_cond_expr): Expand vector comparison
>       piecewise.
>       * Makefile.in: New dependencies.
>       * tree-cfg.c (verify_gimple_comparison): Allow vector
>       comparison operations in gimple.
>       * c-parser.c (c_parser_conditional_expression): Adjust
>       to handle VEC_COND_EXPR.
>       * gimplify.c (gimplify_expr): Adjust to handle VEC_COND_EXPR.
>       * config/i386/i386.c (vector_fp_compare): Build hardware
>       specific code for floating point vector comparison.
>       (vector_int_compare): Build hardware specific code for
>       integer vector comparison.
>       (ix86_vectorize_builtin_vec_compare): Implementation of
>       builtin_vec_compare hook.
>
>       gcc/testsuite/
>       * gcc.c-torture/execute/vector-vcond-1.c: New test.
>       * gcc.c-torture/execute/vector-vcond-2.c: New test.
>       * gcc.c-torture/execute/vector-compare-1.c: New test.
>       * gcc.c-torture/execute/vector-compare-2.c: New test.
>       * gcc.dg/vector-compare-1.c: New test.
>       * gcc.dg/vector-compare-2.c: New test.
>
>       gcc/doc
>       * extend.texi: Adjust.
>       * tm.texi: Adjust.
>       * tm.texi.in: Adjust.
>
>
> bootstrapped and tested on x86_64_unknown-linux.
>
>
> Thanks,
> Artem Shinkarov.
>

Reply via email to