On Sun, 27 Aug 2017, Jon Beniston wrote:

> Hi,
> 
> I have an out-of-tree GCC port that is struggling to support
> auto-vectorization with some dot product instructions.  For example, I have
> an instruction that takes three operands, all of which are 32-bit general
> registers.  The second and third operands are treated as V2HI vectors; their
> dot product yields an SI result, which is then added to the first operand,
> which is SI as well.
> 
> I do see there is a dot product recognizer in tree-vect-patterns.c; however,
> I found that the following testcase still can't be auto-vectorized on my
> port, which implements all the necessary dot product standard patterns.  This
> testcase can't be auto-vectorized on other targets that have similar V2HI
> dot product instructions either, for example ARC.
> 
> === test.c ===
> #define K 4
> #define M 4
> #define N 256
> int in[N*K][M];
> int out[K];
> int coeff[N][M];
> void
> foo (void)
> {
>   int i, j, k;
>   int sum;
>   for (k = 0; k < K; k++)
>     {
>       sum = 0;
>       for (j = 0; j < M; j++)
>         for (i = 0; i < N; i++)
>           sum += in[i+k][j] * coeff[i][j];
>       out[k] = sum;
>     }
> }
> ===
> 
> The reason the auto-vectorizer doesn't work seems to be that GCC doesn't
> support single-element vector types in get_vectype_for_scalar_type_and_size:
> 
> tree-vect-stmts.c: get_vectype_for_scalar_type_and_size
>   ...
>   if (nunits <= 1)
>     return NULL_TREE;
> 
> So I think this should be relaxed to support more cases, at least for
> vector reduction operations, which normally produce a scalar result of a
> wider type than the element type of the input operands.
> 
> I have tried to make the auto-vectorizer work for my V2HI dot product case,
> with the patch attached. Is this the correct approach?
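
For readers following the thread, the instruction described above can be
modelled in plain C roughly as follows.  This is only a sketch: the names
pack_v2hi and dot_v2hi_acc are made up for illustration, and signed 16-bit
lanes are assumed, since the description does not say whether the halves
are signed or unsigned.

```c
#include <stdint.h>

/* Hypothetical helper: pack two 16-bit lanes into one 32-bit register
   image, with LO in bits 0..15 and HI in bits 16..31.  */
static uint32_t
pack_v2hi (int16_t lo, int16_t hi)
{
  return (uint32_t) (uint16_t) lo | ((uint32_t) (uint16_t) hi << 16);
}

/* Hypothetical scalar model of the instruction: view A and B as V2HI,
   form the two widened products, and add their sum to the SI
   accumulator ACC.  Signed lanes are an assumption.  The conversion
   from uint16_t to int16_t is implementation-defined in ISO C; GCC
   reduces it modulo 2^16.  Overflow behaviour of a real instruction
   is not modelled here.  */
static int32_t
dot_v2hi_acc (int32_t acc, uint32_t a, uint32_t b)
{
  int16_t a0 = (int16_t) (uint16_t) (a & 0xffff);
  int16_t a1 = (int16_t) (uint16_t) (a >> 16);
  int16_t b0 = (int16_t) (uint16_t) (b & 0xffff);
  int16_t b1 = (int16_t) (uint16_t) (b >> 16);

  return acc + (int32_t) a0 * b0 + (int32_t) a1 * b1;
}
```

Note that the inputs hold two elements each while the accumulator holds a
single SI value, which is where a one-element vector type would come in.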

Hum,

@@ -9039,7 +9040,7 @@
   else
     simd_mode = mode_for_vector (inner_mode, size / nbytes);
   nunits = GET_MODE_SIZE (simd_mode) / nbytes;
-  if (nunits < 1) /* Support V1SI.  */
+  if (nunits < 1 || (nunits == 1 && !reduct_p))
     return NULL_TREE;

   vectype = build_vector_type (scalar_type, nunits);

doesn't seem to be against trunk, which has

  if (nunits <= 1)
    return NULL_TREE;

so what happens if you just change that to

  if (nunits < 1)
    return NULL_TREE;

?
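
To make the arithmetic behind that question concrete, here is a small
stand-alone model of the check.  It is not GCC code: vectype_allowed and
its parameters are hypothetical, and the 4-byte vector size used in the
note below is an assumption based on the 32-bit registers mentioned in
the original mail.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the element-count test.  VECTOR_BYTES is
   the vector size, ELEM_BYTES the size of the scalar element type.
   With ALLOW_SINGLE false this mirrors "if (nunits <= 1) return
   NULL_TREE;", with ALLOW_SINGLE true it mirrors "if (nunits < 1)".  */
static bool
vectype_allowed (unsigned vector_bytes, unsigned elem_bytes,
                 bool allow_single)
{
  unsigned nunits = vector_bytes / elem_bytes;

  return allow_single ? nunits >= 1 : nunits > 1;
}
```

Assuming 4-byte vectors, a 2-byte element gives nunits == 2 and passes
either form of the check, while a 4-byte element gives nunits == 1 and
passes only the relaxed "nunits < 1" form.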

Richard.

> Cheers,
> Jon
> 
> gcc/
> 2017-08-27  Jon Beniston <j...@beniston.com>
> 
>         * tree-vectorizer.h (get_vectype_for_scalar_type): New optional
>         parameter declaration.
>         * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Add new
>         optional parameter "reduct_p".  Support single element vector types
>         if it is true.
>         (get_vectype_for_scalar_type): Add new parameter "reduct_p".
>         * tree-vect-patterns.c (vect_pattern_recog_1): Pass new parameter
>         "reduct_p".
>         * tree-vect-loop.c (vect_determine_vectorization_factor): Likewise.
>         (vect_model_reduction_cost): Likewise.
>         (get_initial_def_for_induction): Likewise.
>         (vect_create_epilog_for_reduction): Likewise.
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)
