Victor Do Nascimento <victor.donascime...@arm.com> writes:
> Given that the GCC internals manual defines the {u|s}dot_prod<m>
> standard name as taking "two signed elements of the same mode, adding
> them to a third operand of wider mode", the relationship between the
> mode of the first two arguments and that of the third is currently
> ambiguous.
>
> This vagueness means that, in theory, different modes may be
> supportable in the third argument.  This flexibility would allow for a
> given backend to add to the accumulator a different number of
> vectorized products, e.g. a backend may provide instructions for both:
>
>   accum += a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]
>
> and
>
>   accum += a[0] * b[0] + a[1] * b[1],
>
> as is now seen in the SVE2.1 extension to AArch64.  In spite of the
> aforementioned flexibility, modeling the dot-product operation as a
> direct optab means that we have no way to encode both input and the
> accumulator data modes into the backend pattern name, which prevents
> us from harnessing this flexibility.
>
> We therefore make all dot_prod optabs conversion optabs, allowing,
> for example, the encoding of both 2-way and 4-way dot product backend
> patterns.
>
> gcc/ChangeLog:
>
>       * optabs.def (sdot_prod_optab): Convert from OPTAB_D to
>       OPTAB_CD.
>       (udot_prod_optab): Likewise.
>       (usdot_prod_optab): Likewise.
>       * doc/md.texi (Standard Names): Update entries for u, s and us
>       dot_prod names.
> ---
>  gcc/doc/md.texi | 18 +++++++++---------
>  gcc/optabs.def  |  6 +++---
>  2 files changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 7f4335e0aac..2a74e473f05 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -5748,15 +5748,15 @@ for (i = 0; i < LEN + BIAS; i++)
>      operand0 += operand2[i];
>  @end smallexample
>  
> -@cindex @code{sdot_prod@var{m}} instruction pattern
> -@item @samp{sdot_prod@var{m}}
> +@cindex @code{sdot_prod@var{m}@var{n}} instruction pattern
> +@item @samp{sdot_prod@var{m}@var{n}}
>  
>  Compute the sum of the products of two signed elements.
>  Operand 1 and operand 2 are of the same mode. Their
>  product, which is of a wider mode, is computed and added to operand 3.
>  Operand 3 is of a mode equal or wider than the mode of the product. The
>  result is placed in operand 0, which is of the same mode as operand 3.
> -@var{m} is the mode of operand 1 and operand 2.
> +@var{m} is the mode of operands 0 and 3 and @var{n} the mode of operands 1 and 2.

Now that we can put names to both modes, how about replacing the
description with something like this:

Multiply operand 1 by operand 2 without loss of precision, given that
both operands contain signed elements.  Add each product to the overlapping
element of operand 3 and store the result in operand 0.  Operands 0 and 3
have mode @var{m} and operands 1 and 2 have mode @var{n}, with @var{n}
having narrower elements than @var{m}.

This is all personal taste though, so it's just a suggestion.

Same idea for the others.
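
As a rough illustration of the wording above, a scalar C model of the
signed case might look like the sketch below.  The function names, the
element types, and the fixed lane counts are hypothetical and not part
of the patch.  The sketch also assumes that "overlapping" means each
accumulator lane takes the products of the adjacent narrow elements
that occupy the same bits.

```c
#include <stdint.h>

/* Hypothetical 4-way model: operands 1 and 2 hold sixteen 8-bit
   elements, operands 0 and 3 hold four 32-bit elements.  Each 32-bit
   lane accumulates four widened products.  */
static void
sdot4_model (int32_t out[4], const int8_t a[16], const int8_t b[16],
             const int32_t acc[4])
{
  for (int i = 0; i < 4; i++)
    {
      int32_t sum = acc[i];
      for (int j = 0; j < 4; j++)
        /* Widen before multiplying so that no precision is lost.  */
        sum += (int32_t) a[4 * i + j] * (int32_t) b[4 * i + j];
      out[i] = sum;
    }
}

/* Hypothetical 2-way model: operands 1 and 2 hold eight 16-bit
   elements, and each 32-bit lane accumulates two widened products.  */
static void
sdot2_model (int32_t out[4], const int16_t a[8], const int16_t b[8],
             const int32_t acc[4])
{
  for (int i = 0; i < 4; i++)
    {
      int32_t sum = acc[i];
      for (int j = 0; j < 2; j++)
        sum += (int32_t) a[2 * i + j] * (int32_t) b[2 * i + j];
      out[i] = sum;
    }
}
```

Both models share the same accumulator type and differ only in the
input element type, which is the distinction a single-mode pattern
name cannot express.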

OK with that change from my POV, but happy to hear other suggestions.

Thanks,
Richard

>  Semantically the expressions perform the multiplication in the following signs
>  
> @@ -5766,15 +5766,15 @@ sdot<signed op0, signed op1, signed op2, signed op3> ==
>  @dots{}
>  @end smallexample
>  
> -@cindex @code{udot_prod@var{m}} instruction pattern
> -@item @samp{udot_prod@var{m}}
> +@cindex @code{udot_prod@var{m}@var{n}} instruction pattern
> +@item @samp{udot_prod@var{m}@var{n}}
>  
>  Compute the sum of the products of two unsigned elements.
>  Operand 1 and operand 2 are of the same mode. Their
>  product, which is of a wider mode, is computed and added to operand 3.
>  Operand 3 is of a mode equal or wider than the mode of the product. The
>  result is placed in operand 0, which is of the same mode as operand 3.
> -@var{m} is the mode of operand 1 and operand 2.
> +@var{m} is the mode of operands 0 and 3 and @var{n} the mode of operands 1 and 2.
>  
>  Semantically the expressions perform the multiplication in the following signs
>  
> @@ -5784,14 +5784,14 @@ udot<unsigned op0, unsigned op1, unsigned op2, unsigned op3> ==
>  @dots{}
>  @end smallexample
>  
> -@cindex @code{usdot_prod@var{m}} instruction pattern
> -@item @samp{usdot_prod@var{m}}
> +@cindex @code{usdot_prod@var{m}@var{n}} instruction pattern
> +@item @samp{usdot_prod@var{m}@var{n}}
>  Compute the sum of the products of elements of different signs.
>  Operand 1 must be unsigned and operand 2 signed. Their
>  product, which is of a wider mode, is computed and added to operand 3.
>  Operand 3 is of a mode equal or wider than the mode of the product. The
>  result is placed in operand 0, which is of the same mode as operand 3.
> -@var{m} is the mode of operand 1 and operand 2.
> +@var{m} is the mode of operands 0 and 3 and @var{n} the mode of operands 1 and 2.
>  
>  Semantically the expressions perform the multiplication in the following signs
>  
> diff --git a/gcc/optabs.def b/gcc/optabs.def
> index 45e117a7f50..fce4b2d5b08 100644
> --- a/gcc/optabs.def
> +++ b/gcc/optabs.def
> @@ -106,6 +106,9 @@ OPTAB_CD(mask_scatter_store_optab, "mask_scatter_store$a$b")
>  OPTAB_CD(mask_len_scatter_store_optab, "mask_len_scatter_store$a$b")
>  OPTAB_CD(vec_extract_optab, "vec_extract$a$b")
>  OPTAB_CD(vec_init_optab, "vec_init$a$b")
> +OPTAB_CD (sdot_prod_optab, "sdot_prod$I$a$b")
> +OPTAB_CD (udot_prod_optab, "udot_prod$I$a$b")
> +OPTAB_CD (usdot_prod_optab, "usdot_prod$I$a$b")
>  
>  OPTAB_CD (while_ult_optab, "while_ult$a$b")
>  
> @@ -409,10 +412,7 @@ OPTAB_D (savg_floor_optab, "avg$a3_floor")
>  OPTAB_D (uavg_floor_optab, "uavg$a3_floor")
>  OPTAB_D (savg_ceil_optab, "avg$a3_ceil")
>  OPTAB_D (uavg_ceil_optab, "uavg$a3_ceil")
> -OPTAB_D (sdot_prod_optab, "sdot_prod$I$a")
>  OPTAB_D (ssum_widen_optab, "widen_ssum$I$a3")
> -OPTAB_D (udot_prod_optab, "udot_prod$I$a")
> -OPTAB_D (usdot_prod_optab, "usdot_prod$I$a")
>  OPTAB_D (usum_widen_optab, "widen_usum$I$a3")
>  OPTAB_D (usad_optab, "usad$I$a")
>  OPTAB_D (ssad_optab, "ssad$I$a")
