Tamar Christina <tamar.christ...@arm.com> writes:
>> -----Original Message-----
>> From: Richard Sandiford <richard.sandif...@arm.com>
>> Sent: Monday, July 12, 2021 10:39 AM
>> To: Tamar Christina <tamar.christ...@arm.com>
>> Cc: Richard Biener <rguent...@suse.de>; nd <n...@arm.com>; gcc-
>> patc...@gcc.gnu.org
>> Subject: Re: [PATCH 1/4]middle-end Vect: Add support for dot-product
>> where the sign for the multiplicant changes.
>> 
>> Tamar Christina <tamar.christ...@arm.com> writes:
>> > Hi,
>> >
>> >> Richard Sandiford <richard.sandif...@arm.com> writes:
>> >> >> @@ -992,21 +1029,27 @@ vect_recog_dot_prod_pattern (vec_info
>> >> *vinfo,
>> >> >>    /* FORNOW.  Can continue analyzing the def-use chain when this
>> >> >> stmt in
>> >> a phi
>> >> >>       inside the loop (in case we are analyzing an outer-loop).  */
>> >> >>    vect_unpromoted_value unprom0[2];
>> >> >> +  enum optab_subtype subtype = optab_vector;
>> >> >>    if (!vect_widened_op_tree (vinfo, mult_vinfo, MULT_EXPR,
>> >> WIDEN_MULT_EXPR,
>> >> >> -                           false, 2, unprom0, &half_type))
>> >> >> +                           false, 2, unprom0, &half_type, &subtype))
>> >> >> +    return NULL;
>> >> >> +
>> >> >> +  if (subtype == optab_vector_mixed_sign
>> >> >> +      && TYPE_UNSIGNED (unprom_mult.type)
>> >> >> +      && TYPE_PRECISION (half_type) * 4 > TYPE_PRECISION
>> >> >> + (unprom_mult.type))
>> >> >>      return NULL;
>> >> >
>> >> > Isn't the final condition here instead that TYPE1 is narrower than 
>> >> > TYPE2?
>> >> > I.e. we need to reject the case in which we multiply a signed and
>> >> > an unsigned value to get a (logically) signed result, but then
>> >> > zero-extend it (rather than sign-extend it) to the precision of the
>> addition.
>> >> >
>> >> > That would make the test:
>> >> >
>> >> >   if (subtype == optab_vector_mixed_sign
>> >> >       && TYPE_UNSIGNED (unprom_mult.type)
>> >> >       && TYPE_PRECISION (unprom_mult.type) < TYPE_PRECISION (type))
>> >> >     return NULL;
>> >> >
>> >> > instead.
>> >>
>> >> And folding that into the existing test gives:
>> >>
>> >>   /* If there are two widening operations, make sure they agree on the
>> sign
>> >>      of the extension.  The result of an optab_vector_mixed_sign operation
>> >>      is signed; otherwise, the result has the same sign as the operands.  
>> >> */
>> >>   if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
>> >>       && (subtype == optab_vector_mixed_sign
>> >>     ? TYPE_UNSIGNED (unprom_mult.type)
>> >>     : TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type)))
>> >>     return NULL;
>> >>
>> >
>> > I went with the first one which doesn't add the extra constraints for
>> > the normal dotproduct as that makes it too restrictive. It's the type
>> > of the multiplication that determines the operation so dotproduct can
>> > be used a bit more than where we currently do.
>> >
>> > This was relaxed in an earlier patch.
>> 
>> I didn't mean that we should add extra constraints to the normal case though.
>> The existing test I was referring to above was:
>> 
>>   /* If there are two widening operations, make sure they agree on
>>      the sign of the extension.  */
>>   if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
>>       && TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type))
>>     return NULL;
>
> But as I mentioned, this restriction is unneeded and has been removed hence 
> why it's not in my patchset's diff.
> It's removed by 
> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569851.html which Richi 
> conditioned on
> the rest of these patches being approved.
>
> This change needlessly blocks test vect-reduc-dot-[2,3,6,7].c from being 
> dotproducts for instance
>
> It's also part of the deficiency between GCC codegen and Clang 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88492#c6

Hmm, OK.  Just removing the check regresses:

unsigned long __attribute__ ((noipa))
f (signed short *x, signed short *y)
{
  unsigned long res = 0;
  for (int i = 0; i < 100; ++i)
    res += (unsigned int) x[i] * (unsigned int) y[i];
  return res;
}

int
main (void)
{
  signed short x[100], y[100];
  for (int i = 0; i < 100; ++i)
    {
      x[i] = -1;
      y[i] = 1;
    }
  if (f (x, y) != 0x6400000000ULL - 100)
    __builtin_abort ();
  return 0;
}

on SVE.  We then use SDOT even though the result of the multiplication
is zero- rather than sign-extended to 64 bits.  Does something else
in the series stop that from that happening?

Richard

Reply via email to