Re: [PATCH] MATCH: add abs support for half float

Richard Biener Fri, 02 Aug 2024 03:36:57 -0700

On Fri, Aug 2, 2024 at 11:20 AM Kugan Vivekanandarajah
<kvivekana...@nvidia.com> wrote:
>
>
>
> > On 1 Aug 2024, at 10:46 pm, Richard Biener <richard.guent...@gmail.com> 
> > wrote:
> >
> > External email: Use caution opening links or attachments
> >
> >
> > On Thu, Aug 1, 2024 at 5:31 AM Kugan Vivekanandarajah
> > <kvivekana...@nvidia.com> wrote:
> >>
> >>
> >> On Mon, Jul 29, 2024 at 10:11 AM Andrew Pinski <pins...@gmail.com> wrote:
> >>>
> >>> On Mon, Jul 29, 2024 at 12:57 AM Kugan Vivekanandarajah
> >>> <kugan...@gmail.com> wrote:
> >>>>
> >>>> On Thu, Jul 25, 2024 at 10:19 PM Richard Biener
> >>>> <richard.guent...@gmail.com> wrote:
> >>>>>
> >>>>> On Thu, Jul 25, 2024 at 4:42 AM Kugan Vivekanandarajah
> >>>>> <kugan...@gmail.com> wrote:
> >>>>>>
> >>>>>> On Tue, Jul 23, 2024 at 11:56 PM Richard Biener
> >>>>>> <richard.guent...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> On Tue, Jul 23, 2024 at 10:27 AM Kugan Vivekanandarajah
> >>>>>>> <kugan...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> On Tue, Jul 23, 2024 at 10:35 AM Andrew Pinski <pins...@gmail.com> 
> >>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> On Mon, Jul 22, 2024 at 5:26 PM Kugan Vivekanandarajah
> >>>>>>>>> <kvivekana...@nvidia.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Revised based on the comment and moved it into existing patterns 
> >>>>>>>>>> as.
> >>>>>>>>>>
> >>>>>>>>>> gcc/ChangeLog:
> >>>>>>>>>>
> >>>>>>>>>> * match.pd: Extend A CMP 0 ? A : -A into (type)A CMP 0 ? A : -A.
> >>>>>>>>>> Extend A CMP 0 ? A : -A into (type) A CMP 0 ? A : -A.
> >>>>>>>>>>
> >>>>>>>>>> gcc/testsuite/ChangeLog:
> >>>>>>>>>>
> >>>>>>>>>> * gcc.dg/tree-ssa/absfloat16.c: New test.
> >>>>>>>>>
> >>>>>>>>> The testcase needs to make sure it runs only for targets that 
> >>>>>>>>> support
> >>>>>>>>> float16 so like:
> >>>>>>>>>
> >>>>>>>>> /* { dg-require-effective-target float16 } */
> >>>>>>>>> /* { dg-add-options float16 } */
> >>>>>>>> Added in the attached version.
> >>>>>>>
> >>>>>>> + /* (type)A >=/> 0 ? A : -A    same as abs (A) */
> >>>>>>>  (for cmp (ge gt)
> >>>>>>>   (simplify
> >>>>>>> -   (cnd (cmp @0 zerop) @1 (negate @1))
> >>>>>>> -    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
> >>>>>>> -        && !TYPE_UNSIGNED (TREE_TYPE(@0))
> >>>>>>> -        && bitwise_equal_p (@0, @1))
> >>>>>>> +   (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
> >>>>>>> +    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
> >>>>>>> +        && !TYPE_UNSIGNED (TREE_TYPE (@1))
> >>>>>>> +        && ((VECTOR_TYPE_P (type)
> >>>>>>> +             && tree_nop_conversion_p (TREE_TYPE (@0), TREE_TYPE 
> >>>>>>> (@1)))
> >>>>>>> +           || (!VECTOR_TYPE_P (type)
> >>>>>>> +               && (TYPE_PRECISION (TREE_TYPE (@1))
> >>>>>>> +                   <= TYPE_PRECISION (TREE_TYPE (@0)))))
> >>>>>>> +        && bitwise_equal_p (@1, @2))
> >>>>>>>
> >>>>>>> I wonder about the bitwise_equal_p which tests @1 against @2 now
> >>>>>>> with the convert still applied to @1 - that looks odd.  You are 
> >>>>>>> allowing
> >>>>>>> sign-changing conversions but doesn't that change ge/gt behavior?
> >>>>>>> Also why are sign/zero-extensions not OK for vector types?
> >>>>>> Thanks for the review.
> >>>>>> My main motivation here is for _Float16  as below.
> >>>>>>
> >>>>>> _Float16 absfloat16 (_Float16 x)
> >>>>>> {
> >>>>>>  float _1;
> >>>>>>  _Float16 _2;
> >>>>>>  _Float16 _4;
> >>>>>>  <bb 2> [local count: 1073741824]:
> >>>>>>  _1 = (float) x_3(D);
> >>>>>>  if (_1 < 0.0)
> >>>>>>    goto <bb 3>; [41.00%]
> >>>>>>  else
> >>>>>>    goto <bb 4>; [59.00%]
> >>>>>>  <bb 3> [local count: 440234144]:\
> >>>>>>  _4 = -x_3(D);
> >>>>>>  <bb 4> [local count: 1073741824]:
> >>>>>>  # _2 = PHI <_4(3), x_3(D)(2)>
> >>>>>>  return _2;
> >>>>>> }
> >>>>>>
> >>>>>> This is why I added  bitwise_equal_p test of @1 against @2 with
> >>>>>> TYPE_PRECISION checks.
> >>>>>> I agree that I will have to check for sign-changing conversions.
> >>>>>>
> >>>>>> Just to keep it simple, I disallowed vector types. I am not sure if
> >>>>>> this would  hit vec types. I am happy to handle this if that is
> >>>>>> needed.
> >>>>>
> >>>>> I think with __builtin_convertvector you should be able to construct
> >>>>> a testcase that does
> >>>> Thanks.
> >>>>
> >>>> For the pattern,
> >>>> ```
> >>>> /* A >=/> 0 ? A : -A    same as abs (A) */
> >>>> (for cmp (ge gt)
> >>>>  (simplify
> >>>>   (cnd (cmp @0 zerop) @1 (negate @1))
> >>>>    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
> >>>>         && !TYPE_UNSIGNED (TREE_TYPE(@0))
> >>>>         && bitwise_equal_p (@0, @1))
> >>>>     (if (TYPE_UNSIGNED (type))
> >>>>      (absu:type @0)
> >>>>      (abs @0)))))
> >>>> ```
> >>>> the vector type doesn't seem right. For example, if we have a 4
> >>>> element vector with some negative and positive, I don't think  it
> >>>> makes sense. Also, we dont seem to generate  (cmp @0 zerop). Am I
> >>>> missing it completely?
> >>>
> >>> Looks like I missed adding some vector testcases anyways here is one
> >>> to get this, note it is C++ due to the C front-end not support `?:`
> >>> for vectors yet (there is a patch).
> >>> ```
> >>> #define vect8 __attribute__((vector_size(8)))
> >>> vect8 int f(vect8 int a)
> >>> {
> >>>  vect8 int na = -a;
> >>>  return (a > 0) ? a : na;
> >>> }
> >>> ```
> >>> At -O2 before forwprop1, we have:
> >>> ```
> >>>  vector(2) intD.9 a_2(D) = aD.2796;
> >>>  vector(2) intD.9 naD.2799;
> >>>  vector(2) <signed-boolean:32> _1;
> >>>  vector(2) intD.9 _4;
> >>>
> >>>  na_3 = -a_2(D);
> >>>  _1 = a_2(D) > { 0, 0 };
> >>>  _4 = VEC_COND_EXPR <_1, a_2(D), na_3>;
> >>> ```
> >>> And forwprop using match is able to do:
> >>> ```
> >>> Applying pattern match.pd:6306, gimple-match-10.cc:19843
> >>> gimple_simplified to _4 = ABS_EXPR <a_2(D)>;
> >>> Removing dead stmt:_1 = a_2(D) > { 0, 0 };
> >>> Removing dead stmt:na_3 = -a_2(D);
> >>> ```
> >>> (replace int with float and add  -fno-signed-zeros you can get the ABS 
> >>> also).
> >>>
> >>> Note comparisons with vector types always generate a vector boolean
> >>> type. So cond_expr will never show up with a vector comparison; only
> >>> vec_cond.
> >>
> >>
> >> Thanks for the information. I have revised the patch by adding test case 
> >> for the vector type. I will look at removing the VEC_COND_EXPR as a follow 
> >> up.
> >
> > This is likely a major task so do not get you distracted - I just mentioned 
> > it.
> >
> >> This is still rejected by the type checker.
> >> v2hi  absvect2 (v2hi x, int i) {
> >>    v2hi neg = -x;
> >>    v2si tmp = __builtin_convertvector (x, v2si);
> >>    return (tmp > 0) ? x : neg;
> >> }
> >> as:
> >> error: incompatible vector types in conditional expression: '__vector(2) 
> >> int', 'v2hi' {aka '__vector(2) short int'} and 'v2hi' {aka '__vector(2) 
> >> short int'}
> >> 8 |     return (tmp > 0) ? x : neg;
> >>    |            ~~~~~~~~~~^~~~~~~~~
> >>
> >> Also
> >>
> >> -      (absu:type @0)
> >> -      (abs @0)))))
> >> +      (absu:type @1)
> >> +      (abs @1)))))
> >>
> >> Changing the @1 to @2  is resulting in failures.  Existing tests like  the 
> >> following would be optimized away in this case.
> >> ```
> >> unsigned abs_with_convert0 (int x)
> >> {
> >>    unsigned int y = x;
> >>
> >>    if (x < 0)
> >>        y = -y;
> >>
> >>  return y;
> >> }
> >
> > How so - the TYPE_UNSIGNED tests should make the pattern not trigger here?
> >
> >> ```
> >> This follows the same intent as the original pattern.
> >>
> >> Bootstrapped and regression tested on aarch64-linux-gnu. Is this OK for 
> >> trunk.
> >
> >
> >
> > -   (cnd (cmp @0 zerop) @1 (negate @1))
> > -    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
> > -        && !TYPE_UNSIGNED (TREE_TYPE(@0))
> > -        && bitwise_equal_p (@0, @1))
> > +   (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
> > +    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
> > +        && !TYPE_UNSIGNED (TREE_TYPE (@1))
> > +        && !TYPE_UNSIGNED (TREE_TYPE (@0))
> >
> > I think the outer type could be allowed signed if ...
> >
> > +        && ((VECTOR_TYPE_P (TREE_TYPE (@0))
> > +             && VECTOR_TYPE_P (TREE_TYPE (@1))
> > +             && known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0)),
> > +                          TYPE_VECTOR_SUBPARTS (TREE_TYPE (@1)))
> >
> > this is always true
> >
> > +             && element_precision (TREE_TYPE (@1))
> > +                 <= element_precision (TREE_TYPE (@0)))
> > +           || (!VECTOR_TYPE_P (TREE_TYPE (@0))
> > +               && (TYPE_PRECISION (TREE_TYPE (@1))
> > +                   <= TYPE_PRECISION (TREE_TYPE (@0)))))
> >
> > ... you make the precision strictly larger.  With both unsigned the same
> > precision case should have been stripped anyway.
> >
> > You can use element_precision for both vector and non-vector so I think this
> > should simplify to just checking element_precision.
> >
> > +      (absu:type @1)
> > +      (abs @1)))))
> >
> > I still think this needs to be @2 for type correctness.  @1 is only
> > bitwise equal,
> > it could have different sign.  I think we should drop bitwise_equal_p with 
> > the
> > convert now in and instead have a matching constraint.
>
> Thanks, so this should translate to:
>
> /* (type)A >=/> 0 ? A : -A    same as abs (A) */
>  (for cmp (ge gt)
>   (simplify
>    (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
>     (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
>          && !TYPE_UNSIGNED (TREE_TYPE (@1))


Oh, I mis-read this as unsigned, so this makes the conversions
to always sign-extend which means it's important to ensure
the type of @0 is also signed.

>          && !TYPE_UNSIGNED (TREE_TYPE (@2))
>          && ((element_precision (TREE_TYPE (@1))
>                 < element_precision (TREE_TYPE (@0))
>               || operand_equal_p (@1, @0)))

which means <= might be better then.


>          && bitwise_equal_p (@1, @2))
>      (if (TYPE_UNSIGNED (type))
>       (absu:type @2)
>       (abs @2)))))
>
> However with this, I am seeing: Note that the @2 for these are unsigned.

I see, so we rely on @1 being the signed equivalent of @2 which might or
might not be signed.  I guess that indeed we want @2 here.

> Tests that now fail, but worked before (4 tests):
>
> gcc: gcc.dg/tree-ssa/phi-opt-36.c scan-tree-dump-not phiopt2 "if "
> gcc: gcc.dg/tree-ssa/phi-opt-36.c scan-tree-dump-times phiopt1 "if " 2
> gcc: gcc.dg/tree-ssa/phi-opt-37.c scan-tree-dump-not phiopt1 "if "gcc: 
> gcc.dg/tree-ssa/phi-opt-37.c scan-tree-dump-times phiopt1 "ABSU_EXPR <" 2
>
>
> unsigned f2(int A)
> {
>   unsigned t = A;
> //     A >= 0? A : -A    same as abs (A)
>   if (A >= 0)  return t;
>   return -t;
> }
>
> unsigned abs_with_convert0 (int x)
> {
>     unsigned int y = x;
>
>     if (x < 0)
>         y = -y;
>
>   return y;
> }
> unsigned abs_with_convert1 (unsigned x)
> {
>     int y = x;
>
>     if (y < 0)
>         x = -x;
>
>   return x;
> }
>
> In the original pattern, we have check for  !TYPE_UNSIGNED (TREE_TYPE(@0)) 
> and  (absu:type @0)
>
> Shouldnt that translate !TYPE_UNSIGNED (TREE_TYPE(@1)) and  (absu:type @1) in 
> the new pattern
>
> Thanks,
> Kugan
>
> >
> > Same comments apply to the 2nd pattern update.
> >
> > Richard.
> >
> >> Thanks,
> >> Kugan
> >>
> >> gcc/ChangeLog:
> >>
> >>        * match.pd: Extend A CMP 0 ? A : -A into (type)A CMP 0 ? A : -A.
> >>        Extend A CMP 0 ? A : -A into (type) A CMP 0 ? A : -A.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >>        * g++.dg/absvect.C: New test.
> >>        * gcc.dg/tree-ssa/absfloat16.c: New test.
> >>
> >>
> >> signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
> >>
> >>
> >> Note I think we should be able to merge VEC_COND_EXPR and COND_EXPR
> >> codes by instead looking whether the condition is vector or scalar.  
> >> Fallout
> >> might be a bit too big, but it might be a thing to try.
> >>
> >> Richard.
> >>
> >>> Thanks,
> >>> Andrew Pinski
> >>>
> >>>>
> >>>> Thanks,
> >>>> Kugan
> >>>>
> >>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> +      (absu:type @1)
> >>>>>>> +      (abs @1)))))
> >>>>>>>
> >>>>>>> I think this should use @2 now.
> >>>>>> I will change this.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Kugan
> >>>>>>
> >>>>>>>
> >>>>>>>> Thanks.
> >>>>>>>> Kugan
> >>>>>>>>>
> >>>>>>>>> (like what is in gcc.dg/c11-floatn-3.c and others).
> >>>>>>>>>
> >>>>>>>>> Other than that it looks good but I can't approve it.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Andrew Pinski
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
> >>>>>>>>>>
> >>>>>>>>>> Bootstrapped and regression test on aarch64-linux-gnu. Is this OK 
> >>>>>>>>>> for trunk?
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Kugan
> >>>>>>>>>>
> >>>>>>>>>> ________________________________
> >>>>>>>>>> From: Andrew Pinski <pins...@gmail.com>
> >>>>>>>>>> Sent: Monday, 15 July 2024 5:30 AM
> >>>>>>>>>> To: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
> >>>>>>>>>> Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; 
> >>>>>>>>>> richard.guent...@gmail.com <richard.guent...@gmail.com>
> >>>>>>>>>> Subject: Re: [PATCH] MATCH: add abs support for half float
> >>>>>>>>>>
> >>>>>>>>>> External email: Use caution opening links or attachments
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Sun, Jul 14, 2024 at 1:12 AM Kugan Vivekanandarajah
> >>>>>>>>>> <kvivekana...@nvidia.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> This patch extends abs detection in matched for half float.
> >>>>>>>>>>>
> >>>>>>>>>>> Bootstrapped and regression test on aarch64-linux-gnu. Is this OK 
> >>>>>>>>>>> for trunk?
> >>>>>>>>>>
> >>>>>>>>>> This is basically this pattern:
> >>>>>>>>>> ```
> >>>>>>>>>> /* A >=/> 0 ? A : -A    same as abs (A) */
> >>>>>>>>>> (for cmp (ge gt)
> >>>>>>>>>>  (simplify
> >>>>>>>>>>   (cnd (cmp @0 zerop) @1 (negate @1))
> >>>>>>>>>>    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
> >>>>>>>>>>         && !TYPE_UNSIGNED (TREE_TYPE(@0))
> >>>>>>>>>>         && bitwise_equal_p (@0, @1))
> >>>>>>>>>>     (if (TYPE_UNSIGNED (type))
> >>>>>>>>>>      (absu:type @0)
> >>>>>>>>>>      (abs @0)))))
> >>>>>>>>>> ```
> >>>>>>>>>>
> >>>>>>>>>> except extended to handle an optional convert. Why didn't you just
> >>>>>>>>>> extend the above pattern to handle the convert instead? Also I 
> >>>>>>>>>> think
> >>>>>>>>>> you have an issue with unsigned types with the comparison.
> >>>>>>>>>> Also you should extend the -abs(A) pattern right below it in a 
> >>>>>>>>>> similar fashion.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Andrew Pinski
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> gcc/ChangeLog:
> >>>>>>>>>>>
> >>>>>>>>>>> * match.pd: Add pattern to convert (type)A >=/> 0 ? A : -A into 
> >>>>>>>>>>> abs (A) for half float.
> >>>>>>>>>>>
> >>>>>>>>>>> gcc/testsuite/ChangeLog:
> >>>>>>>>>>>
> >>>>>>>>>>> * gcc.dg/tree-ssa/absfloat16.c: New test.
> >>>>>>>>>>>
> >>>>>>>>>>> Signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
>
>

Re: [PATCH] MATCH: add abs support for half float

Reply via email to