Re: [PATCH] MATCH: add abs support for half float

Richard Biener Thu, 01 Aug 2024 05:47:30 -0700

On Thu, Aug 1, 2024 at 5:31 AM Kugan Vivekanandarajah
<kvivekana...@nvidia.com> wrote:
>
>
> On Mon, Jul 29, 2024 at 10:11 AM Andrew Pinski <pins...@gmail.com> wrote:
> >
> > On Mon, Jul 29, 2024 at 12:57 AM Kugan Vivekanandarajah
> > <kugan...@gmail.com> wrote:
> > >
> > > On Thu, Jul 25, 2024 at 10:19 PM Richard Biener
> > > <richard.guent...@gmail.com> wrote:
> > > >
> > > > On Thu, Jul 25, 2024 at 4:42 AM Kugan Vivekanandarajah
> > > > <kugan...@gmail.com> wrote:
> > > > >
> > > > > On Tue, Jul 23, 2024 at 11:56 PM Richard Biener
> > > > > <richard.guent...@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Jul 23, 2024 at 10:27 AM Kugan Vivekanandarajah
> > > > > > <kugan...@gmail.com> wrote:
> > > > > > >
> > > > > > > On Tue, Jul 23, 2024 at 10:35 AM Andrew Pinski 
> > > > > > > <pins...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Jul 22, 2024 at 5:26 PM Kugan Vivekanandarajah
> > > > > > > > <kvivekana...@nvidia.com> wrote:
> > > > > > > > >
> > > > > > > > > Revised based on the comment and moved it into existing 
> > > > > > > > > patterns as.
> > > > > > > > >
> > > > > > > > > gcc/ChangeLog:
> > > > > > > > >
> > > > > > > > > * match.pd: Extend A CMP 0 ? A : -A into (type)A CMP 0 ? A : 
> > > > > > > > > -A.
> > > > > > > > > Extend A CMP 0 ? A : -A into (type) A CMP 0 ? A : -A.
> > > > > > > > >
> > > > > > > > > gcc/testsuite/ChangeLog:
> > > > > > > > >
> > > > > > > > > * gcc.dg/tree-ssa/absfloat16.c: New test.
> > > > > > > >
> > > > > > > > The testcase needs to make sure it runs only for targets that 
> > > > > > > > support
> > > > > > > > float16 so like:
> > > > > > > >
> > > > > > > > /* { dg-require-effective-target float16 } */
> > > > > > > > /* { dg-add-options float16 } */
> > > > > > > Added in the attached version.
> > > > > >
> > > > > > + /* (type)A >=/> 0 ? A : -A    same as abs (A) */
> > > > > >   (for cmp (ge gt)
> > > > > >    (simplify
> > > > > > -   (cnd (cmp @0 zerop) @1 (negate @1))
> > > > > > -    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
> > > > > > -        && !TYPE_UNSIGNED (TREE_TYPE(@0))
> > > > > > -        && bitwise_equal_p (@0, @1))
> > > > > > +   (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
> > > > > > +    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
> > > > > > +        && !TYPE_UNSIGNED (TREE_TYPE (@1))
> > > > > > +        && ((VECTOR_TYPE_P (type)
> > > > > > +             && tree_nop_conversion_p (TREE_TYPE (@0), TREE_TYPE 
> > > > > > (@1)))
> > > > > > +           || (!VECTOR_TYPE_P (type)
> > > > > > +               && (TYPE_PRECISION (TREE_TYPE (@1))
> > > > > > +                   <= TYPE_PRECISION (TREE_TYPE (@0)))))
> > > > > > +        && bitwise_equal_p (@1, @2))
> > > > > >
> > > > > > I wonder about the bitwise_equal_p which tests @1 against @2 now
> > > > > > with the convert still applied to @1 - that looks odd.  You are 
> > > > > > allowing
> > > > > > sign-changing conversions but doesn't that change ge/gt behavior?
> > > > > > Also why are sign/zero-extensions not OK for vector types?
> > > > > Thanks for the review.
> > > > > My main motivation here is for _Float16  as below.
> > > > >
> > > > > _Float16 absfloat16 (_Float16 x)
> > > > > {
> > > > >   float _1;
> > > > >   _Float16 _2;
> > > > >   _Float16 _4;
> > > > >   <bb 2> [local count: 1073741824]:
> > > > >   _1 = (float) x_3(D);
> > > > >   if (_1 < 0.0)
> > > > >     goto <bb 3>; [41.00%]
> > > > >   else
> > > > >     goto <bb 4>; [59.00%]
> > > > >   <bb 3> [local count: 440234144]:\
> > > > >   _4 = -x_3(D);
> > > > >   <bb 4> [local count: 1073741824]:
> > > > >   # _2 = PHI <_4(3), x_3(D)(2)>
> > > > >   return _2;
> > > > > }
> > > > >
> > > > > This is why I added  bitwise_equal_p test of @1 against @2 with
> > > > > TYPE_PRECISION checks.
> > > > > I agree that I will have to check for sign-changing conversions.
> > > > >
> > > > > Just to keep it simple, I disallowed vector types. I am not sure if
> > > > > this would  hit vec types. I am happy to handle this if that is
> > > > > needed.
> > > >
> > > > I think with __builtin_convertvector you should be able to construct
> > > > a testcase that does
> > > Thanks.
> > >
> > > For the pattern,
> > > ```
> > >  /* A >=/> 0 ? A : -A    same as abs (A) */
> > >  (for cmp (ge gt)
> > >   (simplify
> > >    (cnd (cmp @0 zerop) @1 (negate @1))
> > >     (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
> > >          && !TYPE_UNSIGNED (TREE_TYPE(@0))
> > >          && bitwise_equal_p (@0, @1))
> > >      (if (TYPE_UNSIGNED (type))
> > >       (absu:type @0)
> > >       (abs @0)))))
> > > ```
> > > the vector type doesn't seem right. For example, if we have a 4
> > > element vector with some negative and positive, I don't think  it
> > > makes sense. Also, we dont seem to generate  (cmp @0 zerop). Am I
> > > missing it completely?
> >
> > Looks like I missed adding some vector testcases anyways here is one
> > to get this, note it is C++ due to the C front-end not support `?:`
> > for vectors yet (there is a patch).
> > ```
> > #define vect8 __attribute__((vector_size(8)))
> > vect8 int f(vect8 int a)
> > {
> >   vect8 int na = -a;
> >   return (a > 0) ? a : na;
> > }
> > ```
> > At -O2 before forwprop1, we have:
> > ```
> >   vector(2) intD.9 a_2(D) = aD.2796;
> >   vector(2) intD.9 naD.2799;
> >   vector(2) <signed-boolean:32> _1;
> >   vector(2) intD.9 _4;
> >
> >   na_3 = -a_2(D);
> >   _1 = a_2(D) > { 0, 0 };
> >   _4 = VEC_COND_EXPR <_1, a_2(D), na_3>;
> > ```
> > And forwprop using match is able to do:
> > ```
> > Applying pattern match.pd:6306, gimple-match-10.cc:19843
> > gimple_simplified to _4 = ABS_EXPR <a_2(D)>;
> > Removing dead stmt:_1 = a_2(D) > { 0, 0 };
> > Removing dead stmt:na_3 = -a_2(D);
> > ```
> > (replace int with float and add  -fno-signed-zeros you can get the ABS 
> > also).
> >
> > Note comparisons with vector types always generate a vector boolean
> > type. So cond_expr will never show up with a vector comparison; only
> > vec_cond.
>
>
> Thanks for the information. I have revised the patch by adding test case for 
> the vector type. I will look at removing the VEC_COND_EXPR as a follow up.


This is likely a major task so do not get you distracted - I just mentioned it.

> This is still rejected by the type checker.
> v2hi  absvect2 (v2hi x, int i) {
>     v2hi neg = -x;
>     v2si tmp = __builtin_convertvector (x, v2si);
>     return (tmp > 0) ? x : neg;
> }
> as:
> error: incompatible vector types in conditional expression: '__vector(2) 
> int', 'v2hi' {aka '__vector(2) short int'} and 'v2hi' {aka '__vector(2) short 
> int'}
> 8 |     return (tmp > 0) ? x : neg;
>     |            ~~~~~~~~~~^~~~~~~~~
>
> Also
>
> -      (absu:type @0)
> -      (abs @0)))))
> +      (absu:type @1)
> +      (abs @1)))))
>
> Changing the @1 to @2  is resulting in failures.  Existing tests like  the 
> following would be optimized away in this case.
> ```
> unsigned abs_with_convert0 (int x)
> {
>     unsigned int y = x;
>
>     if (x < 0)
>         y = -y;
>
>   return y;
> }

How so - the TYPE_UNSIGNED tests should make the pattern not trigger here?

> ```
> This follows the same intent as the original pattern.
>
> Bootstrapped and regression tested on aarch64-linux-gnu. Is this OK for trunk.



-   (cnd (cmp @0 zerop) @1 (negate @1))
-    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
-        && !TYPE_UNSIGNED (TREE_TYPE(@0))
-        && bitwise_equal_p (@0, @1))
+   (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
+    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
+        && !TYPE_UNSIGNED (TREE_TYPE (@1))
+        && !TYPE_UNSIGNED (TREE_TYPE (@0))

I think the outer type could be allowed signed if ...

+        && ((VECTOR_TYPE_P (TREE_TYPE (@0))
+             && VECTOR_TYPE_P (TREE_TYPE (@1))
+             && known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0)),
+                          TYPE_VECTOR_SUBPARTS (TREE_TYPE (@1)))

this is always true

+             && element_precision (TREE_TYPE (@1))
+                 <= element_precision (TREE_TYPE (@0)))
+           || (!VECTOR_TYPE_P (TREE_TYPE (@0))
+               && (TYPE_PRECISION (TREE_TYPE (@1))
+                   <= TYPE_PRECISION (TREE_TYPE (@0)))))

... you make the precision strictly larger.  With both unsigned the same
precision case should have been stripped anyway.

You can use element_precision for both vector and non-vector so I think this
should simplify to just checking element_precision.

+      (absu:type @1)
+      (abs @1)))))

I still think this needs to be @2 for type correctness.  @1 is only
bitwise equal,
it could have different sign.  I think we should drop bitwise_equal_p with the
convert now in and instead have a matching constraint.

Same comments apply to the 2nd pattern update.

Richard.

> Thanks,
> Kugan
>
> gcc/ChangeLog:
>
>         * match.pd: Extend A CMP 0 ? A : -A into (type)A CMP 0 ? A : -A.
>         Extend A CMP 0 ? A : -A into (type) A CMP 0 ? A : -A.
>
> gcc/testsuite/ChangeLog:
>
>         * g++.dg/absvect.C: New test.
>         * gcc.dg/tree-ssa/absfloat16.c: New test.
>
>
> signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
>
>
> Note I think we should be able to merge VEC_COND_EXPR and COND_EXPR
> codes by instead looking whether the condition is vector or scalar.  Fallout
> might be a bit too big, but it might be a thing to try.
>
> Richard.
>
> > Thanks,
> > Andrew Pinski
> >
> > >
> > > Thanks,
> > > Kugan
> > >
> > > >
> > > > >
> > > > > >
> > > > > > +      (absu:type @1)
> > > > > > +      (abs @1)))))
> > > > > >
> > > > > > I think this should use @2 now.
> > > > > I will change this.
> > > > >
> > > > > Thanks,
> > > > > Kugan
> > > > >
> > > > > >
> > > > > > > Thanks.
> > > > > > > Kugan
> > > > > > > >
> > > > > > > > (like what is in gcc.dg/c11-floatn-3.c and others).
> > > > > > > >
> > > > > > > > Other than that it looks good but I can't approve it.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Andrew Pinski
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Signed-off-by: Kugan Vivekanandarajah 
> > > > > > > > > <kvivekana...@nvidia.com>
> > > > > > > > >
> > > > > > > > > Bootstrapped and regression test on aarch64-linux-gnu. Is 
> > > > > > > > > this OK for trunk?
> > > > > > > > > Thanks,
> > > > > > > > > Kugan
> > > > > > > > >
> > > > > > > > > ________________________________
> > > > > > > > > From: Andrew Pinski <pins...@gmail.com>
> > > > > > > > > Sent: Monday, 15 July 2024 5:30 AM
> > > > > > > > > To: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
> > > > > > > > > Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; 
> > > > > > > > > richard.guent...@gmail.com <richard.guent...@gmail.com>
> > > > > > > > > Subject: Re: [PATCH] MATCH: add abs support for half float
> > > > > > > > >
> > > > > > > > > External email: Use caution opening links or attachments
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Sun, Jul 14, 2024 at 1:12 AM Kugan Vivekanandarajah
> > > > > > > > > <kvivekana...@nvidia.com> wrote:
> > > > > > > > > >
> > > > > > > > > > This patch extends abs detection in matched for half float.
> > > > > > > > > >
> > > > > > > > > > Bootstrapped and regression test on aarch64-linux-gnu. Is 
> > > > > > > > > > this OK for trunk?
> > > > > > > > >
> > > > > > > > > This is basically this pattern:
> > > > > > > > > ```
> > > > > > > > >  /* A >=/> 0 ? A : -A    same as abs (A) */
> > > > > > > > >  (for cmp (ge gt)
> > > > > > > > >   (simplify
> > > > > > > > >    (cnd (cmp @0 zerop) @1 (negate @1))
> > > > > > > > >     (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
> > > > > > > > >          && !TYPE_UNSIGNED (TREE_TYPE(@0))
> > > > > > > > >          && bitwise_equal_p (@0, @1))
> > > > > > > > >      (if (TYPE_UNSIGNED (type))
> > > > > > > > >       (absu:type @0)
> > > > > > > > >       (abs @0)))))
> > > > > > > > > ```
> > > > > > > > >
> > > > > > > > > except extended to handle an optional convert. Why didn't you 
> > > > > > > > > just
> > > > > > > > > extend the above pattern to handle the convert instead? Also 
> > > > > > > > > I think
> > > > > > > > > you have an issue with unsigned types with the comparison.
> > > > > > > > > Also you should extend the -abs(A) pattern right below it in 
> > > > > > > > > a similar fashion.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Andrew Pinski
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > gcc/ChangeLog:
> > > > > > > > > >
> > > > > > > > > > * match.pd: Add pattern to convert (type)A >=/> 0 ? A : -A 
> > > > > > > > > > into abs (A) for half float.
> > > > > > > > > >
> > > > > > > > > > gcc/testsuite/ChangeLog:
> > > > > > > > > >
> > > > > > > > > > * gcc.dg/tree-ssa/absfloat16.c: New test.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Kugan Vivekanandarajah 
> > > > > > > > > > <kvivekana...@nvidia.com>
> > > > > > > > > >

Re: [PATCH] MATCH: add abs support for half float

Reply via email to