Re: [PATCH] MATCH: add abs support for half float

Andrew Pinski Tue, 27 Aug 2024 21:23:56 -0700

On Tue, Aug 27, 2024 at 8:54 PM Kugan Vivekanandarajah
<kvivekana...@nvidia.com> wrote:
>
> Hi Richard,
>
> Thanks for the reply.
>
> > On 27 Aug 2024, at 7:05 pm, Richard Biener <richard.guent...@gmail.com> 
> > wrote:
> >
> > External email: Use caution opening links or attachments
> >
> >
> > On Tue, Aug 27, 2024 at 8:23 AM Kugan Vivekanandarajah
> > <kvivekana...@nvidia.com> wrote:
> >>
> >> Hi Richard,
> >>
> >>> On 22 Aug 2024, at 10:34 pm, Richard Biener <richard.guent...@gmail.com> 
> >>> wrote:
> >>>
> >>> External email: Use caution opening links or attachments
> >>>
> >>>
> >>> On Wed, Aug 21, 2024 at 12:08 PM Kugan Vivekanandarajah
> >>> <kvivekana...@nvidia.com> wrote:
> >>>>
> >>>> Hi Richard,
> >>>>
> >>>>> On 20 Aug 2024, at 6:09 pm, Richard Biener <richard.guent...@gmail.com> 
> >>>>> wrote:
> >>>>>
> >>>>> External email: Use caution opening links or attachments
> >>>>>
> >>>>>
> >>>>> On Fri, Aug 9, 2024 at 2:39 AM Kugan Vivekanandarajah
> >>>>> <kvivekana...@nvidia.com> wrote:
> >>>>>>
> >>>>>> Thanks for the comments.
> >>>>>>
> >>>>>>> On 2 Aug 2024, at 8:36 pm, Richard Biener 
> >>>>>>> <richard.guent...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> External email: Use caution opening links or attachments
> >>>>>>>
> >>>>>>>
> >>>>>>> On Fri, Aug 2, 2024 at 11:20 AM Kugan Vivekanandarajah
> >>>>>>> <kvivekana...@nvidia.com> wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> On 1 Aug 2024, at 10:46 pm, Richard Biener 
> >>>>>>>>> <richard.guent...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> External email: Use caution opening links or attachments
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Thu, Aug 1, 2024 at 5:31 AM Kugan Vivekanandarajah
> >>>>>>>>> <kvivekana...@nvidia.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Jul 29, 2024 at 10:11 AM Andrew Pinski <pins...@gmail.com> 
> >>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Jul 29, 2024 at 12:57 AM Kugan Vivekanandarajah
> >>>>>>>>>>> <kugan...@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Thu, Jul 25, 2024 at 10:19 PM Richard Biener
> >>>>>>>>>>>> <richard.guent...@gmail.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Thu, Jul 25, 2024 at 4:42 AM Kugan Vivekanandarajah
> >>>>>>>>>>>>> <kugan...@gmail.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Tue, Jul 23, 2024 at 11:56 PM Richard Biener
> >>>>>>>>>>>>>> <richard.guent...@gmail.com> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Tue, Jul 23, 2024 at 10:27 AM Kugan Vivekanandarajah
> >>>>>>>>>>>>>>> <kugan...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Tue, Jul 23, 2024 at 10:35 AM Andrew Pinski 
> >>>>>>>>>>>>>>>> <pins...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Mon, Jul 22, 2024 at 5:26 PM Kugan Vivekanandarajah
> >>>>>>>>>>>>>>>>> <kvivekana...@nvidia.com> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Revised based on the comment and moved it into existing 
> >>>>>>>>>>>>>>>>>> patterns as.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> gcc/ChangeLog:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> * match.pd: Extend A CMP 0 ? A : -A into (type)A CMP 0 ? A 
> >>>>>>>>>>>>>>>>>> : -A.
> >>>>>>>>>>>>>>>>>> Extend A CMP 0 ? A : -A into (type) A CMP 0 ? A : -A.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> gcc/testsuite/ChangeLog:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> * gcc.dg/tree-ssa/absfloat16.c: New test.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> The testcase needs to make sure it runs only for targets 
> >>>>>>>>>>>>>>>>> that support
> >>>>>>>>>>>>>>>>> float16 so like:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> /* { dg-require-effective-target float16 } */
> >>>>>>>>>>>>>>>>> /* { dg-add-options float16 } */
> >>>>>>>>>>>>>>>> Added in the attached version.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> + /* (type)A >=/> 0 ? A : -A    same as abs (A) */
> >>>>>>>>>>>>>>> (for cmp (ge gt)
> >>>>>>>>>>>>>>> (simplify
> >>>>>>>>>>>>>>> -   (cnd (cmp @0 zerop) @1 (negate @1))
> >>>>>>>>>>>>>>> -    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
> >>>>>>>>>>>>>>> -        && !TYPE_UNSIGNED (TREE_TYPE(@0))
> >>>>>>>>>>>>>>> -        && bitwise_equal_p (@0, @1))
> >>>>>>>>>>>>>>> +   (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
> >>>>>>>>>>>>>>> +    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
> >>>>>>>>>>>>>>> +        && !TYPE_UNSIGNED (TREE_TYPE (@1))
> >>>>>>>>>>>>>>> +        && ((VECTOR_TYPE_P (type)
> >>>>>>>>>>>>>>> +             && tree_nop_conversion_p (TREE_TYPE (@0), 
> >>>>>>>>>>>>>>> TREE_TYPE (@1)))
> >>>>>>>>>>>>>>> +           || (!VECTOR_TYPE_P (type)
> >>>>>>>>>>>>>>> +               && (TYPE_PRECISION (TREE_TYPE (@1))
> >>>>>>>>>>>>>>> +                   <= TYPE_PRECISION (TREE_TYPE (@0)))))
> >>>>>>>>>>>>>>> +        && bitwise_equal_p (@1, @2))
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I wonder about the bitwise_equal_p which tests @1 against @2 
> >>>>>>>>>>>>>>> now
> >>>>>>>>>>>>>>> with the convert still applied to @1 - that looks odd.  You 
> >>>>>>>>>>>>>>> are allowing
> >>>>>>>>>>>>>>> sign-changing conversions but doesn't that change ge/gt 
> >>>>>>>>>>>>>>> behavior?
> >>>>>>>>>>>>>>> Also why are sign/zero-extensions not OK for vector types?
> >>>>>>>>>>>>>> Thanks for the review.
> >>>>>>>>>>>>>> My main motivation here is for _Float16  as below.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> _Float16 absfloat16 (_Float16 x)
> >>>>>>>>>>>>>> {
> >>>>>>>>>>>>>> float _1;
> >>>>>>>>>>>>>> _Float16 _2;
> >>>>>>>>>>>>>> _Float16 _4;
> >>>>>>>>>>>>>> <bb 2> [local count: 1073741824]:
> >>>>>>>>>>>>>> _1 = (float) x_3(D);
> >>>>>>>>>>>>>> if (_1 < 0.0)
> >>>>>>>>>>>>>> goto <bb 3>; [41.00%]
> >>>>>>>>>>>>>> else
> >>>>>>>>>>>>>> goto <bb 4>; [59.00%]
> >>>>>>>>>>>>>> <bb 3> [local count: 440234144]:\
> >>>>>>>>>>>>>> _4 = -x_3(D);
> >>>>>>>>>>>>>> <bb 4> [local count: 1073741824]:
> >>>>>>>>>>>>>> # _2 = PHI <_4(3), x_3(D)(2)>
> >>>>>>>>>>>>>> return _2;
> >>>>>>>>>>>>>> }
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> This is why I added  bitwise_equal_p test of @1 against @2 with
> >>>>>>>>>>>>>> TYPE_PRECISION checks.
> >>>>>>>>>>>>>> I agree that I will have to check for sign-changing 
> >>>>>>>>>>>>>> conversions.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Just to keep it simple, I disallowed vector types. I am not 
> >>>>>>>>>>>>>> sure if
> >>>>>>>>>>>>>> this would  hit vec types. I am happy to handle this if that is
> >>>>>>>>>>>>>> needed.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I think with __builtin_convertvector you should be able to 
> >>>>>>>>>>>>> construct
> >>>>>>>>>>>>> a testcase that does
> >>>>>>>>>>>> Thanks.
> >>>>>>>>>>>>
> >>>>>>>>>>>> For the pattern,
> >>>>>>>>>>>> ```
> >>>>>>>>>>>> /* A >=/> 0 ? A : -A    same as abs (A) */
> >>>>>>>>>>>> (for cmp (ge gt)
> >>>>>>>>>>>> (simplify
> >>>>>>>>>>>> (cnd (cmp @0 zerop) @1 (negate @1))
> >>>>>>>>>>>> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
> >>>>>>>>>>>>     && !TYPE_UNSIGNED (TREE_TYPE(@0))
> >>>>>>>>>>>>     && bitwise_equal_p (@0, @1))
> >>>>>>>>>>>> (if (TYPE_UNSIGNED (type))
> >>>>>>>>>>>>  (absu:type @0)
> >>>>>>>>>>>>  (abs @0)))))
> >>>>>>>>>>>> ```
> >>>>>>>>>>>> the vector type doesn't seem right. For example, if we have a 4
> >>>>>>>>>>>> element vector with some negative and positive, I don't think  it
> >>>>>>>>>>>> makes sense. Also, we dont seem to generate  (cmp @0 zerop). Am I
> >>>>>>>>>>>> missing it completely?
> >>>>>>>>>>>
> >>>>>>>>>>> Looks like I missed adding some vector testcases anyways here is 
> >>>>>>>>>>> one
> >>>>>>>>>>> to get this, note it is C++ due to the C front-end not support 
> >>>>>>>>>>> `?:`
> >>>>>>>>>>> for vectors yet (there is a patch).
> >>>>>>>>>>> ```
> >>>>>>>>>>> #define vect8 __attribute__((vector_size(8)))
> >>>>>>>>>>> vect8 int f(vect8 int a)
> >>>>>>>>>>> {
> >>>>>>>>>>> vect8 int na = -a;
> >>>>>>>>>>> return (a > 0) ? a : na;
> >>>>>>>>>>> }
> >>>>>>>>>>> ```
> >>>>>>>>>>> At -O2 before forwprop1, we have:
> >>>>>>>>>>> ```
> >>>>>>>>>>> vector(2) intD.9 a_2(D) = aD.2796;
> >>>>>>>>>>> vector(2) intD.9 naD.2799;
> >>>>>>>>>>> vector(2) <signed-boolean:32> _1;
> >>>>>>>>>>> vector(2) intD.9 _4;
> >>>>>>>>>>>
> >>>>>>>>>>> na_3 = -a_2(D);
> >>>>>>>>>>> _1 = a_2(D) > { 0, 0 };
> >>>>>>>>>>> _4 = VEC_COND_EXPR <_1, a_2(D), na_3>;
> >>>>>>>>>>> ```
> >>>>>>>>>>> And forwprop using match is able to do:
> >>>>>>>>>>> ```
> >>>>>>>>>>> Applying pattern match.pd:6306, gimple-match-10.cc:19843
> >>>>>>>>>>> gimple_simplified to _4 = ABS_EXPR <a_2(D)>;
> >>>>>>>>>>> Removing dead stmt:_1 = a_2(D) > { 0, 0 };
> >>>>>>>>>>> Removing dead stmt:na_3 = -a_2(D);
> >>>>>>>>>>> ```
> >>>>>>>>>>> (replace int with float and add  -fno-signed-zeros you can get 
> >>>>>>>>>>> the ABS also).
> >>>>>>>>>>>
> >>>>>>>>>>> Note comparisons with vector types always generate a vector 
> >>>>>>>>>>> boolean
> >>>>>>>>>>> type. So cond_expr will never show up with a vector comparison; 
> >>>>>>>>>>> only
> >>>>>>>>>>> vec_cond.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Thanks for the information. I have revised the patch by adding 
> >>>>>>>>>> test case for the vector type. I will look at removing the 
> >>>>>>>>>> VEC_COND_EXPR as a follow up.
> >>>>>>>>>
> >>>>>>>>> This is likely a major task so do not get you distracted - I just 
> >>>>>>>>> mentioned it.
> >>>>>>>>>
> >>>>>>>>>> This is still rejected by the type checker.
> >>>>>>>>>> v2hi  absvect2 (v2hi x, int i) {
> >>>>>>>>>> v2hi neg = -x;
> >>>>>>>>>> v2si tmp = __builtin_convertvector (x, v2si);
> >>>>>>>>>> return (tmp > 0) ? x : neg;
> >>>>>>>>>> }
> >>>>>>>>>> as:
> >>>>>>>>>> error: incompatible vector types in conditional expression: 
> >>>>>>>>>> '__vector(2) int', 'v2hi' {aka '__vector(2) short int'} and 'v2hi' 
> >>>>>>>>>> {aka '__vector(2) short int'}
> >>>>>>>>>> 8 |     return (tmp > 0) ? x : neg;
> >>>>>>>>>> |            ~~~~~~~~~~^~~~~~~~~
> >>>>>>>>>>
> >>>>>>>>>> Also
> >>>>>>>>>>
> >>>>>>>>>> -      (absu:type @0)
> >>>>>>>>>> -      (abs @0)))))
> >>>>>>>>>> +      (absu:type @1)
> >>>>>>>>>> +      (abs @1)))))
> >>>>>>>>>>
> >>>>>>>>>> Changing the @1 to @2  is resulting in failures.  Existing tests 
> >>>>>>>>>> like  the following would be optimized away in this case.
> >>>>>>>>>> ```
> >>>>>>>>>> unsigned abs_with_convert0 (int x)
> >>>>>>>>>> {
> >>>>>>>>>> unsigned int y = x;
> >>>>>>>>>>
> >>>>>>>>>> if (x < 0)
> >>>>>>>>>>    y = -y;
> >>>>>>>>>>
> >>>>>>>>>> return y;
> >>>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> How so - the TYPE_UNSIGNED tests should make the pattern not 
> >>>>>>>>> trigger here?
> >>>>>>>>>
> >>>>>>>>>> ```
> >>>>>>>>>> This follows the same intent as the original pattern.
> >>>>>>>>>>
> >>>>>>>>>> Bootstrapped and regression tested on aarch64-linux-gnu. Is this 
> >>>>>>>>>> OK for trunk.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> -   (cnd (cmp @0 zerop) @1 (negate @1))
> >>>>>>>>> -    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
> >>>>>>>>> -        && !TYPE_UNSIGNED (TREE_TYPE(@0))
> >>>>>>>>> -        && bitwise_equal_p (@0, @1))
> >>>>>>>>> +   (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
> >>>>>>>>> +    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
> >>>>>>>>> +        && !TYPE_UNSIGNED (TREE_TYPE (@1))
> >>>>>>>>> +        && !TYPE_UNSIGNED (TREE_TYPE (@0))
> >>>>>>>>>
> >>>>>>>>> I think the outer type could be allowed signed if ...
> >>>>>>>>>
> >>>>>>>>> +        && ((VECTOR_TYPE_P (TREE_TYPE (@0))
> >>>>>>>>> +             && VECTOR_TYPE_P (TREE_TYPE (@1))
> >>>>>>>>> +             && known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0)),
> >>>>>>>>> +                          TYPE_VECTOR_SUBPARTS (TREE_TYPE (@1)))
> >>>>>>>>>
> >>>>>>>>> this is always true
> >>>>>>>>>
> >>>>>>>>> +             && element_precision (TREE_TYPE (@1))
> >>>>>>>>> +                 <= element_precision (TREE_TYPE (@0)))
> >>>>>>>>> +           || (!VECTOR_TYPE_P (TREE_TYPE (@0))
> >>>>>>>>> +               && (TYPE_PRECISION (TREE_TYPE (@1))
> >>>>>>>>> +                   <= TYPE_PRECISION (TREE_TYPE (@0)))))
> >>>>>>>>>
> >>>>>>>>> ... you make the precision strictly larger.  With both unsigned the 
> >>>>>>>>> same
> >>>>>>>>> precision case should have been stripped anyway.
> >>>>>>>>>
> >>>>>>>>> You can use element_precision for both vector and non-vector so I 
> >>>>>>>>> think this
> >>>>>>>>> should simplify to just checking element_precision.
> >>>>>>>>>
> >>>>>>>>> +      (absu:type @1)
> >>>>>>>>> +      (abs @1)))))
> >>>>>>>>>
> >>>>>>>>> I still think this needs to be @2 for type correctness.  @1 is only
> >>>>>>>>> bitwise equal,
> >>>>>>>>> it could have different sign.  I think we should drop 
> >>>>>>>>> bitwise_equal_p with the
> >>>>>>>>> convert now in and instead have a matching constraint.
> >>>>>>>>
> >>>>>>>> Thanks, so this should translate to:
> >>>>>>>>
> >>>>>>>> /* (type)A >=/> 0 ? A : -A    same as abs (A) */
> >>>>>>>> (for cmp (ge gt)
> >>>>>>>> (simplify
> >>>>>>>> (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
> >>>>>>>> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
> >>>>>>>>      && !TYPE_UNSIGNED (TREE_TYPE (@1))
> >>>>>>>
> >>>>>>> Oh, I mis-read this as unsigned, so this makes the conversions
> >>>>>>> to always sign-extend which means it's important to ensure
> >>>>>>> the type of @0 is also signed.
> >>>>>>>
> >>>>>>>>      && !TYPE_UNSIGNED (TREE_TYPE (@2))
> >>>>>>>>      && ((element_precision (TREE_TYPE (@1))
> >>>>>>>>             < element_precision (TREE_TYPE (@0))
> >>>>>>>>           || operand_equal_p (@1, @0)))
> >>>>>>>
> >>>>>>> which means <= might be better then.
> >>>>>>>
> >>>>>>>
> >>>>>>>>      && bitwise_equal_p (@1, @2))
> >>>>>>>>  (if (TYPE_UNSIGNED (type))
> >>>>>>>>   (absu:type @2)
> >>>>>>>>   (abs @2)))))
> >>>>>>>>
> >>>>>>>> However with this, I am seeing: Note that the @2 for these are 
> >>>>>>>> unsigned.
> >>>>>>>
> >>>>>>> I see, so we rely on @1 being the signed equivalent of @2 which might 
> >>>>>>> or
> >>>>>>> might not be signed.  I guess that indeed we want @2 here.
> >>>>>> Sorry I am not sure I understand this. When @2 is unsigned, ABS_EXPR 
> >>>>>> and ABSU_EXPR is not
> >>>>>> For example:
> >>>>>> With
> >>>>>> /* (type)A >=/> 0 ? A : -A    same as abs (A) */
> >>>>>> (for cmp (ge gt)
> >>>>>> (simplify
> >>>>>> (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
> >>>>>>  (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
> >>>>>>       && !TYPE_UNSIGNED (TREE_TYPE (@0))
> >>>>>>       && !TYPE_UNSIGNED (TREE_TYPE (@1))
> >>>>>>       && element_precision (TREE_TYPE (@1))
> >>>>>>              <= element_precision (TREE_TYPE (@0))
> >>>>>>       && bitwise_equal_p (@1, @2))
> >>>>>>   (if (TYPE_UNSIGNED (type))
> >>>>>>    (absu:type @2)
> >>>>>>    (abs @2)))))
> >>>>>>
> >>>>>> The test case below becomes:
> >>>>>>
> >>>>>> unsigned abs_with_convert0 (int x)
> >>>>>> {
> >>>>>>  unsigned int y = x;
> >>>>>>
> >>>>>>  if (x < 0)
> >>>>>>      y = -y;
> >>>>>>
> >>>>>> return y;
> >>>>>> }
> >>>>>>
> >>>>>> unsigned int abs_with_convert0 (int x)
> >>>>>> {
> >>>>>> unsigned int y;
> >>>>>>
> >>>>>> <bb 2> :
> >>>>>> y_3 = (unsigned int) x_2(D);
> >>>>>> if (x_2(D) < 0)
> >>>>>>  goto <bb 3>; [INV]
> >>>>>> else
> >>>>>>  goto <bb 4>; [INV]
> >>>>>>
> >>>>>> <bb 3> :
> >>>>>> y_4 = -y_3;
> >>>>>>
> >>>>>> <bb 4> :
> >>>>>> # y_1 = PHI <y_3(2), y_4(3)>
> >>>>>> return y_1;
> >>>>>>
> >>>>>> }
> >>>>>>
> >>>>>> To:
> >>>>>>
> >>>>>>
> >>>>>> unsigned int abs_with_convert0 (int x)
> >>>>>> {
> >>>>>> unsigned int y;
> >>>>>> unsigned int _5;
> >>>>>>
> >>>>>> <bb 2> :
> >>>>>> y_3 = (unsigned int) x_2(D);
> >>>>>> _5 = ABSU_EXPR <y_3>;
> >>>>>> return _5;
> >>>>>>
> >>>>>> }
> >>>>>>
> >>>>>> This is an invalid gimple. Are you suggesting that we should also 
> >>>>>> check if @2 is not UNSIGNED? If that is what you want, couple of test 
> >>>>>> cases in the test suite including phi-opt-37.c would fail.
> >>>>>
> >>>>> Trying to swap in the discussion after vacation ... iff @2 might be
> >>>>> unsigned then you'd need
> >>>>> a conversion to signed to make 'abs' make sense.
> >>>>>
> >>>>>> (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
> >>>>>>  (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
> >>>>>>       && !TYPE_UNSIGNED (TREE_TYPE (@0))
> >>>>>>       && !TYPE_UNSIGNED (TREE_TYPE (@1))
> >>>>>
> >>>>> so this doesn't put any constraints on the signedness of @2
> >>>>>
> >>>>> What kind of extensions are valid on @1?  I think this warrants a
> >>>>> comment.  I think in the end the compare should be signed to
> >>>>> fit 'abs' semantics, but a zero-extension (unsigned @1) should be
> >>>>> OK?  So why constrain the sign of @1?
> >>>> I thing it is the conversion to signed (sign extend) that we want to 
> >>>> support, But for cases like:
> >>>> signed abs_with_convert1 (unsigned x)
> >>>> {
> >>>>   int y = x;
> >>>>
> >>>>   if (y < 0)
> >>>>       y = -x;
> >>>>
> >>>> return y;
> >>>> }
> >>>> @2 is  unsigned.
> >>>>
> >>>>>>       && element_precision (TREE_TYPE (@1))
> >>>>>>              <= element_precision (TREE_TYPE (@0))
> >>>>>>       && bitwise_equal_p (@1, @2))
> >>>>>>   (if (TYPE_UNSIGNED (type))
> >>>>>>    (absu:type @2)
> >>>>>>    (abs @2)))))
> >>>>>
> >>>>> Of course then you'd want to convert @2 to signed?
> >>>>
> >>>> We have some test cases where @2 is unsigned where original pattern 
> >>>> triggers and will not with the new pattern. For example 
> >>>> gcc.dg/tree-ssa/phi-opt-37.c.
> >>>>
> >>>> Should we create a new pattern as:
> >>>> /* (type)A >=/> 0 ? A : -A    same as abs (A) */
> >>>> (for cmp (ge gt)
> >>>> (simplify
> >>>>  (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
> >>>>   (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
> >>>>        && !TYPE_UNSIGNED (TREE_TYPE (@1))
> >>>>        && !TYPE_UNSIGNED (TREE_TYPE (@2))
> >>>>        && element_precision (TREE_TYPE (@1))
> >>>>               <= element_precision (TREE_TYPE (@0))
> >>>>        && bitwise_equal_p (@1, @2))
> >>>>    (if (TYPE_UNSIGNED (type))
> >>>>     (abs @2)
> >>>>     (abs @2)))))
> >>>>
> >>>> And leave the old pattern as it is? Otherwise if we decide to use @2, we 
> >>>> have to XFAIL these cases unless I am missing spmething.
> >>>
> >>> No, I think we want a single pattern.  I meant you possibly need to do
> >>>
> >>> (with
> >>> { tree stype = signed_type_for (TREE_TYPE (@2)); }
> >>> (if (TYPE_UNSIGNED (type))
> >>>  (absu (convert:stype @2))
> >>>  (abs (convert:stype @2))))
> >>>
> >>> for when @2 is unsigned?
> >>
> >> Thanks. I have changed accordingly.
> >>
> >> I have also changed gcc.dg/tree-ssa/phi-opt-37.c to check phiopt2 as in 
> >> phiopt1 we don’t now match because of:
> >> phiopt match-simplify back:
> >> _5 = (signed int) x_2(D);
> >> _7 = ABSU_EXPR <_5>;
> >> result: _7
> >> rejected because early
> >
> > ?  I don't understand what you are saying.
>
> This is the dump from phiopt. Even though it is matching,
> /* Return TRUE if SEQ/OP pair should be allowed during early phiopt.
>    Currently this is to allow MIN/MAX and ABS/NEGATE and constants.  */
> static bool
> phiopt_early_allow (gimple_seq &seq, gimple_match_op &op) is rejecting it, as 
> the dump shows.


So I looked into this slightly. I think the match patterns are greedy
when it comes to optional converts, in that match tries to match with
convert first and for integer types but bitwise_equal_p will handle a
nop_convert here.
So for scalar integer types, maybe skip `TYPE_PRECISION (TREE_TYPE
(@1)) == TYPE_PRECISION (TREE_TYPE (@0))` as that is handled via
bitwise_equal_p call and we don't want an extra convert being added by
the simplification since it is already there as we might not recover
an usage of that until much later. An example is if the ssa name of
the convert is used twice and we now have 2 ssa names with the same
convert.

Thanks,
Andrew Pinski


>
> >
> >> Bootstrapped and regression tested on aarch64-linux-gnu. Is this OK?
> >
> > In both cases you do
> >
> > -   (cnd (cmp @0 zerop) @1 (negate @1))
> > -    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
> > -        && !TYPE_UNSIGNED (TREE_TYPE(@0))
> > ...
> > +   (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
> > +    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
> > +        /* Support SEXT of @0 only.  */
> > +        && !TYPE_UNSIGNED (TREE_TYPE (@1))
> > +        && element_precision (TREE_TYPE (@1))
> > +               <= element_precision (TREE_TYPE (@0))
> >
> > where formerly we require a signed comparison, a check that you drop
> > and one that IMO is important (I would guess for unsigned we never
> > match the pattern as compares against zero are simplified to eq/ne,
> > but still).  I think that
>
> From what I understand, we allow a signed comparison even now.
> We don’t allow unsigned comparison agains zero as earlier. We allow 
> conversion of unsigned type to signed type before comparing. Am I missing 
> something?
>
> >
> > unsigned a;
> > (int)a < 0 : -a : a
>
> >
> > is valid to be matched to ABSU but is excluded by the requirement
> > of a sign-extension.
>
> This is currently handled as :
> unsigned int abs1 (unsigned int x)
> {
>   signed int _6;
>   unsigned int _7;
>
>   <bb 2> [local count: 1073741824]:
>   _6 = (signed int) x_3(D);
>   _7 = ABSU_EXPR <_6>;
>   return _7;
>
> }
>
> >
> > unsigned a;
> > (long)a < 0 : -a : a
> >
> > isn't a ABSU though.  That means
>
> Yes, this gets folded in .015.cfg as:
>
> Removing basic block 3
> Merging blocks 2 and 4
> Merging blocks 2 and 5
> ;; 1 loops found
> ;;
> ;; Loop 0
> ;;  header 0, latch 1
> ;;  depth 0, outer -1
> ;;  nodes: 0 1 2
> ;; 2 succs { 1 }
> unsigned int abs2 (unsigned int x)
> {
>   unsigned int D.4427;
>
>   <bb 2> :
>   D.4427 = x;
>   // predicted unlikely by early return (on trees) predictor.
>   return D.4427;
>
> }
>
>
> Thanks,
> Kugan
> >
> >  /* Support sign-change or SEXT of @0 only.  */
> >  && (element_precision (TREE_TYPE (@1)) == element_precision (TREE_TYPE 
> > (@0))
> >         || (!TYPE_UNSIGNED (...)
> >             && ...))
> >
> > would be better.  That might also solve your problem with phi-opt-37.c?
> >
> > Richard.
> >
> >
> >> Thanks,
> >> Kugan
> >>
> >>
> >>
> >>>
> >>> Richard.
> >>>
> >>>
> >>>> Thanks,
> >>>> Kugan
> >>>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Kugan
> >>>>>>
> >>>>>>>
> >>>>>>>> Tests that now fail, but worked before (4 tests):
> >>>>>>>>
> >>>>>>>> gcc: gcc.dg/tree-ssa/phi-opt-36.c scan-tree-dump-not phiopt2 "if "
> >>>>>>>> gcc: gcc.dg/tree-ssa/phi-opt-36.c scan-tree-dump-times phiopt1 "if " 
> >>>>>>>> 2
> >>>>>>>> gcc: gcc.dg/tree-ssa/phi-opt-37.c scan-tree-dump-not phiopt1 "if 
> >>>>>>>> "gcc: gcc.dg/tree-ssa/phi-opt-37.c scan-tree-dump-times phiopt1 
> >>>>>>>> "ABSU_EXPR <" 2
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> unsigned f2(int A)
> >>>>>>>> {
> >>>>>>>> unsigned t = A;
> >>>>>>>> //     A >= 0? A : -A    same as abs (A)
> >>>>>>>> if (A >= 0)  return t;
> >>>>>>>> return -t;
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> unsigned abs_with_convert0 (int x)
> >>>>>>>> {
> >>>>>>>> unsigned int y = x;
> >>>>>>>>
> >>>>>>>> if (x < 0)
> >>>>>>>>     y = -y;
> >>>>>>>>
> >>>>>>>> return y;
> >>>>>>>> }
> >>>>>>>> unsigned abs_with_convert1 (unsigned x)
> >>>>>>>> {
> >>>>>>>> int y = x;
> >>>>>>>>
> >>>>>>>> if (y < 0)
> >>>>>>>>     x = -x;
> >>>>>>>>
> >>>>>>>> return x;
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> In the original pattern, we have check for  !TYPE_UNSIGNED 
> >>>>>>>> (TREE_TYPE(@0)) and  (absu:type @0)
> >>>>>>>>
> >>>>>>>> Shouldnt that translate !TYPE_UNSIGNED (TREE_TYPE(@1)) and  
> >>>>>>>> (absu:type @1) in the new pattern
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Kugan
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Same comments apply to the 2nd pattern update.
> >>>>>>>>>
> >>>>>>>>> Richard.
> >>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Kugan
> >>>>>>>>>>
> >>>>>>>>>> gcc/ChangeLog:
> >>>>>>>>>>
> >>>>>>>>>>    * match.pd: Extend A CMP 0 ? A : -A into (type)A CMP 0 ? A : -A.
> >>>>>>>>>>    Extend A CMP 0 ? A : -A into (type) A CMP 0 ? A : -A.
> >>>>>>>>>>
> >>>>>>>>>> gcc/testsuite/ChangeLog:
> >>>>>>>>>>
> >>>>>>>>>>    * g++.dg/absvect.C: New test.
> >>>>>>>>>>    * gcc.dg/tree-ssa/absfloat16.c: New test.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Note I think we should be able to merge VEC_COND_EXPR and COND_EXPR
> >>>>>>>>>> codes by instead looking whether the condition is vector or 
> >>>>>>>>>> scalar.  Fallout
> >>>>>>>>>> might be a bit too big, but it might be a thing to try.
> >>>>>>>>>>
> >>>>>>>>>> Richard.
> >>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> Andrew Pinski
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> Kugan
> >>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> +      (absu:type @1)
> >>>>>>>>>>>>>>> +      (abs @1)))))
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I think this should use @2 now.
> >>>>>>>>>>>>>> I will change this.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>> Kugan
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks.
> >>>>>>>>>>>>>>>> Kugan
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> (like what is in gcc.dg/c11-floatn-3.c and others).
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Other than that it looks good but I can't approve it.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>> Andrew Pinski
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Signed-off-by: Kugan Vivekanandarajah 
> >>>>>>>>>>>>>>>>>> <kvivekana...@nvidia.com>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Bootstrapped and regression test on aarch64-linux-gnu. Is 
> >>>>>>>>>>>>>>>>>> this OK for trunk?
> >>>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>>> Kugan
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> ________________________________
> >>>>>>>>>>>>>>>>>> From: Andrew Pinski <pins...@gmail.com>
> >>>>>>>>>>>>>>>>>> Sent: Monday, 15 July 2024 5:30 AM
> >>>>>>>>>>>>>>>>>> To: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
> >>>>>>>>>>>>>>>>>> Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; 
> >>>>>>>>>>>>>>>>>> richard.guent...@gmail.com <richard.guent...@gmail.com>
> >>>>>>>>>>>>>>>>>> Subject: Re: [PATCH] MATCH: add abs support for half float
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> External email: Use caution opening links or attachments
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Sun, Jul 14, 2024 at 1:12 AM Kugan Vivekanandarajah
> >>>>>>>>>>>>>>>>>> <kvivekana...@nvidia.com> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> This patch extends abs detection in matched for half 
> >>>>>>>>>>>>>>>>>>> float.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Bootstrapped and regression test on aarch64-linux-gnu. Is 
> >>>>>>>>>>>>>>>>>>> this OK for trunk?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> This is basically this pattern:
> >>>>>>>>>>>>>>>>>> ```
> >>>>>>>>>>>>>>>>>> /* A >=/> 0 ? A : -A    same as abs (A) */
> >>>>>>>>>>>>>>>>>> (for cmp (ge gt)
> >>>>>>>>>>>>>>>>>> (simplify
> >>>>>>>>>>>>>>>>>> (cnd (cmp @0 zerop) @1 (negate @1))
> >>>>>>>>>>>>>>>>>> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
> >>>>>>>>>>>>>>>>>>     && !TYPE_UNSIGNED (TREE_TYPE(@0))
> >>>>>>>>>>>>>>>>>>     && bitwise_equal_p (@0, @1))
> >>>>>>>>>>>>>>>>>> (if (TYPE_UNSIGNED (type))
> >>>>>>>>>>>>>>>>>>  (absu:type @0)
> >>>>>>>>>>>>>>>>>>  (abs @0)))))
> >>>>>>>>>>>>>>>>>> ```
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> except extended to handle an optional convert. Why didn't 
> >>>>>>>>>>>>>>>>>> you just
> >>>>>>>>>>>>>>>>>> extend the above pattern to handle the convert instead? 
> >>>>>>>>>>>>>>>>>> Also I think
> >>>>>>>>>>>>>>>>>> you have an issue with unsigned types with the comparison.
> >>>>>>>>>>>>>>>>>> Also you should extend the -abs(A) pattern right below it 
> >>>>>>>>>>>>>>>>>> in a similar fashion.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>>> Andrew Pinski
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> gcc/ChangeLog:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> * match.pd: Add pattern to convert (type)A >=/> 0 ? A : 
> >>>>>>>>>>>>>>>>>>> -A into abs (A) for half float.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> gcc/testsuite/ChangeLog:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> * gcc.dg/tree-ssa/absfloat16.c: New test.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Signed-off-by: Kugan Vivekanandarajah 
> >>>>>>>>>>>>>>>>>>> <kvivekana...@nvidia.com>
> >>
> >>
>

Re: [PATCH] MATCH: add abs support for half float

Reply via email to