On Fri, Aug 2, 2024 at 11:20 AM Kugan Vivekanandarajah <kvivekana...@nvidia.com> wrote: > > > > > On 1 Aug 2024, at 10:46 pm, Richard Biener <richard.guent...@gmail.com> > > wrote: > > > > External email: Use caution opening links or attachments > > > > > > On Thu, Aug 1, 2024 at 5:31 AM Kugan Vivekanandarajah > > <kvivekana...@nvidia.com> wrote: > >> > >> > >> On Mon, Jul 29, 2024 at 10:11 AM Andrew Pinski <pins...@gmail.com> wrote: > >>> > >>> On Mon, Jul 29, 2024 at 12:57 AM Kugan Vivekanandarajah > >>> <kugan...@gmail.com> wrote: > >>>> > >>>> On Thu, Jul 25, 2024 at 10:19 PM Richard Biener > >>>> <richard.guent...@gmail.com> wrote: > >>>>> > >>>>> On Thu, Jul 25, 2024 at 4:42 AM Kugan Vivekanandarajah > >>>>> <kugan...@gmail.com> wrote: > >>>>>> > >>>>>> On Tue, Jul 23, 2024 at 11:56 PM Richard Biener > >>>>>> <richard.guent...@gmail.com> wrote: > >>>>>>> > >>>>>>> On Tue, Jul 23, 2024 at 10:27 AM Kugan Vivekanandarajah > >>>>>>> <kugan...@gmail.com> wrote: > >>>>>>>> > >>>>>>>> On Tue, Jul 23, 2024 at 10:35 AM Andrew Pinski <pins...@gmail.com> > >>>>>>>> wrote: > >>>>>>>>> > >>>>>>>>> On Mon, Jul 22, 2024 at 5:26 PM Kugan Vivekanandarajah > >>>>>>>>> <kvivekana...@nvidia.com> wrote: > >>>>>>>>>> > >>>>>>>>>> Revised based on the comment and moved it into existing patterns > >>>>>>>>>> as. > >>>>>>>>>> > >>>>>>>>>> gcc/ChangeLog: > >>>>>>>>>> > >>>>>>>>>> * match.pd: Extend A CMP 0 ? A : -A into (type)A CMP 0 ? A : -A. > >>>>>>>>>> Extend A CMP 0 ? A : -A into (type) A CMP 0 ? A : -A. > >>>>>>>>>> > >>>>>>>>>> gcc/testsuite/ChangeLog: > >>>>>>>>>> > >>>>>>>>>> * gcc.dg/tree-ssa/absfloat16.c: New test. > >>>>>>>>> > >>>>>>>>> The testcase needs to make sure it runs only for targets that > >>>>>>>>> support > >>>>>>>>> float16 so like: > >>>>>>>>> > >>>>>>>>> /* { dg-require-effective-target float16 } */ > >>>>>>>>> /* { dg-add-options float16 } */ > >>>>>>>> Added in the attached version. > >>>>>>> > >>>>>>> + /* (type)A >=/> 0 ? A : -A same as abs (A) */ > >>>>>>> (for cmp (ge gt) > >>>>>>> (simplify > >>>>>>> - (cnd (cmp @0 zerop) @1 (negate @1)) > >>>>>>> - (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0)) > >>>>>>> - && !TYPE_UNSIGNED (TREE_TYPE(@0)) > >>>>>>> - && bitwise_equal_p (@0, @1)) > >>>>>>> + (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2)) > >>>>>>> + (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1)) > >>>>>>> + && !TYPE_UNSIGNED (TREE_TYPE (@1)) > >>>>>>> + && ((VECTOR_TYPE_P (type) > >>>>>>> + && tree_nop_conversion_p (TREE_TYPE (@0), TREE_TYPE > >>>>>>> (@1))) > >>>>>>> + || (!VECTOR_TYPE_P (type) > >>>>>>> + && (TYPE_PRECISION (TREE_TYPE (@1)) > >>>>>>> + <= TYPE_PRECISION (TREE_TYPE (@0))))) > >>>>>>> + && bitwise_equal_p (@1, @2)) > >>>>>>> > >>>>>>> I wonder about the bitwise_equal_p which tests @1 against @2 now > >>>>>>> with the convert still applied to @1 - that looks odd. You are > >>>>>>> allowing > >>>>>>> sign-changing conversions but doesn't that change ge/gt behavior? > >>>>>>> Also why are sign/zero-extensions not OK for vector types? > >>>>>> Thanks for the review. > >>>>>> My main motivation here is for _Float16 as below. > >>>>>> > >>>>>> _Float16 absfloat16 (_Float16 x) > >>>>>> { > >>>>>> float _1; > >>>>>> _Float16 _2; > >>>>>> _Float16 _4; > >>>>>> <bb 2> [local count: 1073741824]: > >>>>>> _1 = (float) x_3(D); > >>>>>> if (_1 < 0.0) > >>>>>> goto <bb 3>; [41.00%] > >>>>>> else > >>>>>> goto <bb 4>; [59.00%] > >>>>>> <bb 3> [local count: 440234144]:\ > >>>>>> _4 = -x_3(D); > >>>>>> <bb 4> [local count: 1073741824]: > >>>>>> # _2 = PHI <_4(3), x_3(D)(2)> > >>>>>> return _2; > >>>>>> } > >>>>>> > >>>>>> This is why I added bitwise_equal_p test of @1 against @2 with > >>>>>> TYPE_PRECISION checks. > >>>>>> I agree that I will have to check for sign-changing conversions. > >>>>>> > >>>>>> Just to keep it simple, I disallowed vector types. I am not sure if > >>>>>> this would hit vec types. I am happy to handle this if that is > >>>>>> needed. > >>>>> > >>>>> I think with __builtin_convertvector you should be able to construct > >>>>> a testcase that does > >>>> Thanks. > >>>> > >>>> For the pattern, > >>>> ``` > >>>> /* A >=/> 0 ? A : -A same as abs (A) */ > >>>> (for cmp (ge gt) > >>>> (simplify > >>>> (cnd (cmp @0 zerop) @1 (negate @1)) > >>>> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0)) > >>>> && !TYPE_UNSIGNED (TREE_TYPE(@0)) > >>>> && bitwise_equal_p (@0, @1)) > >>>> (if (TYPE_UNSIGNED (type)) > >>>> (absu:type @0) > >>>> (abs @0))))) > >>>> ``` > >>>> the vector type doesn't seem right. For example, if we have a 4 > >>>> element vector with some negative and positive, I don't think it > >>>> makes sense. Also, we dont seem to generate (cmp @0 zerop). Am I > >>>> missing it completely? > >>> > >>> Looks like I missed adding some vector testcases anyways here is one > >>> to get this, note it is C++ due to the C front-end not support `?:` > >>> for vectors yet (there is a patch). > >>> ``` > >>> #define vect8 __attribute__((vector_size(8))) > >>> vect8 int f(vect8 int a) > >>> { > >>> vect8 int na = -a; > >>> return (a > 0) ? a : na; > >>> } > >>> ``` > >>> At -O2 before forwprop1, we have: > >>> ``` > >>> vector(2) intD.9 a_2(D) = aD.2796; > >>> vector(2) intD.9 naD.2799; > >>> vector(2) <signed-boolean:32> _1; > >>> vector(2) intD.9 _4; > >>> > >>> na_3 = -a_2(D); > >>> _1 = a_2(D) > { 0, 0 }; > >>> _4 = VEC_COND_EXPR <_1, a_2(D), na_3>; > >>> ``` > >>> And forwprop using match is able to do: > >>> ``` > >>> Applying pattern match.pd:6306, gimple-match-10.cc:19843 > >>> gimple_simplified to _4 = ABS_EXPR <a_2(D)>; > >>> Removing dead stmt:_1 = a_2(D) > { 0, 0 }; > >>> Removing dead stmt:na_3 = -a_2(D); > >>> ``` > >>> (replace int with float and add -fno-signed-zeros you can get the ABS > >>> also). > >>> > >>> Note comparisons with vector types always generate a vector boolean > >>> type. So cond_expr will never show up with a vector comparison; only > >>> vec_cond. > >> > >> > >> Thanks for the information. I have revised the patch by adding test case > >> for the vector type. I will look at removing the VEC_COND_EXPR as a follow > >> up. > > > > This is likely a major task so do not get you distracted - I just mentioned > > it. > > > >> This is still rejected by the type checker. > >> v2hi absvect2 (v2hi x, int i) { > >> v2hi neg = -x; > >> v2si tmp = __builtin_convertvector (x, v2si); > >> return (tmp > 0) ? x : neg; > >> } > >> as: > >> error: incompatible vector types in conditional expression: '__vector(2) > >> int', 'v2hi' {aka '__vector(2) short int'} and 'v2hi' {aka '__vector(2) > >> short int'} > >> 8 | return (tmp > 0) ? x : neg; > >> | ~~~~~~~~~~^~~~~~~~~ > >> > >> Also > >> > >> - (absu:type @0) > >> - (abs @0))))) > >> + (absu:type @1) > >> + (abs @1))))) > >> > >> Changing the @1 to @2 is resulting in failures. Existing tests like the > >> following would be optimized away in this case. > >> ``` > >> unsigned abs_with_convert0 (int x) > >> { > >> unsigned int y = x; > >> > >> if (x < 0) > >> y = -y; > >> > >> return y; > >> } > > > > How so - the TYPE_UNSIGNED tests should make the pattern not trigger here? > > > >> ``` > >> This follows the same intent as the original pattern. > >> > >> Bootstrapped and regression tested on aarch64-linux-gnu. Is this OK for > >> trunk. > > > > > > > > - (cnd (cmp @0 zerop) @1 (negate @1)) > > - (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0)) > > - && !TYPE_UNSIGNED (TREE_TYPE(@0)) > > - && bitwise_equal_p (@0, @1)) > > + (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2)) > > + (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1)) > > + && !TYPE_UNSIGNED (TREE_TYPE (@1)) > > + && !TYPE_UNSIGNED (TREE_TYPE (@0)) > > > > I think the outer type could be allowed signed if ... > > > > + && ((VECTOR_TYPE_P (TREE_TYPE (@0)) > > + && VECTOR_TYPE_P (TREE_TYPE (@1)) > > + && known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0)), > > + TYPE_VECTOR_SUBPARTS (TREE_TYPE (@1))) > > > > this is always true > > > > + && element_precision (TREE_TYPE (@1)) > > + <= element_precision (TREE_TYPE (@0))) > > + || (!VECTOR_TYPE_P (TREE_TYPE (@0)) > > + && (TYPE_PRECISION (TREE_TYPE (@1)) > > + <= TYPE_PRECISION (TREE_TYPE (@0))))) > > > > ... you make the precision strictly larger. With both unsigned the same > > precision case should have been stripped anyway. > > > > You can use element_precision for both vector and non-vector so I think this > > should simplify to just checking element_precision. > > > > + (absu:type @1) > > + (abs @1))))) > > > > I still think this needs to be @2 for type correctness. @1 is only > > bitwise equal, > > it could have different sign. I think we should drop bitwise_equal_p with > > the > > convert now in and instead have a matching constraint. > > Thanks, so this should translate to: > > /* (type)A >=/> 0 ? A : -A same as abs (A) */ > (for cmp (ge gt) > (simplify > (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2)) > (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1)) > && !TYPE_UNSIGNED (TREE_TYPE (@1))
Oh, I mis-read this as unsigned, so this makes the conversions to always sign-extend which means it's important to ensure the type of @0 is also signed. > && !TYPE_UNSIGNED (TREE_TYPE (@2)) > && ((element_precision (TREE_TYPE (@1)) > < element_precision (TREE_TYPE (@0)) > || operand_equal_p (@1, @0))) which means <= might be better then. > && bitwise_equal_p (@1, @2)) > (if (TYPE_UNSIGNED (type)) > (absu:type @2) > (abs @2))))) > > However with this, I am seeing: Note that the @2 for these are unsigned. I see, so we rely on @1 being the signed equivalent of @2 which might or might not be signed. I guess that indeed we want @2 here. > Tests that now fail, but worked before (4 tests): > > gcc: gcc.dg/tree-ssa/phi-opt-36.c scan-tree-dump-not phiopt2 "if " > gcc: gcc.dg/tree-ssa/phi-opt-36.c scan-tree-dump-times phiopt1 "if " 2 > gcc: gcc.dg/tree-ssa/phi-opt-37.c scan-tree-dump-not phiopt1 "if "gcc: > gcc.dg/tree-ssa/phi-opt-37.c scan-tree-dump-times phiopt1 "ABSU_EXPR <" 2 > > > unsigned f2(int A) > { > unsigned t = A; > // A >= 0? A : -A same as abs (A) > if (A >= 0) return t; > return -t; > } > > unsigned abs_with_convert0 (int x) > { > unsigned int y = x; > > if (x < 0) > y = -y; > > return y; > } > unsigned abs_with_convert1 (unsigned x) > { > int y = x; > > if (y < 0) > x = -x; > > return x; > } > > In the original pattern, we have check for !TYPE_UNSIGNED (TREE_TYPE(@0)) > and (absu:type @0) > > Shouldnt that translate !TYPE_UNSIGNED (TREE_TYPE(@1)) and (absu:type @1) in > the new pattern > > Thanks, > Kugan > > > > > Same comments apply to the 2nd pattern update. > > > > Richard. > > > >> Thanks, > >> Kugan > >> > >> gcc/ChangeLog: > >> > >> * match.pd: Extend A CMP 0 ? A : -A into (type)A CMP 0 ? A : -A. > >> Extend A CMP 0 ? A : -A into (type) A CMP 0 ? A : -A. > >> > >> gcc/testsuite/ChangeLog: > >> > >> * g++.dg/absvect.C: New test. > >> * gcc.dg/tree-ssa/absfloat16.c: New test. > >> > >> > >> signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com> > >> > >> > >> Note I think we should be able to merge VEC_COND_EXPR and COND_EXPR > >> codes by instead looking whether the condition is vector or scalar. > >> Fallout > >> might be a bit too big, but it might be a thing to try. > >> > >> Richard. > >> > >>> Thanks, > >>> Andrew Pinski > >>> > >>>> > >>>> Thanks, > >>>> Kugan > >>>> > >>>>> > >>>>>> > >>>>>>> > >>>>>>> + (absu:type @1) > >>>>>>> + (abs @1))))) > >>>>>>> > >>>>>>> I think this should use @2 now. > >>>>>> I will change this. > >>>>>> > >>>>>> Thanks, > >>>>>> Kugan > >>>>>> > >>>>>>> > >>>>>>>> Thanks. > >>>>>>>> Kugan > >>>>>>>>> > >>>>>>>>> (like what is in gcc.dg/c11-floatn-3.c and others). > >>>>>>>>> > >>>>>>>>> Other than that it looks good but I can't approve it. > >>>>>>>>> > >>>>>>>>> Thanks, > >>>>>>>>> Andrew Pinski > >>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com> > >>>>>>>>>> > >>>>>>>>>> Bootstrapped and regression test on aarch64-linux-gnu. Is this OK > >>>>>>>>>> for trunk? > >>>>>>>>>> Thanks, > >>>>>>>>>> Kugan > >>>>>>>>>> > >>>>>>>>>> ________________________________ > >>>>>>>>>> From: Andrew Pinski <pins...@gmail.com> > >>>>>>>>>> Sent: Monday, 15 July 2024 5:30 AM > >>>>>>>>>> To: Kugan Vivekanandarajah <kvivekana...@nvidia.com> > >>>>>>>>>> Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; > >>>>>>>>>> richard.guent...@gmail.com <richard.guent...@gmail.com> > >>>>>>>>>> Subject: Re: [PATCH] MATCH: add abs support for half float > >>>>>>>>>> > >>>>>>>>>> External email: Use caution opening links or attachments > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On Sun, Jul 14, 2024 at 1:12 AM Kugan Vivekanandarajah > >>>>>>>>>> <kvivekana...@nvidia.com> wrote: > >>>>>>>>>>> > >>>>>>>>>>> This patch extends abs detection in matched for half float. > >>>>>>>>>>> > >>>>>>>>>>> Bootstrapped and regression test on aarch64-linux-gnu. Is this OK > >>>>>>>>>>> for trunk? > >>>>>>>>>> > >>>>>>>>>> This is basically this pattern: > >>>>>>>>>> ``` > >>>>>>>>>> /* A >=/> 0 ? A : -A same as abs (A) */ > >>>>>>>>>> (for cmp (ge gt) > >>>>>>>>>> (simplify > >>>>>>>>>> (cnd (cmp @0 zerop) @1 (negate @1)) > >>>>>>>>>> (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0)) > >>>>>>>>>> && !TYPE_UNSIGNED (TREE_TYPE(@0)) > >>>>>>>>>> && bitwise_equal_p (@0, @1)) > >>>>>>>>>> (if (TYPE_UNSIGNED (type)) > >>>>>>>>>> (absu:type @0) > >>>>>>>>>> (abs @0))))) > >>>>>>>>>> ``` > >>>>>>>>>> > >>>>>>>>>> except extended to handle an optional convert. Why didn't you just > >>>>>>>>>> extend the above pattern to handle the convert instead? Also I > >>>>>>>>>> think > >>>>>>>>>> you have an issue with unsigned types with the comparison. > >>>>>>>>>> Also you should extend the -abs(A) pattern right below it in a > >>>>>>>>>> similar fashion. > >>>>>>>>>> > >>>>>>>>>> Thanks, > >>>>>>>>>> Andrew Pinski > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> gcc/ChangeLog: > >>>>>>>>>>> > >>>>>>>>>>> * match.pd: Add pattern to convert (type)A >=/> 0 ? A : -A into > >>>>>>>>>>> abs (A) for half float. > >>>>>>>>>>> > >>>>>>>>>>> gcc/testsuite/ChangeLog: > >>>>>>>>>>> > >>>>>>>>>>> * gcc.dg/tree-ssa/absfloat16.c: New test. > >>>>>>>>>>> > >>>>>>>>>>> Signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com> > >