Thanks, I have made the changes to the patch. Also can someone please apply it for me. I do not have commit access.
2017-10-10 Sudakshina Das <sudi....@arm.com> PR middle-end/80131 * match.pd: Simplify 1 << (C - x) where C = precision (x) - 1. 2017-10-10 Sudakshina Das <sudi....@arm.com> PR middle-end/80131 * testsuite/gcc.dg/pr80131-1.c: New Test. With regards to the existing missed optimizations needed to the x86 RTL expansion, I think the discussions can take place on the bug report that I created and maybe someone will pick it up. Thanks Sudi From: Wilco Dijkstra Sent: Monday, October 9, 2017 2:02 PM To: Richard Biener; Sudi Das Cc: Jakub Jelinek; GCC Patches; nd; Richard Earnshaw; James Greenhalgh Subject: Re: [PATCH][GCC] Simplification of 1U << (31 - x) Richard Biener wrote: > I think the patch is ok with these changes but obviously we should try > to address > the code-generation issue on x86 at RTL expansion time. They are sort-of > existing missing optimizations. Note the only x64 specific issue is the backend expansion of 64-bit immediates which could be improved like I suggested. However what we're really missing is a generic optimization pass that tries to simplify immediates using accurate target costs. In eg. x >= C or x > C we can use either C or C-1 - on many targets one option may be a single instruction, while the other might take 2 or even 3. Wilco
diff --git a/gcc/match.pd b/gcc/match.pd index e58a65a..7a25a1b 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -600,6 +600,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) && tree_nop_conversion_p (type, TREE_TYPE (@1))) (lshift @0 @2))) +/* Fold (1 << (C - x)) where C = precision(type) - 1 + into ((1 << C) >> x). */ +(simplify + (lshift integer_onep@0 (minus@1 INTEGER_CST@2 @3)) + (if (INTEGRAL_TYPE_P (type) + && wi::eq_p (@2, TYPE_PRECISION (type) - 1) + && single_use (@1)) + (if (TYPE_UNSIGNED (type)) + (rshift (lshift @0 @2) @3) + (with + { tree utype = unsigned_type_for (type); } + (convert (rshift (lshift (convert:utype @0) @2) @3)))))) + /* Fold (C1/X)*C2 into (C1*C2)/X. */ (simplify (mult (rdiv@3 REAL_CST@0 @1) REAL_CST@2) diff --git a/gcc/testsuite/gcc.dg/pr80131-1.c b/gcc/testsuite/gcc.dg/pr80131-1.c new file mode 100644 index 0000000..0bfe1f4 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr80131-1.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target int32plus } */ +/* { dg-options "-fdump-tree-gimple" } */ + +/* Checks the simplification of: + 1 << (C - x) to (1 << C) >> x, where C = precision (type) - 1 + f1 is not simplified but f2, f3 and f4 are. */ + +__INT64_TYPE__ f1 (__INT64_TYPE__ i) +{ + return (__INT64_TYPE__)1 << (31 - i); +} + +__INT64_TYPE__ f2 (__INT64_TYPE__ i) +{ + return (__INT64_TYPE__)1 << (63 - i); +} + +__UINT64_TYPE__ f3 (__INT64_TYPE__ i) +{ + return (__UINT64_TYPE__)1 << (63 - i); +} + +__INT32_TYPE__ f4 (__INT32_TYPE__ i) +{ + return (__INT32_TYPE__)1 << (31 - i); +} + +/* { dg-final { scan-tree-dump-times "= 31 -" 1 "gimple" } } */ +/* { dg-final { scan-tree-dump-times "9223372036854775808 >>" 2 "gimple" } } */ +/* { dg-final { scan-tree-dump "2147483648 >>" "gimple" } } */