Thanks for the detailed review. I have uploaded patch v2 based on the review.
v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590604.html Changes since v1: 1. Add patterns for the cases CST / (T)x and (T)x / CST as well. Fix test regressions caused by those patterns. 2. Support signed integers except for the possible INT_MIN / -1 case. 3. Support cases where X and Y have different precisions. On Tue, 22 Feb 2022 at 15:54, Richard Biener <richard.guent...@gmail.com> wrote: > +/* (type)X / (type)Y -> (type)(X / Y) > + when the resulting type is at least precise as the original types > + and when all the types are unsigned integral types. */ > +(simplify > + (trunc_div (convert @0) (convert @1)) > + (if (INTEGRAL_TYPE_P (type) > + && INTEGRAL_TYPE_P (TREE_TYPE (@0)) > + && INTEGRAL_TYPE_P (TREE_TYPE (@1)) > + && TYPE_UNSIGNED (type) > + && TYPE_UNSIGNED (TREE_TYPE (@0)) > + && TYPE_UNSIGNED (TREE_TYPE (@1)) > + && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE (@1)) > + && TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@0))) > + (convert (trunc_div @0 @1)))) > > since you are requiring the types of @0 and @1 to match it's easier to write > > && types_match (TREE_TYPE(@0), TREE_TYPE (@1)) > Thanks. In the new patch, TREE_TYPE (@0) and TREE_TYPE (@1) can now have different precisions, so I believe that I can't use types_match() anymore. > that allows you to elide checks on either @0 or @1. I suppose the transform > does not work for signed types because of the -INT_MIN / -1 overflow case? > It might be possible to use expr_not_equal_to (@0, -INT_MIN) || > expr_not_equal_to (@1, -1) > (correctly written, lookup the existing examples in match.pd for the X > % -Y transform) That's right. I have made changes to support signed types except for the INT_MIN / -1 case. > I'll note that as written the transform will not catch CST / (T)x or > (T)x / CST since > you'll not see conversions around constants. I'm not sure whether > using (convert[12]? ...) > or writing special patterns with INTEGER_CST operands is more convenient. > There is int_fits_type_p to check whether a constant will fit in a > type without truncation > or sign change. I see. I couldn't find an easy way to use (convert[12]? ...) in this case so I added 2 other patterns for the CST / (T)x and (T)x / CST cases. > When @0 and @1 do not have the same type there might still be a common type > that can be used and is smaller than 'type', it might be as simple as using > build_nonstandard_integer_type (MIN (@0-prec, @1-prec), 1 /*unsigned_p*/). Thanks for the tip. I've made a similar change, but using either TREE_TYPE (@0) or TREE_TYPE (@1) instead of build_nonstandard_integer_type(). > > In the past there have been attempts to more globally narrow operations using > a new pass rather than using individual patterns. So for more complicated > cases > that might be the way to go. There's now also the ISEL pass which does > pre-RTL expansion transforms that need some global context and for example > can look at SSA uses. I had wanted to do something similar to handle all integral trunc_divs, but I'm not sure where/how to start. It seems out of my league at this moment. I'll gladly explore it after this change though!
From f5f768b55f6cadcd9eba459561abfc1d7e28f94e Mon Sep 17 00:00:00 2001 From: Zhao Wei Liew <zhaoweil...@gmail.com> Date: Sat, 19 Feb 2022 16:28:38 +0800 Subject: [PATCH] tree-optimization: [PR103855] Fold (type)X / (type)Y This pattern converts (trunc_div (convert X) (convert Y)) to (convert (trunc_div X Y)) when: 1. type, X, and Y are all INTEGRAL_TYPES with the same signedness. 2. type has TYPE_PRECISION at least as large as those of X and Y. 3. the result of the division is guaranteed to fit in either the type of Y or the type of Y. This patch also adds corresponding patterns for CST / (type)Y and (type)X / CST. These patterns useful as wider divisions are typically more expensive. Bootstrapped and regression tested on x86_64-pc-linux-gnu. Signed-off-by: Zhao Wei Liew <zhaoweil...@gmail.com> PR tree-optimization/103855 gcc/ChangeLog: * match.pd: Add patterns for (type)X / (type)Y, CST / (type)Y, and (type)X / CST. gcc/testsuite/ChangeLog: * gcc.dg/ipa/pr91088.c: Fix a test regression. * gcc.dg/tree-ssa/divide-8.c: New test. --- gcc/match.pd | 57 ++++++++++++++++++++++++ gcc/testsuite/gcc.dg/ipa/pr91088.c | 4 +- gcc/testsuite/gcc.dg/tree-ssa/divide-8.c | 45 +++++++++++++++++++ 3 files changed, 104 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/divide-8.c diff --git a/gcc/match.pd b/gcc/match.pd index 8b44b5cc92c..b0bfb94f506 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -687,6 +687,63 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type)) (convert (trunc_mod @0 @1)))) +/* (type)X / (type)Y -> (type)(X / Y) + when all types are integral and have the same sign and + the resulting type is at least precise as the original types, */ +(simplify + (trunc_div (convert @0) (convert @1)) + (if (INTEGRAL_TYPE_P (type) + && INTEGRAL_TYPE_P (TREE_TYPE (@0)) + && INTEGRAL_TYPE_P (TREE_TYPE (@1)) + && TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@0)) + && TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@1)) + && TYPE_UNSIGNED (type) == TYPE_UNSIGNED (TREE_TYPE (@0)) + && TYPE_UNSIGNED (type) == TYPE_UNSIGNED (TREE_TYPE (@1))) + (with + { + tree stype = TYPE_PRECISION (TREE_TYPE (@0)) > TYPE_PRECISION (TREE_TYPE (@1)) + ? TREE_TYPE (@0) : TREE_TYPE (@1); + } + (if (TYPE_UNSIGNED (type) + /* Avoid this transformation if X might be INT_MIN of the larger type or + Y might be -1 because the result might overflow. */ + || TYPE_PRECISION (TREE_TYPE (@0)) < TYPE_PRECISION (stype) + || expr_not_equal_to (@0, wi::to_wide (TYPE_MIN_VALUE (TREE_TYPE (@0)))) + || expr_not_equal_to (@1, wi::minus_one (TYPE_PRECISION (TREE_TYPE (@1))))) + (convert (trunc_div (convert:stype @0) (convert:stype @1))))))) + +/* (type)X / CST -> (type)(X / CST) */ +(simplify + (trunc_div (convert @0) INTEGER_CST@1) + (if (INTEGRAL_TYPE_P (type) + && INTEGRAL_TYPE_P (TREE_TYPE (@0)) + && TYPE_UNSIGNED (TREE_TYPE (@0)) == TYPE_UNSIGNED (type) + && int_fits_type_p (@1, TREE_TYPE (@0)) + && TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@0)) + && (TYPE_UNSIGNED (type) + /* Avoid this transformation if X might be MIN_VALUE or + Y might be -1 because the result might overflow. */ + || expr_not_equal_to (@0, wi::to_wide (TYPE_MIN_VALUE (TREE_TYPE (@0)))) + || wi::to_wide (@1) != -1)) + (with { tree stype = TREE_TYPE (@0); } + (convert (trunc_div @0 (convert:stype @1)))))) + +/* CST / (type)Y -> (type)(CST / Y) */ +(simplify + (trunc_div INTEGER_CST@0 (convert @1)) + (if (INTEGRAL_TYPE_P (type) + && INTEGRAL_TYPE_P (TREE_TYPE (@1)) + && TYPE_UNSIGNED (TREE_TYPE (@1)) == TYPE_UNSIGNED (type) + && int_fits_type_p (@0, TREE_TYPE (@1)) + && TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@1)) + && (TYPE_UNSIGNED (type) + /* Avoid this transformation if X might be MIN_VALUE or + Y might be -1 because the result might overflow. */ + || !tree_int_cst_equal (TYPE_MIN_VALUE (TREE_TYPE (@1)), @0) + || expr_not_equal_to (@1, wi::minus_one (TYPE_PRECISION (TREE_TYPE (@1)))))) + (with { tree stype = TREE_TYPE (@1); } + (convert (trunc_div (convert:stype @0) @1))))) + /* x * (1 + y / x) - y -> x - y % x */ (simplify (minus (mult:cs @0 (plus:s (trunc_div:s @1 @0) integer_onep)) @1) diff --git a/gcc/testsuite/gcc.dg/ipa/pr91088.c b/gcc/testsuite/gcc.dg/ipa/pr91088.c index cc146a88134..d7332974e37 100644 --- a/gcc/testsuite/gcc.dg/ipa/pr91088.c +++ b/gcc/testsuite/gcc.dg/ipa/pr91088.c @@ -60,7 +60,7 @@ int callee1 (struct A a) if ((a.f2 + 7) & 17) foo (); - if ((1300 / (short)a.f3) == 19) + if ((1300 / (a.f3 & 0xffff)) == 19) large_code; return 1; @@ -115,6 +115,6 @@ int caller () /* { dg-final { scan-ipa-dump-times "Creating a specialized node of callee1" 1 "cp" } } */ /* { dg-final { scan-ipa-dump-times "Creating a specialized node of callee2" 1 "cp" } } */ /* { dg-final { scan-ipa-dump-times "Creating a specialized node of callee3" 1 "cp" } } */ -/* { dg-final { scan-ipa-dump "op0\\\[offset: 32],\\(\\(short int\\) #\\),\\(\\(int\\) #\\),\\(1300 / #\\) == 19" "cp" } } */ +/* { dg-final { scan-ipa-dump "op0\\\[offset: 32],\\(# & 65535\\),\\(1300 / #\\) == 19" "cp" } } */ /* { dg-final { scan-ipa-dump "op0\\\[ref offset: 0],\\(# \\^ 1\\) <" "cp" } } */ /* { dg-final { scan-ipa-dump "op0,\\(# & 255\\),\\(1 - #\\),\\(# \\* 3\\),\\(27 % #\\) <" "cp" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/divide-8.c b/gcc/testsuite/gcc.dg/tree-ssa/divide-8.c new file mode 100644 index 00000000000..e389e85a04f --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/divide-8.c @@ -0,0 +1,45 @@ +/* PR tree-optimization/103855 */ +/* { dg-options "-O -fdump-tree-optimized" } */ + +unsigned int f1(unsigned int a, unsigned int b) { + unsigned long long all = a; + return all / b; +} + +unsigned int f2(unsigned char a, unsigned int b) { + unsigned long long all = a; + return all / b; +} + +unsigned int f3(unsigned int a, unsigned char b) { + unsigned long long all = a; + return all / b; +} + +int f4(char a, int b) { + long long all = a; + return all / b; +} + +unsigned int f5(unsigned int a) { + unsigned long long all = a; + return all / 123456789; +} + +unsigned int f6(unsigned int b) { + unsigned long long bll = b; + return 123456789 / bll; +} + +int f7(int a) { + long long all = a; + return all / 123456789; +} + +int f8(int b) { + long long bll = b; + return 123456789 / bll; +} + +/* { dg-final { scan-tree-dump-not "\\\(long long unsigned int\\\)" "optimized" } } */ +/* { dg-final { scan-tree-dump-not "\\\(long long int\\\)" "optimized" } } */ -- 2.25.1