Hi,

Generally we don't fold (long)(A-B) into (long)A - (long)B because the result has more operations. On the other hand, the fold is desirable in places where we want to expose as many canonicalization opportunities as possible, and tree affine is definitely such a place. This patch adds support for it in tree_to_aff_combination, so that it can produce canonical affine combinations rather than unsimplified expressions like "&arr + (sizetype) (t_4(D) + t_4(D)) * 4 - (sizetype)t_4(D) * 8". The fold is only done for PLUS_EXPR/MINUS_EXPR under a widening integer conversion whose inner type has undefined overflow.
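To illustrate the idea outside of GCC, here is a small standalone C sketch. It is not part of the patch: the function names are made up for this mail, size_t stands in for sizetype, and it assumes unsigned int is narrower than unsigned long long. It shows that the two forms of the offset agree as long as the narrow addition does not overflow, and why a wrapping inner type has to be excluded.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helpers for illustration only; none of these exist in GCC.  */

/* Offset as affine sees it today: the sum is formed in the narrow type
   and only then widened, so t + t stays an opaque element.  */
size_t
offset_unfolded (int t)
{
  /* Precondition: t + t must not overflow int.  */
  assert (t <= INT_MAX / 2 && t >= INT_MIN / 2);
  return (size_t) (t + t) * 4 - (size_t) t * 8;
}

/* Offset after distributing the conversion over the addition.  Every
   term is now a multiple of (size_t) t, so the terms cancel.  */
size_t
offset_folded (int t)
{
  return ((size_t) t + (size_t) t) * 4 - (size_t) t * 8;
}

/* Signed inner type: widening distributes over the subtraction as long
   as a - b does not overflow int (the caller must ensure that).  */
int
widening_distributes_signed (int a, int b)
{
  return (long long) (a - b) == (long long) a - (long long) b;
}

/* Unsigned inner type: a - b may wrap in the narrow type, so the two
   forms can differ, e.g. for a = 0, b = 1.  */
int
widening_distributes_unsigned (unsigned a, unsigned b)
{
  return (unsigned long long) (a - b)
	 == (unsigned long long) a - (unsigned long long) b;
}
```

Both offset functions return 0 for any t in range, which is the cancellation the affine combination should be able to see. The unsigned variant returns 0 for (0, 1), which is why the patch checks TYPE_OVERFLOW_UNDEFINED on the inner type.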

Bootstrapped and tested on x86_64 and aarch64 along with the other patches. Is it OK?

2015-08-31  Bin Cheng  <bin.ch...@arm.com>

	* tree-affine.c (tree_to_aff_combination): Try to fold
	(long)(A-B) by adding CASE_CONVERT support.

Index: gcc/tree-affine.c
===================================================================
--- gcc/tree-affine.c	(revision 227163)
+++ gcc/tree-affine.c	(working copy)
@@ -377,6 +377,37 @@ tree_to_aff_combination (tree expr, tree type, aff
       aff_combination_add (comb, &tmp);
       return;
 
+    CASE_CONVERT:
+      {
+        tree outer_type = TREE_TYPE (expr);
+        tree inner = TREE_OPERAND (expr, 0);
+        tree inner_type = TREE_TYPE (inner);
+
+        /* Fold will not canonicalize (long)(A-B) to (long)A - (long)B
+           because it has more operations.  We perform this fold in
+           affine to explore more canonicalization opportunities.  */
+        if ((TREE_CODE (inner) == PLUS_EXPR
+             || TREE_CODE (inner) == MINUS_EXPR)
+            && TREE_CODE (inner_type) == INTEGER_TYPE
+            && TREE_CODE (outer_type) == INTEGER_TYPE
+            && TYPE_PRECISION (outer_type) > TYPE_PRECISION (inner_type)
+            && TYPE_OVERFLOW_UNDEFINED (inner_type))
+          {
+            tree op0 = TREE_OPERAND (inner, 0);
+            tree op1 = TREE_OPERAND (inner, 1);
+
+            tree_to_aff_combination (fold_build2 (TREE_CODE (inner),
+                                                  outer_type,
+                                                  fold_convert (outer_type,
+                                                                op0),
+                                                  fold_convert (outer_type,
+                                                                op1)),
+                                     type, comb);
+            return;
+          }
+      }
+      break;
+
     default:
       break;
     }