On Thu, Oct 29, 2015 at 5:34 AM, Hurugalawadi, Naveen <naveen.hurugalaw...@caviumnetworks.com> wrote: > Hi, > > Please find attached the patch that moves some multiply optimizations > from fold-const using simplify and match. > > Please review the patch and let me know if any modifications are required. > > Tested the patch on X86. > > Observing following failures:- > >>> FAIL: gcc.dg/fold-plusmult.c scan-tree-dump-times original "<a> \\* 4" 2 > > Should the testcase be changed to suit current pattern? > >>> FAIL: gcc.dg/tree-ssa/vrp47.c scan-tree-dump-times vrp2 " & 1;" 0 >>> FAIL: gcc.dg/tree-ssa/vrp59.c scan-tree-dump-not vrp1 " & 3;" > > Its due to the following pattern. Pattern seems to be right. > Fold X & (X ^ Y) as X & ~Y > The test PASSes when we have the result as ~X & Y in some > of the following combinations which is wrong. > (bit_and (convert? (bit_xor:c @0 @1) (convert? @0) ))
Please do not drop A - B -> A + (-B) from fold-const as match.pd doesn't implement all of fold-const.c negate_expr_p support. +/* Fold X & (X ^ Y) as X & ~Y. */ +(simplify + (bit_and:c (convert? @0) (convert? (bit_xor:c @0 @1))) + (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) + (convert (bit_and @0 (bit_not @1))))) Ok, so the VRP regression is because we convert _8 = x_2(D) ^ 1; _4 = (_Bool) _8; _5 = (int) _4; to _7 = ~x_2(D); _9 = _7 & 1; via the new pattern and /* A truncation to an unsigned type (a zero-extension) should be canonicalized as bitwise and of a mask. */ (if (final_int && inter_int && inside_int && final_prec == inside_prec && final_prec > inter_prec && inter_unsignedp) (convert (bit_and @0 { wide_int_to_tree (inside_type, wi::mask (inter_prec, false, TYPE_PRECISION (inside_type))); }))) Previously VRP ended up with <bb 3>: _8 = x_2(D) ^ 1; and now we have _7 = ~x_2(D); _9 = _7 & 1; which is more expensive. This means that we miss a (bit_and (bit_not @0) INTEGER_CST@1) -> (bit_xor @0 @1) pattern that applies when VRP knows the range of x_2(D) (all masked bits are know to zero). +/* Convert X * -C into -X * C. */ +(simplify + (mult:c (convert? negate_expr_p@0) INTEGER_CST@1) + (if (tree_int_cst_sgn (@1) == -1) + (with { tree tem = const_unop (NEGATE_EXPR, type, @1); } + (if (!TREE_OVERFLOW (tem) && wi::ne_p (tem, @1) + && tree_nop_conversion_p (type, TREE_TYPE (@0))) + (mult (convert (negate @0)) @1))))) as said above match.pd negate_expr_p doesn't capture everything fold-const.c does so moving the above isn't a good idea. +/* Convert (A + A) * C -> A * 2 * C. */ +(simplify + (mult:c (convert? (plus @0 @0)) (convert? @1)) + (if (tree_nop_conversion_p (TREE_TYPE (@0), type)) + (convert (mult @0 (mult { build_int_cst (TREE_TYPE (@1), 2); } @1))))) +(simplify + (mult:c (convert? (plus @0 @0)) INTEGER_CST@1) + (if (tree_nop_conversion_p (TREE_TYPE (@0), type)) + (convert (mult @0 (mult { build_int_cst (TREE_TYPE (@1), 2); } @1))))) fold-const.c only handles constant C, so we only need to 2nd pattern. Also the :c on the mult in that is not needed due to canonicalization rules. Please build the result of the inner multiplication directly. I think the fold-const.c code has an overflow issue when the outer multiplication is signed and the inner addition unsigned. (signed)((unsigned)INT_MAX + (unsigned)INT_MAX)*2 is valid but INT_MAX * 4 is not as it overflows. So I think we should _not_ allow nop conversions here (it's fine if all ops are signed or unsigned). Richard. > Thanks, > Naveen > > ChangeLog > > 2015-10-29 Naveen H.S <naveen.hurugalaw...@caviumnetworks.com> > > * fold-const.c (fold_binary_loc) : Remove A - B -> A + (-B) if B > is easily negatable as its already present. > Move x * -C into -x * C if x is easily negatable to match.pd. > Move (A + A) * C -> A * 2 * C to match.pd. > Move Fold (X ^ Y) & Y as ~X & Y to match.pd. > Move Fold (X ^ Y) & X as ~Y & X to match.pd. > Move Fold X & (X ^ Y) as X & ~Y to match.pd. > Move Fold X & (Y ^ X) as ~Y & X to match.pd. > > * match.pd (bit_and:c (convert? @0) (convert? (bit_xor:c @0 @1))): > New simplifier. > (mult:c (convert? negate_expr_p@0) INTEGER_CST@1): New simplifier. > (mult:c (convert? (plus @0 @0)) (convert? @1)): New simplifier. > (mult:c (convert? (plus @0 @0)) INTEGER_CST@1): New simplifier.