On Tue, 27 Oct 2015, Kyrill Tkachov wrote:
Hi Marc,
On 30/08/15 08:57, Marc Glisse wrote:
Hello,
just trying to shrink fold-const.c a bit more.
initializer_zerop is close to what I was looking for with zerop, but I
wasn't sure if it would be safe (it accepts some CONSTRUCTOR and
STRING_CST). At some point I tried using sign_bit_p, but using the return
of that function in the simplification confused the machinery too much. I
added an "overload" of element_precision like the one in element_mode, for
convenience.
Bootstrap+testsuite on ppc64le-redhat-linux.
2015-08-31 Marc Glisse <marc.gli...@inria.fr>
gcc/
* tree.h (zerop): New function.
* tree.c (zerop): Likewise.
(element_precision): Handle expressions.
* match.pd (define_predicates): Add zerop.
(x <= +Inf): Fix comment.
(abs (x) == 0, A & C == C, A & C != 0): Converted from ...
* fold-const.c (fold_binary_loc): ... here. Remove.
gcc/testsuite/
* gcc.dg/tree-ssa/cmp-1.c: New file.
+/* If we have (A & C) != 0 where C is the sign bit of A, convert
+   this into A < 0.  Similarly for (A & C) == 0 into A >= 0.  */
+(for cmp (eq ne)
+     ncmp (ge lt)
+ (simplify
+  (cmp (bit_and (convert?@2 @0) integer_pow2p@1) integer_zerop)
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+       && (TYPE_PRECISION (TREE_TYPE (@0))
+           == GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (@0))))
+       && element_precision (@2) >= element_precision (@0)
+       && wi::only_sign_bit_p (@1, element_precision (@0)))
This condition is a bit strict when @0 is signed: for an int32_t i,
(i & (5LL << 42)) != 0 would work as well, thanks to sign extension.
+   (with { tree stype = signed_type_for (TREE_TYPE (@0)); }
+    (ncmp (convert:stype @0) { build_zero_cst (stype); })))))
+
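For readers following along, the equivalence the pattern relies on can be
checked at the source level. The following is a minimal C sketch, not part of
the patch: the function names are hypothetical, and it assumes a 32-bit
two's complement int32_t. It also covers the wider-mask case mentioned in the
inline comment, where a sign-extended operand is tested against a mask lying
entirely above bit 31.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: (A & C) != 0, with C the sign bit of A.
   The names here are illustrative only. */
int sign_via_mask(int32_t a)
{
    return ((uint32_t)a & UINT32_C(0x80000000)) != 0;
}

/* Form the pattern rewrites it to: A < 0. */
int sign_via_compare(int32_t a)
{
    return a < 0;
}

/* Case from the inline comment: i is sign-extended to 64 bits, so bits
   42 and 44 of the widened value are copies of the sign bit of i. */
int sign_via_wide_mask(int32_t i)
{
    return ((int64_t)i & (INT64_C(5) << 42)) != 0;
}

/* Returns 1 if all three forms agree on a few boundary values. */
int forms_agree(void)
{
    static const int32_t samples[] = {
        INT32_MIN, -2, -1, 0, 1, 2, INT32_MAX
    };
    for (unsigned k = 0; k < sizeof samples / sizeof samples[0]; k++) {
        int expected = sign_via_compare(samples[k]);
        if (sign_via_mask(samples[k]) != expected
            || sign_via_wide_mask(samples[k]) != expected)
            return 0;
    }
    return 1;
}
```

The wide-mask form is not matched by the pattern as written, because
wi::only_sign_bit_p requires the constant to be exactly the sign bit at the
precision of @0.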
With this patch, and this pattern in particular, I've seen some code
quality regressions on aarch64.
I'm still trying to reduce a testcase to demonstrate the issue, but it seems
to involve introducing extra conversions from unsigned to signed values.
If I gate this pattern on !TYPE_UNSIGNED (TREE_TYPE (@0)), the codegen seems
to improve.
Any thoughts?
Hmm, my first thoughts would be:
* conversion from unsigned to signed is a NOP,
* if a & signbit != 0 is faster to compute than a < 0, that's how a < 0
should be expanded by the target,
* so the problem is probably something else, maybe the bit_and combined
better with another operation, or the cast obfuscates things enough to
confuse a later pass. Or maybe something related to @2.
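To make the first point concrete (this is not the aarch64 testcase, only a
sketch with hypothetical function names), here is what the rewrite amounts to
for an unsigned operand. It assumes GCC's documented behaviour that converting
an out-of-range unsigned value to a signed type of the same width reduces it
modulo 2^N, so the cast changes no bits.

```c
#include <assert.h>
#include <stdint.h>

/* Before the simplification: mask test on an unsigned value. */
int top_bit_set_mask(uint32_t u)
{
    return (u & UINT32_C(0x80000000)) != 0;
}

/* After the simplification: convert to the signed type, compare with 0.
   The conversion is implementation-defined in ISO C; GCC defines it as
   modular, so it only reinterprets the same 32 bits. */
int top_bit_set_signed(uint32_t u)
{
    return (int32_t)u < 0;
}

/* Returns 1 if both forms agree on a few boundary values. */
int conversions_agree(void)
{
    static const uint32_t samples[] = {
        0u, 1u, UINT32_C(0x7fffffff), UINT32_C(0x80000000),
        UINT32_C(0x80000001), UINT32_MAX
    };
    for (unsigned k = 0; k < sizeof samples / sizeof samples[0]; k++) {
        if (top_bit_set_mask(samples[k]) != top_bit_set_signed(samples[k]))
            return 0;
    }
    return 1;
}
```

Both functions compute the same predicate, so any difference in generated code
would have to come from how later passes or the target handle each form.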
An example would help understand what we are talking about...
--
Marc Glisse