On 27/10/15 15:57, Marc Glisse wrote:
On Tue, 27 Oct 2015, Kyrill Tkachov wrote:

Hi Marc,

On 30/08/15 08:57, Marc Glisse wrote:
Hello,

just trying to shrink fold-const.c a bit more.

initializer_zerop is close to what I was looking for with zerop, but I
wasn't sure if it would be safe (it accepts some CONSTRUCTOR and
STRING_CST). At some point I tried using sign_bit_p, but using the return
of that function in the simplification confused the machinery too much. I
added an "overload" of element_precision like the one in element_mode, for
convenience.

Bootstrap+testsuite on ppc64le-redhat-linux.


2015-08-31  Marc Glisse  <marc.gli...@inria.fr>

gcc/
      * tree.h (zerop): New function.
      * tree.c (zerop): Likewise.
      (element_precision): Handle expressions.
      * match.pd (define_predicates): Add zerop.
      (x <= +Inf): Fix comment.
      (abs (x) == 0, A & C == C, A & C != 0): Converted from ...
      * fold-const.c (fold_binary_loc): ... here. Remove.

gcc/testsuite/
      * gcc.dg/tree-ssa/cmp-1.c: New file.


+/* If we have (A & C) != 0 where C is the sign bit of A, convert
+   this into A < 0.  Similarly for (A & C) == 0 into A >= 0.  */
+(for cmp (eq ne)
+     ncmp (ge lt)
+ (simplify
+  (cmp (bit_and (convert?@2 @0) integer_pow2p@1) integer_zerop)
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+       && (TYPE_PRECISION (TREE_TYPE (@0))
+       == GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (@0))))
+       && element_precision (@2) >= element_precision (@0)
+       && wi::only_sign_bit_p (@1, element_precision (@0)))

This condition is a bit strict when @0 is signed: for an int32_t i, (i & (5LL
<< 42)) != 0 would work as well thanks to sign extension (see the sketch after
the pattern).

+   (with { tree stype = signed_type_for (TREE_TYPE (@0)); }
+    (ncmp (convert:stype @0) { build_zero_cst (stype); })))))
+
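For reference, a small C sketch of the shapes this pattern targets; this is a
hypothetical example of mine, not taken from the patch or its testsuite.

#include <stdint.h>

/* Shape the pattern matches: with the simplification, both functions
   should compile down to the same signed comparison against zero.  */
int test_mask (int32_t a)
{
  return (a & 0x80000000u) != 0;   /* mask is exactly the sign bit of a */
}

int test_cmp (int32_t a)
{
  return a < 0;                    /* what the pattern rewrites this to */
}

/* The sign-extension remark above: for signed i the widening conversion
   replicates the sign bit, so a mask lying entirely above bit 31 also
   tests the sign, even though wi::only_sign_bit_p rejects it.  */
int test_wide (int32_t i)
{
  return (i & (5LL << 42)) != 0;   /* still equivalent to i < 0 */
}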

With this patch, and this pattern in particular, I've seen some code quality
regressions on aarch64.
I'm still trying to reduce a testcase to demonstrate the issue, but it seems to
involve introducing extra conversions from unsigned to signed values. If I gate
this pattern on !TYPE_UNSIGNED (TREE_TYPE (@0)), the codegen seems to improve.

Any thoughts?

Hmm, my first thoughts would be:
* conversion from unsigned to signed is a NOP (see the sketch below),
* if a & signbit != 0 is faster to compute than a < 0, that's how a < 0 should
be expanded by the target,
* so the problem is probably something else: maybe the bit_and combined better
with another operation, or the cast obfuscates things enough to confuse a later
pass. Or maybe something related to @2.
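A quick C illustration of the first point, again a hypothetical example of
mine: the unsigned-to-signed cast changes no bits, so the two forms below test
exactly the same thing.

#include <stdint.h>

/* GCC defines out-of-range unsigned-to-signed conversion as modular,
   so (int32_t) u just reinterprets the same bits.  */
int bit_test (uint32_t u)
{
  return (u & 0x80000000u) != 0;
}

int sign_test (uint32_t u)
{
  return (int32_t) u < 0;   /* what the pattern produces for unsigned @0 */
}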


Thanks,
So here the types are shorts and unsigned shorts. On aarch64 these are HImode
values and there are no direct arithmetic
operations on them, so they have to be extended to SImode and truncated back.
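There is no reduced testcase yet (see below), but purely as a hypothetical
sketch of the kind of code being discussed:

/* Hypothetical shape of the code involved: a sign-bit test on an
   unsigned short.  The new pattern rewrites the masked test into
   (short) x < 0; on aarch64 the HImode value has to be extended to
   SImode for the comparison either way, and the suspicion is that the
   extra unsigned-to-signed conversion gets in the way of later passes.  */
int sign_bit_set (unsigned short x)
{
  return (x & 0x8000) != 0;
}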

An example would help understand what we are talking about...

Working on it. I'm trying to introduce gcc_unreachable calls into the compiler
that fire when the bad situation happens, and then reduce the original file
against that, but I think I'm not capturing the conditions that trigger this
behaviour exactly right :(

Kyrill

