On 6/12/23 01:57, Stefan Schulze Frielinghaus via Gcc-patches wrote:
Comparisons between memory and constants might be done in a smaller mode
resulting in smaller constants which might finally end up as immediates
instead of in the literal pool.

For example, on s390x a non-symmetric comparison like
   x <= 0x3fffffffffffffff
results in the constant being spilled to the literal pool and an 8 byte
memory comparison is emitted.  Ideally, an equivalent comparison
   x0 <= 0x3f
where x0 is the most significant byte of x, is emitted where the
constant is smaller and more likely to materialize as an immediate.

Similarly, comparisons of the form
   x >= 0x4000000000000000
can be shortened into x0 >= 0x40.

I'm not entirely sure whether combine is the right place to implement
something like this.  In my first try I implemented it in
TARGET_CANONICALIZE_COMPARISON but then thought other targets might
profit from it, too.  simplify_context::simplify_relational_operation_1
seems to be the wrong place since code/mode may change.  Any opinions?

gcc/ChangeLog:

        * combine.cc (simplify_compare_const): Narrow comparison of
        memory and constant.
        (try_combine): Adapt new function signature.
        (simplify_comparison): Adapt new function signature.

gcc/testsuite/ChangeLog:

        * gcc.target/s390/cmp-mem-const-1.c: New test.
        * gcc.target/s390/cmp-mem-const-2.c: New test.
This does seem more general than we'd want to do in the canonicalization hook. So thanks for going the extra mile and doing a generic implementation.




@@ -11987,6 +11988,79 @@ simplify_compare_const (enum rtx_code code, 
machine_mode mode,
        break;
      }
+ /* Narrow non-symmetric comparison of memory and constant as e.g.
+     x0...x7 <= 0x3fffffffffffffff into x0 <= 0x3f where x0 is the most
+     significant byte.  Likewise, transform x0...x7 >= 0x4000000000000000 into
+     x0 >= 0x40.  */
+  if ((code == LEU || code == LTU || code == GEU || code == GTU)
+      && is_a <scalar_int_mode> (GET_MODE (op0), &int_mode)
+      && MEM_P (op0)
+      && !MEM_VOLATILE_P (op0)
+      && (unsigned HOST_WIDE_INT)const_op > 0xff)
+    {
+      unsigned HOST_WIDE_INT n = (unsigned HOST_WIDE_INT)const_op;
+      enum rtx_code adjusted_code = code;
+
+      /* If the least significant bit is already zero, then adjust the
+        comparison in the hope that we hit cases like
+          op0  <= 0x3ffffdfffffffffe
+        where the adjusted comparison
+          op0  <  0x3ffffdffffffffff
+        can be shortened into
+          op0' <  0x3ffffd.  */
+      if (code == LEU && (n & 1) == 0)
+       {
+         ++n;
+         adjusted_code = LTU;
+       }
+      /* or e.g. op0 < 0x4020000000000000  */
+      else if (code == LTU && (n & 1) == 0)
+       {
+         --n;
+         adjusted_code = LEU;
+       }
+      /* or op0 >= 0x4000000000000001  */
+      else if (code == GEU && (n & 1) == 1)
+       {
+         --n;
+         adjusted_code = GTU;
+       }
+      /* or op0 > 0x3fffffffffffffff.  */
+      else if (code == GTU && (n & 1) == 1)
+       {
+         ++n;
+         adjusted_code = GEU;
+       }
+
+      scalar_int_mode narrow_mode_iter;
+      bool lower_p = code == LEU || code == LTU;
+      bool greater_p = !lower_p;
+      FOR_EACH_MODE_UNTIL (narrow_mode_iter, int_mode)
+       {
+         unsigned nbits = GET_MODE_PRECISION (int_mode)
+                         - GET_MODE_PRECISION (narrow_mode_iter);
+         unsigned HOST_WIDE_INT mask = (HOST_WIDE_INT_1U << nbits) - 1;
+         unsigned HOST_WIDE_INT lower_bits = n & mask;
+         if ((lower_p && lower_bits == mask)
+             || (greater_p && lower_bits == 0))
+           {
+             n >>= nbits;
+             break;
+           }
+       }
+
+      if (narrow_mode_iter < int_mode)
+       {
+         poly_int64 offset = BYTES_BIG_ENDIAN
+                               ? 0
+                               : GET_MODE_SIZE (int_mode)
+                                 - GET_MODE_SIZE (narrow_mode_iter);
Go ahead and add some parenthesis here. I'd add one pair around the whole RHS of that assignment. The '?' and ':' would line up under the 'B' in that case. Similarly add them around the false arm of the ternary. The '-' will line up under the 'G'.

Going to trust you got the little endian adjustment correct here ;-)


/* Compute some predicates to simplify code below. */
diff --git a/gcc/testsuite/gcc.target/s390/cmp-mem-const-1.c 
b/gcc/testsuite/gcc.target/s390/cmp-mem-const-1.c
new file mode 100644
index 00000000000..b90c2a8c224
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/cmp-mem-const-1.c
@@ -0,0 +1,99 @@
+/* { dg-do compile { target { lp64 } } } */
+/* { dg-options "-O1 -march=z13 -mzarch" } */
+/* { dg-final { scan-assembler-not {\tclc\t} } } */
+
+int
+ge_8b (unsigned long *x)
+{
+  return *x >= 0x4000000000000000;
+}
Would it be possible to add some debugging output in simplify_compare_const so that you could search for that debugging output and make these tests generic?

Jeff

Reply via email to