On 04/27/2016 02:20 AM, Dominik Vogt wrote:
The attached patch is the result of a discussion about an S/390 issue
involving "and with complement" in some cases:

https://gcc.gnu.org/ml/gcc/2016-03/msg00163.html
https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01586.html
Combine would merge a ZERO_EXTEND and a SET, taking the known zero
bits into account, resulting in an AND.  Later on,
make_compound_operation() fails to replace that AND with a
ZERO_EXTEND (which we get for free on S/390) and leaves the AND,
eventually resulting in two consecutive AND instructions.
The current code in make_compound_operation() that detects
opportunities for ZERO_EXTEND does not work here because it does
not take the known zero bits into account:
  /* If the constant is one less than a power of two, this might be
     representable by an extraction even if no shift is present.
     If it doesn't end up being a ZERO_EXTEND, we will ignore it unless
     we are in a COMPARE.  */
  else if ((i = exact_log2 (UINTVAL (XEXP (x, 1)) + 1)) >= 0)
    new_rtx = make_extraction (mode,
                               make_compound_operation (XEXP (x, 0),
                                                        next_code),
                               0, NULL_RTX, i, 1, 0, in_code == COMPARE);
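
To make that concrete with the constants from the example below (a
back-of-the-envelope sketch, not part of the patch):

  /* The AND mask 0xffff000f is not one less than a power of two, so
     exact_log2 (0xffff000f + 1) is negative and the clause above never
     fires.  But the operand's known zero bits are at least 0x0000fff0
     there, and 0xffff000f | 0x0000fff0 == 0xffffffff == 2^32 - 1,
     which is exactly a 32-bit zero_extend.  */
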
An attempt to use the known zero bits directly in the above condition
resulted in many situations that generated worse code, so the patch
tries to fix this in a more conservative way.  While the effect is
completely positive on S/390, this will very likely have unforeseen
consequences on other targets.
Bootstrapped and regression tested on s390 and s390x only at the
moment.
Ciao
Dominik ^_^ ^_^
--
Dominik Vogt
IBM Germany
0001-ChangeLog
gcc/ChangeLog

        * combine.c (make_compound_operation): Take known zero bits into
        account when checking for possible zero_extend.
I'd strongly recommend writing some tests for this.  Extra credit if
they can be run on an x86 target, which gets more testing than s390.
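
For instance, a reduced version of the source behind the RTL below
(reconstructed, so the exact types and the constant are guesses) might
do as a starting point:

/* Hypothetical reduced test, reconstructed from the andc-immediate.c
   RTL quoted below: mask a 32-bit load and return it widened to 64
   bits.  The cleared bits are already known to be zero, so the
   widening should end up as a single zero_extend rather than a second
   AND.  */
unsigned long
f (unsigned int *a)
{
  return *a & 0xffff000f;
}
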
If I go back to our original discussion, we have this going into combine:
(insn 6 3 7 2 (parallel [
            (set (reg:SI 64)
                (and:SI (mem:SI (reg/v/f:DI 63 [ a ]) [1 *a_2(D)+0 S4 A32])
                    (const_int -65521 [0xffffffffffff000f])))
            (clobber (reg:CC 33 %cc))
        ]) andc-immediate.c:21 1481 {*andsi3_zarch}
     (expr_list:REG_DEAD (reg/v/f:DI 63 [ a ])
        (expr_list:REG_UNUSED (reg:CC 33 %cc)
            (nil))))

(insn 7 6 12 2 (set (reg:DI 65)
        (zero_extend:DI (reg:SI 64))) andc-immediate.c:21 1207 {*zero_extendsidi2}
     (expr_list:REG_DEAD (reg:SI 64)
        (nil)))

(insn 12 7 13 2 (set (reg/i:DI 2 %r2)
        (reg:DI 65)) andc-immediate.c:22 1073 {*movdi_64}
     (expr_list:REG_DEAD (reg:DI 65)
        (nil)))
Which combine turns into:
(insn 6 3 7 2 (parallel [
            (set (reg:SI 64)
                (and:SI (mem:SI (reg:DI 2 %r2 [ a ]) [1 *a_2(D)+0 S4 A32])
                    (const_int -65521 [0xffffffffffff000f])))
            (clobber (reg:CC 33 %cc))
        ]) andc-immediate.c:21 1481 {*andsi3_zarch}
     (expr_list:REG_DEAD (reg:DI 2 %r2 [ a ])
        (expr_list:REG_UNUSED (reg:CC 33 %cc)
            (nil))))

(insn 12 7 13 2 (parallel [
            (set (reg/i:DI 2 %r2)
                (and:DI (subreg:DI (reg:SI 64) 0)
                         ^^^
                    (const_int 4294901775 [0xffff000f])))
                               ^^^^^^^^^^
            (clobber (reg:CC 33 %cc))
        ]) andc-immediate.c:22 1474 {*anddi3}
     (expr_list:REG_UNUSED (reg:CC 33 %cc)
        (expr_list:REG_DEAD (reg:SI 64)
            (nil))))
Instead you want insn 12 to use a zero-extend to extend (reg:SI 64) into
(reg:DI 2)?
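I.e. something like this (just insn 7's zero_extend applied directly
to the hard register; a sketch, not actual combine output):

(insn 12 7 13 2 (set (reg/i:DI 2 %r2)
        (zero_extend:DI (reg:SI 64))) andc-immediate.c:22 1207 {*zero_extendsidi2}
     (expr_list:REG_DEAD (reg:SI 64)
        (nil)))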
Can't you achieve this in this clause:
  /* If the constant is one less than a power of two, this might be
     representable by an extraction even if no shift is present.
     If it doesn't end up being a ZERO_EXTEND, we will ignore it unless
     we are in a COMPARE.  */
You extract the constant via UINTVAL (XEXP (x, 1)), then munge it based
on nonzero_bits and pass the result to exact_log2?
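Something along these lines, perhaps (an untested sketch that just
reuses the names from the surrounding clause):

  /* Sketch of the above: fold the operand's known zero bits into the
     AND mask before the existing power-of-two test.  */
  unsigned HOST_WIDE_INT mask
    = (UINTVAL (XEXP (x, 1))
       | (~nonzero_bits (XEXP (x, 0), mode) & GET_MODE_MASK (mode)));
  if ((i = exact_log2 (mask + 1)) >= 0)
    new_rtx = make_extraction (mode,
                               make_compound_operation (XEXP (x, 0),
                                                        next_code),
                               0, NULL_RTX, i, 1, 0, in_code == COMPARE);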
Though I do like how you've conditionalized on the cost of the AND vs
the cost of the zero-extend.  So maybe your approach is ultimately
better.  Still curious about your thoughts on doing it by just munging
the constant you pass off to exact_log2 in that earlier clause.
+  /* If the one operand is a paradoxical subreg of a register or memory and
+     the constant (limited to the smaller mode) has only zero bits where
+     the sub expression has known zero bits, this can be expressed as
+     a zero_extend.  */
+  else if (GET_CODE (XEXP (x, 0)) == SUBREG)
+    {
+      rtx sub;
+
+      sub = XEXP (XEXP (x, 0), 0);
+      machine_mode sub_mode = GET_MODE (sub);
+      if ((REG_P (sub) || MEM_P (sub))
+          && GET_MODE_PRECISION (sub_mode) < mode_width
+          && (UINTVAL (XEXP (x, 1))
+              | (~nonzero_bits (sub, sub_mode) & GET_MODE_MASK (sub_mode))
+              ) == GET_MODE_MASK (sub_mode))
I'd either factor something out or write a nested conditional to avoid
the horrid line wrapping here and hopefully make the conditions easier
to read.
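E.g. something like this (rough sketch only; the temporaries are
made-up names and the zero_extend handling itself is elided):

  else if (GET_CODE (XEXP (x, 0)) == SUBREG)
    {
      rtx sub = SUBREG_REG (XEXP (x, 0));
      machine_mode sub_mode = GET_MODE (sub);

      if ((REG_P (sub) || MEM_P (sub))
          && GET_MODE_PRECISION (sub_mode) < mode_width)
        {
          unsigned HOST_WIDE_INT sub_mask = GET_MODE_MASK (sub_mode);
          unsigned HOST_WIDE_INT known_zero
            = ~nonzero_bits (sub, sub_mode) & sub_mask;

          /* Keeping every bit that is not already known to be zero is
             just a zero_extend of SUB.  */
          if ((UINTVAL (XEXP (x, 1)) | known_zero) == sub_mask)
            {
              /* ... zero_extend handling as in the patch ...  */
            }
        }
    }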
Jeff