https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71762

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org|rguenth at gcc dot gnu.org

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
For a smaller C testcase the init-regs pass "saves" us here:

static _Bool
foo (_Bool a, _Bool b)
{
  int x = a && ! b;
  return x != 0;
}

int y = 1;
int main()
{
  _Bool x;
  if (foo (x, y))
    __builtin_abort ();
  return 0;
}

but we abort () with -fdisable-rtl-init-regs (and we know that pass is
not conservative, as the original testcase shows).  We can fix that case up
at RTL expansion time, after which we generate

main:
.LFB1:
        .cfi_startproc
        movl    y(%rip), %edx
        testl   %edx, %edx
        setne   %dl
        andl    $1, %eax
        cmpb    %al, %dl
        jb      .L7

compared to

main:
.LFB1:
        .cfi_startproc
        movl    y(%rip), %eax
        testl   %eax, %eax
        setne   %al
        cmpb    %dl, %al
        jb      .L7

and compared to

main:
.LFB1:
        .cfi_startproc
        movl    y(%rip), %edx
        testl   %edx, %edx
        jne     .L2
        testb   $1, %al
        jne     .L13

for the variant without the pattern (but with the bitfield reduction fix
for default-defs still applied).
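
In C terms the expansion-time fixup masks the undefined _Bool down to its
single valid bit before it enters the comparison -- that is the
andl $1, %eax in the first listing.  A minimal hand-written sketch of the
idea (foo_fixed is a made-up name; at the level of the C abstract machine
the masks are no-ops, they only stand in for the RTL-level ANDs):

static _Bool
foo_fixed (_Bool a, _Bool b)
{
  /* Reduce both operands to bitfield precision: only bit 0 of a
     _Bool is meaningful, so strip whatever garbage an uninitialized
     value may carry in the upper bits.  */
  unsigned char a1 = (unsigned char) a & 1;
  unsigned char b1 = (unsigned char) b & 1;
  /* With both operands known to be 0 or 1 the pattern
     (a && !b) -> (a > b) is safe.  */
  return a1 > b1;
}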

The following case (like loads from _Bool memory in general) shows we'd
have to apply bitfield reduction to all loads as well :/

static _Bool
foo (_Bool a, _Bool b)
{
  int x = a && ! b;
  return x != 0;
}

int y = 1;
int main()
{
  register _Bool x __asm__("%rsi");
  if (foo (x, y))
    __builtin_abort ();
  return 0;
}

we can't initialize x here -- but we could in theory reduce to bitfield
precision at the point of the "load" (asm regs are treated as memory on
GIMPLE):

;; x.1_3 = x;

(insn 5 4 0 (set (reg:QI 89 [ x.1_3 ])
        (reg/v:QI 4 si [ x ])) "t.c":12 -1
     (nil))
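
Reducing at the point of the load would mean masking the QImode copy out
of %rsi shown in insn 5 before any use.  A source-level analogy (the name
x_red is made up; reading x is of course still indeterminate at the C
level, the mask only stands in for the AND the expander would emit):

static _Bool
foo (_Bool a, _Bool b)
{
  int x = a && ! b;
  return x != 0;
}

int y = 1;
int main()
{
  register _Bool x __asm__("%rsi");
  /* Hypothetical bitfield reduction at the "load" of the asm reg:
     keep only the _Bool's single valid bit.  */
  unsigned char x_red = (unsigned char) x & 1;
  if (foo (x_red, y))
    __builtin_abort ();
  return 0;
}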

Thus likewise for

static _Bool
foo (_Bool a, _Bool b)
{
  int x = a && ! b;
  return x != 0;
}

int y = 1;
int main()
{
  _Bool x[128];
  if (foo (x[1], y))
    __builtin_abort ();
  return 0;
}

which we expand to

;; _3 = x[1];
;; if (_2 < _3)

...

(insn 7 6 8 (set (reg:CC 17 flags)
        (compare:CC (reg:QI 92)
            (mem/c:QI (plus:DI (reg/f:DI 82 virtual-stack-vars)
                    (const_int -127 [0xffffffffffffff81])) [2 x+1 S1 A8]))) "t.c":12 -1
     (nil))


        subq    $136, %rsp
        .cfi_def_cfa_offset 144
        movl    y(%rip), %eax
        testl   %eax, %eax
        setne   %al
        cmpb    1(%rsp), %al
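
The same treatment would apply here: wrap the QImode memory operand in a
single-bit AND before the cmpb.  In source terms, main above would become
(only an analogy again; b_red is a made-up name, foo and y are as in the
testcase):

int main()
{
  _Bool x[128];
  unsigned char b_red = ((unsigned char *) x)[1] & 1;  /* reduced load */
  if (foo ((_Bool) b_red, y))
    __builtin_abort ();
  return 0;
}

with the extra AND being exactly the per-load cost mentioned below.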


For reference the patch for the undefined SSA case:

Index: gcc/expr.c
===================================================================
--- gcc/expr.c  (revision 242004)
+++ gcc/expr.c  (working copy)
@@ -61,6 +61,8 @@ along with GCC; see the file COPYING3.
 #include "tree-chkp.h"
 #include "rtl-chkp.h"
 #include "ccmp.h"
+#include "tree-dfa.h"
+#include "tree-ssa.h"


 /* If this is nonzero, we do not bother generating VOLATILE
@@ -9734,6 +9736,17 @@ expand_expr_real_1 (tree exp, rtx target
       ssa_name = exp;
       decl_rtl = get_rtx_for_ssa_name (ssa_name);
       exp = SSA_NAME_VAR (ssa_name);
+      /* If we didn't have the chance to reduce to bitfield precision
+         at the definition site do so here.  For signed undefined
+         values nothing can be assumed for the upper bits so we do not
+         need to do anything here.  Note that we do this at all
+         use sites because that allows easier combining.  Note we
+         can't easily do it once at function entry w/o changing
+         coalescing.  */
+      if (reduce_bit_field
+          && TYPE_UNSIGNED (type)
+          && ssa_undefined_value_p (ssa_name, false))
+        decl_rtl = reduce_to_bit_field_precision (decl_rtl, NULL_RTX, type);
       goto expand_decl_rtl;

     case PARM_DECL:


While the undef SSA case is reasonable to handle (and has a low chance of
pessimizing code), applying bitfield reduction at all loads of bit-precision
values is going to hurt (though in practice that's going to be loads from
_Bool arrays and _Bool register asm vars only).

Mid-term it might be good to lower bitfield-precision stuff on GIMPLE
(similar to how we want to lower PROMOTE_REGS).

Thus to conclude -- I'm going to test removal of the patterns.  A similar
transform on the RTL side might still be profitable in case known-zero-bits
indicates both participating regs have only one (non-sign) bit possibly set
(and we can do that on GIMPLE for TYPE_PRECISION == TYPE_MODE as well).
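
For reference, the identity behind the pattern and why the single-bit
precondition matters: for operands with at most bit 0 possibly set,
a && !b and the unsigned a > b coincide; as soon as any higher bit can be
set they diverge, which is exactly what an uninitialized _Bool carrying
e.g. the value 2 exposes.  A quick self-check:

#include <assert.h>

int main (void)
{
  /* For proper 1-bit values the transform is an identity ...  */
  for (unsigned a = 0; a < 2; a++)
    for (unsigned b = 0; b < 2; b++)
      assert ((a && !b) == (a > b));

  /* ... but a stray high bit breaks it: a && !b is 0, a > b is 1.
     This is the situation the uninitialized _Bool gets us into.  */
  unsigned a = 2, b = 1;
  assert ((a && !b) != (a > b));
  return 0;
}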
