https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71762
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
For a smaller C testcase the init-regs pass "saves" us here:

static _Bool foo (_Bool a, _Bool b)
{
  int x = a && ! b;
  return x != 0;
}

int y = 1;

int main()
{
  _Bool x;
  if (foo (x, y))
    __builtin_abort ();
  return 0;
}

but we abort () with -fdisable-rtl-init-regs (and we know that pass is
not conservative, given the original testcase).  We can fix that case up
at RTL expansion time.  We then generate

main:
.LFB1:
        .cfi_startproc
        movl    y(%rip), %edx
        testl   %edx, %edx
        setne   %dl
        andl    $1, %eax
        cmpb    %al, %dl
        jb      .L7

compared to

main:
.LFB1:
        .cfi_startproc
        movl    y(%rip), %eax
        testl   %eax, %eax
        setne   %al
        cmpb    %dl, %al
        jb      .L7

and compared to

main:
.LFB1:
        .cfi_startproc
        movl    y(%rip), %edx
        testl   %edx, %edx
        jne     .L2
        testb   $1, %al
        jne     .L13

for the variant without the pattern (but with the bitfield-reduction fix
for default defs still in).

The following case (like any load from _Bool memory) shows we'd have to
apply bitfield reduction to all loads as well :/

static _Bool foo (_Bool a, _Bool b)
{
  int x = a && ! b;
  return x != 0;
}

int y = 1;

int main()
{
  register _Bool x __asm__("%rsi");
  if (foo (x, y))
    __builtin_abort ();
  return 0;
}

We can't initialize x -- but we could in theory reduce to bitfield
precision at the point of the "load" (asm regs are memory on GIMPLE):

;; x.1_3 = x;

(insn 5 4 0 (set (reg:QI 89 [ x.1_3 ])
        (reg/v:QI 4 si [ x ])) "t.c":12 -1
     (nil))

Thus likewise for

static _Bool foo (_Bool a, _Bool b)
{
  int x = a && ! b;
  return x != 0;
}

int y = 1;

int main()
{
  _Bool x[128];
  if (foo (x[1], y))
    __builtin_abort ();
  return 0;
}

which we expand to

;; _3 = x[1];
;; if (_2 < _3)
...
(insn 7 6 8 (set (reg:CC 17 flags)
        (compare:CC (reg:QI 92)
            (mem/c:QI (plus:DI (reg/f:DI 82 virtual-stack-vars)
                    (const_int -127 [0xffffffffffffff81])) [2 x+1 S1 A8]))) "t.c":12 -1
     (nil))

        subq    $136, %rsp
        .cfi_def_cfa_offset 144
        movl    y(%rip), %eax
        testl   %eax, %eax
        setne   %al
        cmpb    1(%rsp), %al
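To spell out what the missing reduction would do at such a load: for an
unsigned bit-precision type the operation is just masking the value down
to TYPE_PRECISION bits, which for _Bool is exactly the andl $1, %eax
visible in the first dump above.  A minimal standalone sketch (the
function name is illustrative, not the GCC-internal
reduce_to_bit_field_precision):

/* Sketch only: mask away the bits above the type's precision so an
   out-of-range _Bool pattern such as 2 becomes a valid 0 or 1 before
   it feeds an unsigned compare.  */
static unsigned long long
sketch_reduce_to_bit_field_precision (unsigned long long val,
                                      unsigned int precision)
{
  unsigned long long mask
    = precision >= 64 ? ~0ULL : (1ULL << precision) - 1;
  return val & mask;
}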
For reference, the patch for the undefined SSA case:

Index: gcc/expr.c
===================================================================
--- gcc/expr.c  (revision 242004)
+++ gcc/expr.c  (working copy)
@@ -61,6 +61,8 @@ along with GCC; see the file COPYING3.
 #include "tree-chkp.h"
 #include "rtl-chkp.h"
 #include "ccmp.h"
+#include "tree-dfa.h"
+#include "tree-ssa.h"

 /* If this is nonzero, we do not bother generating VOLATILE
@@ -9734,6 +9736,17 @@ expand_expr_real_1 (tree exp, rtx target
       ssa_name = exp;
       decl_rtl = get_rtx_for_ssa_name (ssa_name);
       exp = SSA_NAME_VAR (ssa_name);
+      /* If we didn't have the chance to reduce to bitfield precision
+         at the definition site do so here.  For signed undefined
+         values nothing can be assumed for the upper bits so we do not
+         need to do anything here.  Note that we do this at all
+         use sites because that allows easier combining.  Note we
+         can't easily do it once at function entry w/o changing
+         coalescing.  */
+      if (reduce_bit_field
+          && TYPE_UNSIGNED (type)
+          && ssa_undefined_value_p (ssa_name, false))
+        decl_rtl = reduce_to_bit_field_precision (decl_rtl, NULL_RTX, type);
       goto expand_decl_rtl;

     case PARM_DECL:

While the undef SSA case is reasonable to handle (and has a low chance
of pessimizing code), applying bitfield reduction at all loads of
bit-precision values is going to hurt (ok, in practice that's going to
be loads from _Bool arrays and _Bool register asm vars only).  Mid-term
it might be good to lower bitfield-precision stuff on GIMPLE (similar
to how we want to lower PROMOTE_REGS).

Thus to conclude -- I'm going to test removal of the patterns.  A
similar transform on the RTL side might still be profitable in case
known-zero-bits indicates both participating regs have only one
(non-sign) bit possibly set (and we can do that on GIMPLE for
TYPE_PRECISION equal to the TYPE_MODE precision as well).
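For the known-zero-bits variant, one reading of that condition is: the
combined nonzero-bits masks of the two regs may contain at most one
(shared) bit, and that bit must not be the sign bit; then a && !b really
is the unsigned a > b.  A rough standalone sketch with illustrative
names, not the rtlanal.c nonzero_bits interface:

/* Sketch only: nonzero_a/nonzero_b are the "possibly set" bit masks
   known for the two regs.  If both regs can only have the same single
   non-sign bit set, a && !b is equivalent to the unsigned a > b.  */
static int
sketch_single_bit_operands_p (unsigned long long nonzero_a,
                              unsigned long long nonzero_b,
                              unsigned int precision)
{
  unsigned long long m = nonzero_a | nonzero_b;
  unsigned long long sign_bit = 1ULL << (precision - 1);
  return (m & (m - 1)) == 0      /* at most one possibly-set bit */
         && (m & sign_bit) == 0; /* and it is not the sign bit */
}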