Something broke in the compiler to cause combine to incorrectly optimize:

(insn 12 11 13 3 (set (reg:SI 604 [ D.6102 ])
        (lshiftrt:SI (subreg/s/u:SI (reg/v:DI 601 [ x ]) 0)
            (reg:SI 602 [ D.6103 ]))) t.c:47 4436 {lshrsi3}
     (expr_list:REG_DEAD (reg:SI 602 [ D.6103 ])
        (nil)))
(insn 13 12 14 3 (set (reg:SI 605)
        (and:SI (reg:SI 604 [ D.6102 ])
            (const_int 1 [0x1]))) t.c:47 3658 {andsi3}
     (expr_list:REG_DEAD (reg:SI 604 [ D.6102 ])
        (nil)))
(insn 14 13 15 3 (set (reg:DI 599 [ D.6102 ])
        (zero_extend:DI (reg:SI 605))) t.c:47 4616 {zero_extendsidi2}
     (expr_list:REG_DEAD (reg:SI 605)
        (nil)))
into:

(insn 11 10 12 3 (set (reg:SI 602 [ D.6103 ])
        (not:SI (subreg:SI (reg:DI 595 [ D.6102 ]) 0))) t.c:47 3732 {one_cmplsi2}
     (expr_list:REG_DEAD (reg:DI 595 [ D.6102 ])
        (nil)))
(note 12 11 13 3 NOTE_INSN_DELETED)
(note 13 12 14 3 NOTE_INSN_DELETED)
(insn 14 13 15 3 (set (reg:DI 599 [ D.6102 ])
        (zero_extract:DI (reg/v:DI 601 [ x ])
            (const_int 1 [0x1])
            (reg:SI 602 [ D.6103 ]))) t.c:47 4668 {c2_extzvdi}
     (expr_list:REG_DEAD (reg:SI 602 [ D.6103 ])
        (nil)))

This shows up in:

FAIL: gcc.c-torture/execute/builtin-bitops-1.c execution,  -Og -g

for me.

diff --git a/gcc/combine.c b/gcc/combine.c
index 708691f..c1f50ff 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -7245,6 +7245,18 @@ make_extraction (enum machine_mode mode, rtx inner, HOST_WIDE_INT pos,
       extraction_mode = insn.field_mode;
     }
 
+  /* On a SHIFT_COUNT_TRUNCATED machine, we can't promote the mode of
+     the extract to a larger size on a variable extract, as previously
+     the position might have been optimized to change a bit of the
+     index of the starting bit that would have been ignored before,
+     but, with a larger mode, will then not be.  If we wanted to do
+     this, we'd have to mask out those bits or prove that those bits
+     are 0.  */
+  if (SHIFT_COUNT_TRUNCATED
+      && pos_rtx
+      && GET_MODE_BITSIZE (extraction_mode) > GET_MODE_BITSIZE (mode))
+    extraction_mode = mode;
+
   /* Never narrow an object, since that might not be safe.  */
 
   if (mode != VOIDmode

is sufficient to never widen variable extracts on SHIFT_COUNT_TRUNCATED
machines.  So, the question is, how did people expect this to work?  I
didn’t spot what changed recently to cause the bad code-gen.

The optimization of sub into not is ok, despite how funny it looks,
because it feeds into an extract, which we know by SHIFT_COUNT_TRUNCATED
is safe: in SImode only the low 5 bits of the position are used, and in
those bits the not agrees with the sub.  Once the extract is widened to
DImode, a sixth bit of the position is used as well, and there the two
differ.

Is the patch a reasonable way to fix this?
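
To make the failure mode concrete, here is a minimal sketch in plain C that models the two sequences.  It is not taken from the port or from combine; the helper names are hypothetical, and it rests on two assumptions that the dumps do not show directly: that the original shift count was of the form 31 - i before it became a not, and that the DImode extract uses the position modulo 64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit logical right shift on a
   SHIFT_COUNT_TRUNCATED target: only the low 5 bits of the count
   are used.  */
static uint32_t
lshr32_truncated (uint32_t value, uint32_t count)
{
  return value >> (count & 31);
}

/* Hypothetical model of a 1-bit variable zero_extract in a 64-bit
   mode.  Assumption: the low 6 bits of the position are used.  */
static uint64_t
extzv64_one_bit (uint64_t value, uint32_t pos)
{
  return (value >> (pos & 63)) & 1;
}

/* Insns 12-14 before combine: SImode shift by ~i, and with 1,
   zero extend to 64 bits.  */
static uint64_t
before_combine (uint64_t x, uint32_t i)
{
  return (uint64_t) (lshr32_truncated ((uint32_t) x, ~i) & 1);
}

/* Insn 14 after combine: extract widened to 64 bits, position ~i.  */
static uint64_t
after_combine (uint64_t x, uint32_t i)
{
  return extzv64_one_bit (x, ~i);
}

/* The same widened extract, with the position masked to 5 bits
   first, as the comment in the patch suggests would be needed.  */
static uint64_t
after_combine_masked (uint64_t x, uint32_t i)
{
  return extzv64_one_bit (x, ~i & 31);
}

/* Count the i in [0, 31] for which the widened extract disagrees
   with the original sequence.  */
static int
count_mismatches (uint64_t x)
{
  int n = 0;
  for (uint32_t i = 0; i < 32; i++)
    {
      /* Assumed original count 31 - i: it matches ~i in the low
         5 bits, so sub -> not is fine for the 32-bit shift.  */
      assert (((31 - i) & 31) == (~i & 31));
      if (before_combine (x, i) != after_combine (x, i))
        n++;
    }
  return n;
}
```

With x = 0x80000000 and i = 0, before_combine reads bit 31 and returns 1, while after_combine reads bit 63 and returns 0; after_combine_masked returns 1 again.  In this model, keeping the extract in the narrower mode (what the patch does) and masking the position are two ways to get back the original result.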