https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830
--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
So more details. The i2 insn is:

(insn 16 15 17 2 (set (zero_extract:DI (subreg:DI (reg/v:TI 103 [ f ]) 0)
            (const_int 8 [0x8])
            (const_int 16 [0x10]))
        (subreg:DI (reg:SI 96 [ _7 ]) 0)) "pr99830.c":7:3 744 {*insv_regdi}
     (expr_list:REG_DEAD (reg:SI 96 [ _7 ])
        (nil)))

and can_combine_p, through its expand_field_assignment call, makes the i2src

(ior:TI (and:TI (reg/v:TI 103 [ f ])
        (const_int -16711681 [0xffffffffff00ffff]))
    (ashift:TI (and:TI (clobber:TI (const_int 0 [0]))
            (const_int 255 [0xff]))
        (const_int 16 [0x10])))

out of this. i3 is

(insn 20 19 21 2 (set (reg:SI 108 [ f ])
        (zero_extend:SI (subreg:QI (reg/v:TI 103 [ f ]) 0))) "pr99830.c":8:9 114 {*zero_extendqisi2_aarch64}
     (expr_list:REG_DEAD (reg/v:TI 103 [ f ])
        (nil)))

so I think it is perfectly fine that, when i3 only cares about the low 8 bits of pseudo 103, it figures out that these are just the low 8 bits of the original pseudo 103, not ored with anything else, because (unsigned char) ((whatever & 255) << 16) is 0.
So, I don't see anything wrong with the i2 -> i3 combination turning it into

(insn 20 19 21 2 (set (reg:SI 108 [ f ])
        (zero_extend:SI (subreg:QI (reg/v:TI 103 [ f ]) 0))) "pr99830.c":8:9 114 {*zero_extendqisi2_aarch64}
     (nil))

In particular, it is combine_simplify_rtx that is called on:

(zero_extend:SI (subreg:QI (ior:TI (and:TI (reg/v:TI 103 [ f ])
                (const_int -16711681 [0xffffffffff00ffff]))
            (ashift:TI (and:TI (clobber:TI (const_int 0 [0]))
                    (const_int 255 [0xff]))
                (const_int 16 [0x10]))) 0))

which simplifies it into

(and:SI (subreg:SI (reg/v:TI 103 [ f ]) 0)
    (const_int 255 [0xff]))

But there is also

(debug_insn 18 17 19 2 (var_location:HI c (subreg:HI (ashiftrt:SI (sign_extend:SI (subreg:HI (reg/v:SI 100 [ c ]) 0))
                (zero_extend:SI (subreg:QI (reg/v:TI 103 [ f ]) 0))) 0)) "pr99830.c":8:5 -1
     (nil))

into which try_combine, via propagate_for_debug, propagates the (reg/v:TI 103 [ f ]) i2dest, replacing it with the i2src mentioned above.
In this case it is likewise used inside a (subreg:QI ...), so in theory it could also be optimized into just the low bits of the older r103. Except that propagate_for_debug uses only the simplify-rtx.c APIs and does not have combine_simplify_rtx available. And in theory the register could also be used in other contexts in the debug insn.
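
The claim that the low 8 bits of r103 are unaffected by the field insertion can be checked in plain C. The following is only a sketch: it uses uint64_t in place of TImode, substitutes an arbitrary value v for the (clobber:TI (const_int 0)) operand, and the names insert_field, low_byte and check_low_byte_unchanged are made up for illustration (they are not GCC functions). The mask and shift count are taken from the i2src dump.

```c
#include <assert.h>
#include <stdint.h>

/* Mask from the i2src dump: clears bits 16..23 of the destination. */
#define FIELD_CLEAR_MASK UINT64_C(0xffffffffff00ffff)

/* Hypothetical model of the expanded field assignment:
   (ior (and f MASK) (ashift (and v 255) 16)).
   uint64_t stands in for TImode here, which is an assumption. */
uint64_t insert_field(uint64_t f, uint64_t v)
{
    return (f & FIELD_CLEAR_MASK) | ((v & 255) << 16);
}

/* Models (subreg:QI x 0): only the low 8 bits are observed. */
uint8_t low_byte(uint64_t x)
{
    return (uint8_t) x;
}

/* Whatever v is, the inserted field contributes nothing to the low
   byte, so the low byte after the insertion equals the low byte of
   the original f. */
void check_low_byte_unchanged(uint64_t f, uint64_t v)
{
    assert((uint8_t) ((v & 255) << 16) == 0);
    assert(low_byte(insert_field(f, v)) == low_byte(f));
}
```

For example, check_low_byte_unchanged(0x1122334455667788, 0xAB) passes: the insertion yields 0x1122334455AB7788, whose low byte is still 0x88. This models only the QImode lowpart use; it says nothing about other contexts in which the register might appear in a debug insn.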