https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
So more details.  The i2 insn is:
(insn 16 15 17 2 (set (zero_extract:DI (subreg:DI (reg/v:TI 103 [ f ]) 0)
            (const_int 8 [0x8])
            (const_int 16 [0x10]))
        (subreg:DI (reg:SI 96 [ _7 ]) 0)) "pr99830.c":7:3 744 {*insv_regdi}
     (expr_list:REG_DEAD (reg:SI 96 [ _7 ])
        (nil)))
and can_combine_p makes through the expand_field_assignment call i2src
(ior:TI (and:TI (reg/v:TI 103 [ f ])
        (const_int -16711681 [0xffffffffff00ffff]))
    (ashift:TI (and:TI (clobber:TI (const_int 0 [0]))
            (const_int 255 [0xff]))
        (const_int 16 [0x10])))
out of this.
i3 is
(insn 20 19 21 2 (set (reg:SI 108 [ f ])
        (zero_extend:SI (subreg:QI (reg/v:TI 103 [ f ]) 0))) "pr99830.c":8:9
114 {*zero_extendqisi2_aarch64}
     (expr_list:REG_DEAD (reg/v:TI 103 [ f ])
        (nil)))
so, I think it is perfectly fine that when i3 only cares about the low 8 bits
of pseudo 103 that it figures out that it is just the low 8 bits of the
original pseudo 103, not ored with anything else, because (unsigned char)
((whatever & 255) << 16) is 0.  So, I don't see anything wrong on i2 -> i3
combination turning it into
(insn 20 19 21 2 (set (reg:SI 108 [ f ])
        (zero_extend:SI (subreg:QI (reg/v:TI 103 [ f ]) 0))) "pr99830.c":8:9
114 {*zero_extendqisi2_aarch64}
     (nil))
In particular, it is combine_simplify_rtx that is called on:
(zero_extend:SI (subreg:QI (ior:TI (and:TI (reg/v:TI 103 [ f ])
                (const_int -16711681 [0xffffffffff00ffff]))
            (ashift:TI (and:TI (clobber:TI (const_int 0 [0]))
                    (const_int 255 [0xff]))
                (const_int 16 [0x10]))) 0))
which simplifies it into
(and:SI (subreg:SI (reg/v:TI 103 [ f ]) 0)
    (const_int 255 [0xff]))
But, there is also
(debug_insn 18 17 19 2 (var_location:HI c (subreg:HI (ashiftrt:SI
(sign_extend:SI (subreg:HI (reg/v:SI 100 [ c ]) 0))
            (zero_extend:SI (subreg:QI (reg/v:TI 103 [ f ]) 0))) 0))
"pr99830.c":8:5 -1
     (nil))
into which that try_combine propagate_for_debug the (reg/v:TI 103 [ f ])
i2dest and replace it with the i2src mentioned above.
In this case it is similarly used in a (subreg:QI ...) so in theory it could
also optimize into just the low bits of older r103.  Except that
propagate_for_debug uses only simplify-rtx.c APIs and doesn't have
combine_simplify_rtx for it.  But in theory it could also be used in other
contexts in the debug insn too.

Reply via email to