Hi!

On Mon, Sep 06, 2021 at 12:32:13PM +0100, Roger Sayle wrote:
> I think the current documentation is sufficient.  During compilation, GCC's
> combine pass will often substitute a register with an expression defining
> it's value, and then attempt to simplify it.  As you point out, this often
> produces
> (temporary intermediate) expressions with poorly defined semantics.

Not "poorly defined".  The documentation says (in rtl.texi)

@findex subreg
@item (subreg:@var{m1} @var{reg:m2} @var{bytenum})

@code{subreg} expressions are used to refer to a register in a machine
mode other than its natural one, or to refer to one register of
a multi-part @code{reg} that actually refers to several registers.

It goes on to say there also are subregs of mem currently; it
exhaustively lists all things you can take a subreg of, what the
different semantics of those are, how the semantics are further
"modified" (read: completely different) if some RTL flags are set, etc.

The instruction combiner very often creates invalid RTL (only
structurally valid, like, no missing operands).  Invalid RTL is supposed
to never match (because backends will not have patterns that match
these).  combine even creates invalid RTL on purpose (a clobber of
const0_rtx) to ensure its result is deemed invalid, when it wants to
abort a combination attempt for some reason.

> The
> purpose of my patch is to avoid (constructing) SUBREGs that have TRUNCATE
> as an argument (that some folks consider undefined) and instead simplify
> them to either a single TRUNCATE or a SUBREG of a REG, both of which are
> well defined.  I'm making/helping the implementation match the
> documentation.

But this should never make it to simplify-rtx at all.  You can only
validly do such transformations in combine, because you need to know
what insns you started with etc.

simplify-rtx is a part of combine, sure, but it is used from other
contexts as well nowadays.  If you can safely do this from combine,
you can do it in combine.

> Trying 10 -> 15:
>    10: r29:QI=trunc(r32:SI)
>       REG_DEAD r32:SI
>    15: r38:HI=r29:QI#0
>       REG_DEAD r29:QI
> Failed to match this instruction:
> (set (reg:HI 38)
>     (subreg:HI (truncate:QI (reg:SI 32)) 0))
> 
> with this patch:
> 
> Trying 10 -> 15:
>    10: r29:QI=trunc(r32:SI)
>       REG_DEAD r32:SI
>    15: r38:HI=r29:QI#0
>       REG_DEAD r29:QI
> Successfully matched this instruction:
> (set (reg:HI 38)
>     (truncate:HI (reg:SI 32)))
> allowing combination of insns 10 and 15
> original costs 4 + 12 = 16
> replacement cost 4
> deferring deletion of insn with uid = 10.

(It appears costs are a bit off on your target?  Or is HImode
especially expensive for some reason?)

There is no subreg of a non-reg in there, so this is fine.

> A few lines earlier in this function, it simplifies SUBREGs of CONST_INT,
> which
> would also have poorly defined semantics.  Perhaps your interpretation of
> the
> RTL documentation doesn't strictly apply to this part of the RTL optimizers?

Perhaps this is a bug.  It is if it actually allows subregs of ints as
input.


Segher

Reply via email to