On Tue, Sep 27, 2022 at 9:54 PM Jeff Law <j...@ventanamicro.com> wrote:
>
>
> This is another minor improvement to coremark.  I suspect this only
> improves code size, as the load-immediate was likely issuing with the ret
> instruction on multi-issue machines.
>
>
> Basically we're failing to utilize conditional equivalences during the
> post-reload CSE pass.  So if a particular block is only reached when a
> certain condition holds (say for example a4 == 0) and the block has an
> assignment like a4 = 0, we would fail to eliminate the unnecessary
> assignment.
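
As a rough illustration of the effect described above, here is a toy model in plain C. It is not GCC's cselib or RTL: the toy_* names, the register-file size and the "reg = imm" instruction shape are all assumptions made for this sketch. A table of known register values is seeded with the condition implied by the incoming edge, and any set that stores a value the register already holds is marked as deletable.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NUM_REGS 32   /* Hypothetical register-file size.  */

/* Toy instruction "reg = imm".  Not GCC's rtx representation.  */
struct toy_set
{
  int reg;
  long imm;
  bool deleted;
};

/* Constant known to be held by each register, if any.  */
struct toy_state
{
  bool known[TOY_NUM_REGS];
  long value[TOY_NUM_REGS];
};

/* Walk the block in order.  A set whose destination is already known
   to hold the stored constant is marked deleted; any other set
   updates the table.  Returns the number of sets marked deleted.  */
static size_t
toy_eliminate_redundant_sets (struct toy_state *st,
                              struct toy_set *insns, size_t n)
{
  size_t removed = 0;
  for (size_t i = 0; i < n; i++)
    {
      int r = insns[i].reg;
      if (st->known[r] && st->value[r] == insns[i].imm)
        {
          insns[i].deleted = true;
          removed++;
        }
      else
        {
          st->known[r] = true;
          st->value[r] = insns[i].imm;
        }
    }
  return removed;
}
```

To model a block reached only when a4 == 0, seed the state with known[14] = true and value[14] = 0 (a4 is x14 on RISC-V) before the walk; a leading "a4 = 0" set is then marked deleted, while a set to an unrelated register is kept.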

conditional equivalences on RTL - ick ;)

I'm not familiar with RTL pattern matching so somebody else has to
comment on that, but

+                     /* If this is not the first time through, then
+                        verify the source and destination match.  */
+                     else if (dest == XEXP (cond, 0) && src == XEXP (cond, 1))
+                       ;

Shouldn't you restrict dest/src somehow?  Either might be a MEM?
The way you create the fake insn suggests only REG_P dests are OK
(so not SUBREGs, for example?).
Should you use rtx_equal_p?  Not using it possibly exempts MEMs,
but being more explicit would be nice.  Should you also restrict this to
MODE_INT compares?
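
To make the questions above concrete, here is a toy sketch in plain C of what a more explicit check could look like. It does not use GCC's rtx type or macros: toy_rtx, toy_equal_p and toy_same_equiv_p are hypothetical stand-ins, with toy_equal_p playing the role of a structural comparison in the spirit of rtx_equal_p. The point is only that two distinct objects describing the same register compare equal structurally but not by pointer, and that the destination is explicitly limited to integer-mode registers.

```c
#include <stdbool.h>

/* Toy expression kinds, loosely mirroring a few RTL codes.  */
enum toy_code { TOY_REG, TOY_SUBREG, TOY_MEM, TOY_CONST_INT };
enum toy_mode_class { TOY_MODE_INT, TOY_MODE_FLOAT };

struct toy_rtx
{
  enum toy_code code;
  enum toy_mode_class mclass;
  long value;   /* Register number, constant value, or address.  */
};

/* Structural equality: distinct objects compare equal when their
   contents match, unlike a plain pointer comparison.  */
static bool
toy_equal_p (const struct toy_rtx *a, const struct toy_rtx *b)
{
  return a == b
         || (a->code == b->code
             && a->mclass == b->mclass
             && a->value == b->value);
}

/* Does the equivalence (dest, src) implied by this edge match the one
   recorded from an earlier edge?  Destinations that are not
   integer-mode registers (MEM, SUBREG, float modes) are rejected
   up front.  */
static bool
toy_same_equiv_p (const struct toy_rtx *dest, const struct toy_rtx *src,
                  const struct toy_rtx *seen_dest,
                  const struct toy_rtx *seen_src)
{
  if (dest->code != TOY_REG || dest->mclass != TOY_MODE_INT)
    return false;
  return toy_equal_p (dest, seen_dest) && toy_equal_p (src, seen_src);
}
```

With a pointer comparison, two separately allocated descriptions of the same register would not match, which is part of why the choice between == and a structural comparison is worth stating explicitly.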

Richard.

>
> So the way this works, as we enter each block in reload_cse_regs_1 we
> look at the block's predecessors to see if all of them have the same
> implicit assignment.  If they do, then we create a dummy insn
> representing that implicit assignment.
>
>
> Before processing the first real insn, we enter the implicit assignment
> into the cselib hash tables.  This deferred action is necessary
> because of CODE_LABEL handling in cselib -- when it sees a CODE_LABEL it
> wipes state.  So we have to add the implicit assignment after processing
> the (optional) CODE_LABEL, but before processing real insns.
>
>
> Note we have to walk all the block's predecessors to verify they all
> have the same implicit assignment.  That could potentially be expensive,
> so we limit it to cases where there are only a few predecessors.  For
> reference on x86_64, 81% of the cases where implicit assignments can be
> found are for single predecessor blocks.  96% have two or fewer preds,
> 99.1% have three or fewer, 99.6% four or fewer, 99.8% five or fewer,
> and so on.  While there were cases where all 19 preds had the same
> implicit assignment, capturing those cases just doesn't seem terribly
> important.  I put the clamp at 3 preds.  If folks think it's important,
> I could certainly make that a PARAM.
>
>
> Bootstrapped and regression tested on x86.  Bootstrapped on riscv as well.
>
>
> OK for the trunk?
>
>
> Jeff
>
>
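
The predecessor walk and the clamp described in the quoted text can be modelled roughly as follows. This is a toy sketch in plain C, not the patch itself: toy_equiv and toy_common_equiv are hypothetical names, each incoming edge is reduced to a single "reg == value" fact, and the limit of 3 is read as "at most three predecessors", which is an assumption about how the clamp is applied. The deferred insertion after an optional CODE_LABEL is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PREDS 3   /* Clamp from the description; assumed inclusive.  */

/* Equivalence implied by taking one incoming edge: "reg == value".
   valid is false when the edge implies nothing usable.  */
struct toy_equiv
{
  bool valid;
  int reg;
  long value;
};

/* Return true and fill *out when every one of the N incoming edges
   implies the same equivalence.  Blocks with no predecessors, or with
   more than TOY_MAX_PREDS, are skipped without looking at the edges.  */
static bool
toy_common_equiv (const struct toy_equiv *edges, size_t n,
                  struct toy_equiv *out)
{
  if (n == 0 || n > TOY_MAX_PREDS)
    return false;

  for (size_t i = 0; i < n; i++)
    {
      if (!edges[i].valid)
        return false;
      /* Later edges must agree with the first one.  */
      if (i > 0
          && (edges[i].reg != edges[0].reg
              || edges[i].value != edges[0].value))
        return false;
    }

  *out = edges[0];
  return true;
}
```

Checking the predecessor count before the loop keeps the cost bounded per block, which is the motivation given for the clamp; turning TOY_MAX_PREDS into a tunable would correspond to the PARAM mentioned in the description.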
