On Tue, Sep 27, 2022 at 9:54 PM Jeff Law <j...@ventanamicro.com> wrote:
>
> This is another minor improvement to coremark.  I suspect this only
> improves code size as the load-immediate was likely issuing with the ret
> statement on multi-issue machines.
>
> Basically we're failing to utilize conditional equivalences during the
> post-reload CSE pass.  So if a particular block is only reached when a
> certain condition holds (say for example a4 == 0) and the block has an
> assignment like a4 = 0, we would fail to eliminate the unnecessary
> assignment.

conditional equivalences on RTL - ick ;)

I'm not familiar with RTL pattern matching so somebody else has to
comment on that, but

+      /* If this is not the first time through, then
+         verify the source and destination match.  */
+      else if (dest == XEXP (cond, 0) && src == XEXP (cond, 1))
+        ;

shouldn't you restrict dest/src somehow?  It might be a MEM?
The way you create the fake insn suggests only REG_P dest are OK
(not SUBREGs for example?)?
Should you use rtx_equal_p (not using that possibly exempts MEM,
but being more explicit would be nice).

Should you restrict this to MODE_INT compares?

Richard.

>
> So the way this works, as we enter each block in reload_cse_regs_1 we
> look at the block's predecessors to see if all of them have the same
> implicit assignment.  If they do, then we create a dummy insn
> representing that implicit assignment.
>
> Before processing the first real insn, we enter the implicit assignment
> into the cselib hash tables.  This deferred action is necessary
> because of CODE_LABEL handling in cselib -- when it sees a CODE_LABEL it
> wipes state.  So we have to add the implicit assignment after processing
> the (optional) CODE_LABEL, but before processing real insns.
>
> Note we have to walk all the block's predecessors to verify they all
> have the same implicit assignment.  That could potentially be expensive,
> so we limit it to cases where there are only a few predecessors.  For
> reference on x86_64, 81% of the cases where implicit assignments can be
> found are for single predecessor blocks.  96% have two preds, 99.1% have
> 3 preds, 99.6% have 4 preds, 99.8% have 5 preds and so-on.  While there
> were cases where all 19 preds had the same implicit assignment capturing
> those cases just doesn't seem terribly important.  I put the clamp at 3
> preds.  If folks think it's important, I could certainly make that a
> PARAM.
>
> Bootstrapped and regression tested on x86.  Bootstrapped on riscv as well.
>
> OK for the trunk?
>
> Jeff
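
For readers who don't know the pass, here is a rough standalone model of the predecessor walk described in the quoted mail. It is only a sketch and not the patch's code: `struct edge_equiv`, `common_implicit_set` and `MAX_PREDS` are hypothetical names, and plain integers stand in for RTL registers and constants. The only detail taken from the mail is the clamp of 3 predecessors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for what an incoming edge implies: when control
   arrives over this edge, "reg == value" is known to hold.  */
struct edge_equiv
{
  bool valid;   /* false if the edge implies no equality at all */
  int reg;      /* register number, e.g. 4 for a4 (illustrative only) */
  long value;   /* constant the register is known to equal */
};

/* Clamp from the mail: give up on blocks with more predecessors.  */
#define MAX_PREDS 3

/* Return true and fill *OUT if every one of the N predecessor edges
   implies the same "reg == value" equivalence.  */
bool
common_implicit_set (const struct edge_equiv *preds, size_t n,
                     struct edge_equiv *out)
{
  assert (preds != NULL || n == 0);

  /* No predecessors, or too many to be worth walking.  */
  if (n == 0 || n > MAX_PREDS)
    return false;

  for (size_t i = 0; i < n; i++)
    {
      /* An edge with no implied equality defeats the whole block.  */
      if (!preds[i].valid)
        return false;

      /* Every edge must agree with the first one.  */
      if (preds[i].reg != preds[0].reg
          || preds[i].value != preds[0].value)
        return false;
    }

  *out = preds[0];
  return true;
}
```

In this model, a block whose two predecessors both imply a4 == 0 yields one implicit set, while a block with four predecessors is skipped even if they all agree. The questions raised above (MEM operands, SUBREGs, pointer equality versus rtx_equal_p) do not arise here because the operands are plain integers.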