Hi all,

It is true that regcprop currently does not propagate sp and hence
leela is not optimized, but from what I see this should be something
we can address.

The reason that the propagation fails is this check that I added when
I introduced maybe_copy_reg_attrs:

  else if (REG_POINTER (new_reg) != REG_POINTER (old_reg))
    {
      /* Only a single instance of STACK_POINTER_RTX must exist and we cannot
         modify it.  Allow propagation if REG_POINTER for OLD_REG matches and
         don't touch ORIGINAL_REGNO and REG_ATTRS.  */
      return NULL_RTX;
    }

To be honest, I added this back then just to be on the safe side,
since I wasn't sure whether a mismatch in REG_POINTER after
propagation would be an issue (the original regcprop had caused
enough problems).

I see two ways to solve this and make fmo able to optimize leela as
well:

1) Remove the REG_POINTER check in regcprop if that is safe. My
   understanding is that REG_POINTER is used as a hint and there would
   be no correctness issues.

2) Mark the corresponding registers with REG_POINTER. I'm not sure
   where that is supposed to happen.

Since the instructions look like this:

  (insn 113 11 16 2 (set (reg:DI 15 a5 [226])
          (reg/f:DI 2 sp)) 179 {*movdi_64bit}
       (nil))

I assume that we'd want to mark a5 as REG_POINTER anyway (which it
currently is not), and in that case propagation would work.

On the other hand, if there's no correctness issue w.r.t. REG_POINTER
and regcprop, then removing the additional check would increase
propagation opportunities in general, which is also good.

Thanks,
Manolis

On Wed, Aug 2, 2023 at 2:52 AM Jeff Law <jeffreya...@gmail.com> wrote:
>
>
> On 8/1/23 17:38, Vineet Gupta wrote:
> >>
> >> Also note that getting FP out of the shift-add sequences is the other
> >> key goal of Jivan's work.  FP elimination always results in a
> >> spill/reload if we have a shift-add insn where one operand is FP.
> >
> > Hmm, are you saying it should NOT be generating shift-add with SP as
> > src, because currently thats exactly what fold FP offset *is* doing and
> > is the reason it has 5 less insns.
> We should not have shift-add with FP as a source prior to register
> allocation because it will almost always generate spill code.
>
>
> jeff
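
P.S. To make the two options above concrete, here is a small standalone
toy model in C of the REG_POINTER check. It is only a sketch and not
GCC code: struct toy_reg, toy_try_propagate and the
require_pointer_match flag are all made-up names for illustration, and
the is_pointer field merely stands in for REG_POINTER.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a hard register; this is NOT GCC's rtx.  */
struct toy_reg
{
  unsigned regno;   /* hard register number, e.g. 2 for sp, 15 for a5 */
  bool is_pointer;  /* models the REG_POINTER flag */
};

/* Decide whether OLD_REG may be replaced by NEW_REG.
   Returns NEW_REG when propagation is allowed and NULL when it is
   refused.  REQUIRE_POINTER_MATCH is a hypothetical switch: true
   models the current check, false models option 1 (check removed).  */
const struct toy_reg *
toy_try_propagate (const struct toy_reg *old_reg,
                   const struct toy_reg *new_reg,
                   bool require_pointer_match)
{
  /* Mirrors the shape of the check quoted above: refuse propagation
     when the pointer flags of the two registers disagree.  */
  if (require_pointer_match
      && new_reg->is_pointer != old_reg->is_pointer)
    return NULL;

  return new_reg;
}
```

With sp modelled as { 2, true } and a5 as { 15, false }, the call
toy_try_propagate (&a5, &sp, true) is refused, which corresponds to
the situation today. Passing false for the last argument (option 1),
or setting a5.is_pointer to true first (option 2), lets the
propagation through.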