On Tue, Oct 10, 2023 at 5:15 PM Jeff Law <jeffreya...@gmail.com> wrote: > > > > On 10/10/23 05:59, Manolis Tsamis wrote: > > >> It's a code quality issue as long as we don't transform the code into > >> movl $0, -18874240, at which point it would become a correctness issue. > >> > > Ok, thanks for pointing that out as I thought that movl $0, -18874240 > > and movl $0, -18874240(eax) with eax == 0 were the same. > It's a quirk of x32 related to sign extension of the address. Certainly > not obvious! But it's better than the PA where: > > (mem (plus (reg A) (reg B))) > > Is semantically different than > > (mem (plus (reg B) (reg A))) > > > Due to the implicit segment register selection that happens on the high > bits of the base register rather than the high bits of the effective > address. > I see, thanks for the additional info
> > > > >> > >>> > >>> With regards to the "recognize that the base register is 0", that > >>> would be nice but how would we recognise that? f-m-o can only > >>> calculate the folded offset but that is not enough to prove that the > >>> base register is zero or not. > >> It's a chain of insns that produce an address and use it in the memory > >> reference. We essentially changed the first insn in the chain from movl > >> -18874240, %eax into movl 0, %eax. So we'd have to someone note that > >> the base register in the memory reference has the value zero in the > >> chain of instructions. That may or may not be reasonable to do. > >> > > Ok, do you believe it would be worthwhile to add a REG_EQ note or > > something similar? I assume some REG_EQ notes will be added from the > > following cprop-hardreg pass too? > I think it really depends on the degree of difficulty and whether or not > we think there are other cases like this one lurking. > > Given this scenario can only happen with an absolute address, I would > probably explore if there's a way to trivially detect this case, but I > wouldn't spend a lot of time on it. > > And I wasn't necessarily thinking of a note in the RTL, just a bit of > state within f-m-o when updating an arithmetic chain so that we could > determine that the final instruction was a memory reference to an > absolute address. > Ok, although I didn't include this in v7, I'll do some exploration for a possible future change. > > > > > > >> You could use the costing model to cost the entire sequence > >> before/after. There's an interface to walk a sequence and return a > >> cost. In the case of f-m-o the insns are part of the larger chain, so > >> we'd need a different API. > >> > >> The other option would be to declare this is known, but not important > >> issue. I would think that it's going to be rare to have absolute > >> addresses and x32 isn't really used much. The combination of the two > >> should be exceedingly rare. Hence my willingness to use > >> -fno-fold-mem-offsets in the test. > >> > > Yes, I now better understand why you suggested adding > > -fno-fold-mem-offsets to the testcase. Or we could also make the test > > not fail in this case where the memory access has the base register, > > as there's no correctness issue. > :) > > > > > > Then, going back to the code quality regression, wouldn't things be > > much better if we would emit xor eax, eax instead of mov eax, 0? In > > that case xor eax, eax should be 2 bytes instead of 5 for mov eax, 0 > > and hence the code size difference should be trivial. Then we can keep > > f-m-o as is, including this testcase. > The selection of xor vs mov imm is handled elsewhere. On some uarchs > the former is preferred, on others the latter. > > If you think about xor %eax,%eax as an example. Unless you special case > that scenario in your cpu front-end, the xor will have a data dependency > on any prior set of %eax which inhibits ILP. > > > > > > Is there a way to emit a taret-specific optimized register zero insn? > > If so I'll use that and adjust the testcase as needed, and I think > > we're done with this one. > I can live with just adjusting the testcase. Let's go with that. > Done! Sent as v7 together with the recognised -> recognized change and INSN_CODE(insn) = -1 for the bug you reported. Thanks, Manolis > jeff