Iain Sandoe <i...@sandoe.co.uk> writes: > Hi Richard, > > (hopefully, my take won’t cloud the issue ….) > >> On 14 Feb 2022, at 16:00, Richard Sandiford via Gcc-patches >> <gcc-patches@gcc.gnu.org> wrote: >> >> Hi Vlad, >> >> Vladimir Makarov via Gcc-patches <gcc-patches@gcc.gnu.org> writes: >>> On 2022-02-14 04:44, Richard Sandiford via Gcc-patches wrote: >>>> Iain Sandoe via Gcc-patches <gcc-patches@gcc.gnu.org> writes: >>>>> Two issues resulted in this PR, which manifests when we force a constant >>>>> into >>>>> memory in LRA (in PIC code on Darwin). The presence of such forced >>>>> constants >>>>> is quite dependent on other RTL optimisations, and it is easy for the >>>>> issue to >>>>> become latent for a specific case. >>>>> >>>>> First, in the Darwin-specific rs6000 backend code, we were not being >>>>> careful >>>>> enough in rejecting invalid symbolic addresses. Specifically, when >>>>> generating >>>>> PIC code, we require a SYMBOL_REF to be wrapped in an >>>>> UNSPEC_MACHOPIC_OFFSET. >>>>> >>>>> Second, LRA was attempting to load a register using an invalid lo_sum >>>>> address. >>>>> >>>>> The LRA changes are approved in the PR by Vladimir, and the RS6000 >>>>> changes are >>>>> Darwin-specific (although, of course, any observations are welcome). >>>>> >>>>> Tested on several lo_sum targets and x86_64 all languages except as noted: >>>>> powerpc64-linux (m32/m64) -D >>>>> powerpc64le-linux -D >>>>> powerpc64-aix -Ada -Go -D >>>>> aarch64-linux -Ada -D >>>>> x86_64-linux all langs -D >>>>> powerpc-darwin9 (master and 11.2) -D -Go. >>>>> >>>>> pushed to master, thanks, >>>>> Iain >>>>> >>>>> Signed-off-by: Iain Sandoe <i...@sandoe.co.uk> >>>>> Co-authored-by: Vladimir Makarov <vmaka...@redhat.com> >>>>> >>>>> PR target/104117 >>>>> >>>>> gcc/ChangeLog: >>>>> >>>>> * config/rs6000/rs6000.cc (darwin_rs6000_legitimate_lo_sum_const_p): >>>>> Check for UNSPEC_MACHOPIC_OFFSET wrappers on symbolic addresses when >>>>> emitting PIC code. >>>>> (legitimate_lo_sum_address_p): Likewise. >>>>> * lra-constraints.cc (process_address_1): Do not attempt to emit a reg >>>>> load from an invalid lo_sum address. >>>>> --- >>>>> gcc/config/rs6000/rs6000.cc | 38 +++++++++++++++++++++++++++++++++++-- >>>>> gcc/lra-constraints.cc | 17 ++--------------- >>>>> 2 files changed, 38 insertions(+), 17 deletions(-) >>>>> >>>>> […] >>>>> diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc >>>>> index fdff9e0720a..c700c3f4578 100644 >>>>> --- a/gcc/lra-constraints.cc >>>>> +++ b/gcc/lra-constraints.cc >>>>> @@ -3625,21 +3625,8 @@ process_address_1 (int nop, bool check_only_p, >>>>> *ad.inner = gen_rtx_LO_SUM (Pmode, new_reg, addr); >>>>> if (!valid_address_p (op, &ad, cn)) >>>>> { >>>>> - /* Try to put lo_sum into register. */ >>>>> - insn = emit_insn (gen_rtx_SET >>>>> - (new_reg, >>>>> - gen_rtx_LO_SUM (Pmode, new_reg, >>>>> addr))); >>>>> - code = recog_memoized (insn); >>>>> - if (code >= 0) >>>>> - { >>>>> - *ad.inner = new_reg; >>>>> - if (!valid_address_p (op, &ad, cn)) >>>>> - { >>>>> - *ad.inner = addr; >>>>> - code = -1; >>>>> - } >>>>> - } >>>>> - >>>>> + *ad.inner = addr; /* Punt. */ >>>>> + code = -1; >>>>> } >>>>> } >>>>> if (code < 0) >>>> Could you go into more details about this? Why is it OK to continue >>>> to try: >>>> >>>> (lo_sum new_reg addr) >>>> >>>> directly as an address (the context at the top of the hunk), but not try >>>> moving the lo_sum into a register? They should be semantically equivalent, >>>> so it seems that if one is wrong, the other would be too. >>>> >>> Hi, Richard. Change LRA is mine and I approved it for Iain's patch. >>> >>> I think there is no need for this code and it is misleading. If >>> 'mem[low_sum]' does not work, I don't think that 'reg=low_sum;mem[reg]' >>> will help for any existing target. As machine-dependent code for any >>> target most probably (for ppc64 darwin it is exactly the case) checks >>> address only in memory, it can wrongly accept wrong address by reloading >>> it into reg and use it in memory. So these are my arguments for the >>> remove this code from process_address_1. >> >> I'm probably making too much of this, but: >> >> I think the code is potentially useful in that existing targets do forbid >> forbid lo_sum addresses in certain contexts (due to limited offset range) >> while still wanting lo_sum to be used to be load the address. If we >> handle the high/lo_sum split in generic code then we have more chance >> of being able to optimise things. So it feels like this is setting an >> unfortunate precedent. >> >> I still don't understand what went wrong before though (the PR trail >> was a bit too long to process :-)). Is there a case where >> (lo_sum (high X) X) != X? If so, that seems like a target bug to me. > > If X is an invalid address (in this case for PIC code a SYMBOL_REF is not > valid without an UNSPEC wrapper) - then (high X) and (lo_sum (high X) X) > are also invalid. AFAICT the target never (this late in the process) gets > the opportunity to apply a transform to validate the address. > >> Or does the target accept (set R1 (lo_sum R2 X)) for an X that cannot >> be split into a HIGH/LO_SUM pair? I'd argue that's a target bug too. > > once again if X is plain invalid - loading it into a register or splitting it > does not solve that - we’d need to call the target’s address legitimizer. > > Of course, if we conclude that it is a target bug, I’ll try to fix it up .
But if the target treats: (set R2 (high X)) (set R1 (lo_sum R2 X)) as an invalid way of setting R1 to X (e.g. because X is plain invalid), then recog should fail on at least one instruction, preferably both. So yeah, this does feel like a target bug to me. We shouldn't assume that lo_sum/high splits will only be generated by the target. Target-independent code can generate them “spontaneously”, provided that it validates the result. (Combine does this too, for example.) It looks like the LRA code might have been missing a HIGH in the fallback case, but it doesn't sound like that was the bug that was being fixed. Thanks, Richard