Alex Coplan <alex.cop...@arm.com> writes: > Hi Richard, > > On 29/01/2025 16:44, Richard Sandiford wrote: >> The PR showed two issues with pair-fusion. The first is that the pass >> treated stack pointer deallocations as ordinary register updates, and so >> might move them earlier than another stack access (through a different >> base register) that doesn't alias the pair candidate. >> >> The simplest fix for that seems to be to prevent the stack deallocation >> from being moved. This part might (or might not) be a latent source of >> wrong code and so worth backporting in some form. (The patch as-is >> won't work for GCC 14.) >> >> The second issue only started with r15-6551, which added a memory >> write to stack allocations and deallocations. We should use the >> existing tombstone mechanism to preserve the associated memory >> definition. (Deleting definitions immediately would have quadratic >> complexity in the worst case.) > > Thanks a lot for taking care of this while I was away. I just have a > couple of comments below. > >> >> Tested on aarch64-linux-gnu. OK to install? >> >> Richard >> >> >> gcc/ >> PR rtl-optimization/118429 >> * pair-fusion.cc (latest_hazard_before): Add an extra parameter >> to say whether the instruction is a load or a store. If the >> instruction is not a load or store and has memory side effects, >> prevent it from being moved earlier. >> (pair_fusion::find_trailing_add): Update call accordingly. >> (pair_fusion_bb_info::fuse_pair): If the trailng addition had >> a memory side-effect, use a tombstone to preserve it. >> >> gcc/testsuite/ >> PR rtl-optimization/118429 >> * gcc.c-torture/compile/pr118429.c: New test. >> --- >> gcc/pair-fusion.cc | 45 ++++++++++++++----- >> .../gcc.c-torture/compile/pr118429.c | 7 +++ >> 2 files changed, 40 insertions(+), 12 deletions(-) >> create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr118429.c >> >> diff --git a/gcc/pair-fusion.cc b/gcc/pair-fusion.cc >> index b6643ca4812..602e572ab6c 100644 >> --- a/gcc/pair-fusion.cc >> +++ b/gcc/pair-fusion.cc >> @@ -573,11 +573,13 @@ pair_fusion_bb_info::track_access (insn_info *insn, >> bool load_p, rtx mem) >> // If IGNORE_INSN is non-NULL, we should further ignore any hazards arising >> // from that insn. >> // >> -// N.B. we ignore any defs/uses of memory here as we deal with that >> separately, >> -// making use of alias disambiguation. >> +// IS_LOAD_STORE is true if INSN is one of the loads or stores in the >> +// candidate pair. We ignore any defs/uses of memory in such instructions >> +// as we deal with that separately, making use of alias disambiguation. >> static insn_info * >> latest_hazard_before (insn_info *insn, rtx *ignore, >> - insn_info *ignore_insn = nullptr) >> + insn_info *ignore_insn = nullptr, >> + bool is_load_store = true) >> { >> insn_info *result = nullptr; >> >> @@ -588,6 +590,10 @@ latest_hazard_before (insn_info *insn, rtx *ignore, >> && find_reg_note (insn->rtl (), REG_EH_REGION, NULL_RTX)) >> return insn->prev_nondebug_insn (); >> >> + if (!is_load_store >> + && accesses_include_memory (insn->defs ())) >> + return insn->prev_nondebug_insn (); > > This seems like it might be a little too restrictive. I agree that it's > a nice and simple way of solving the problem, but wouldn't it be enough > to prevent moving such accesses (stack deallocations) above the latest > preceding def or use of mem? Certainly we don't want to start > attempting alias analysis here, but is the above suggestion not a happy > middle ground (between a simple solution and not overly restricting > optimisation)?
Would it help in practice though? Although it is possible to combine a deallocation with preceding stores, that only happens for dead code, in which case the better optimisation is to delete the stores. If we're combining with loads, the loads would normally be restoring registers for the caller, in which case the loads could be moved forward to the deallocation (since nothing would use or clobber the loaded values between the two points). But yeah, I certainly don't object to doing that if you prefer. The patch is in now though (was asked to fix it while you were away), so it might be easier if you change it to a version that you're happier with (since you can self-approve it). Thanks, Richard