Richard Sandiford <richard.sandif...@arm.com> writes: > Richard Biener <rguent...@suse.de> writes: >> On Fri, 4 Apr 2025, Richard Sandiford wrote: >> >>> Another problem in PR101523 was that, after each successful 2->2 >>> combination attempt, distribute_links would search further and further >>> for the next combinable use of the i2 destination. Each search would >>> start at i2 itself, making the search quadratic in the worst case. >>> >>> In a 2->2 combination, if i2 is unchanged, the search can start at i3 >>> instead of i2. The same thing applies to i2 when distributing i2's >>> links, since the only changes to earlier instructions are the deletion >>> of i0 and i1. >>> >>> This change, combined with the previous split_i2i3 patch, gives a >>> 34.6% speedup in combine for the testcase in PR101523. Combine >>> goes from being 41% to 34% of compile time. >> >> From my analysis this patch is OK (I do wonder why not always >> starting from iN when distributing links for iN, but I guess >> that combining into parallel & splitting can actually move >> link sources/destinations in odd ways > > Right. In particular, if we split part of i3 into i2, some log > links can move up from i3 to i2. I think it would be safe to > pass iN unconditionally and then use LUIDs to check whether the
...I meant pass i2 unconditionally for i3links, on the basis that we certainly don't need to search before i2. > passed-in instruction is before or after the definition, picking > the later of the two. But that felt like too much of a rabiit hole. > > Richard > > >> - I don't claim I fully understand this). >> >> Richard. >> >>> gcc/ >>> PR testsuite/116398 >>> * combine.cc (distribute_links): Take an optional start point. >>> (try_combine): If only i3 has changed, only distribute i3's links, >>> not i2's. Start the search for the new use from i3 rather than >>> from the definition instruction. Likewise start the search for >>> the new use from i2 when distributing i2's links. >>> --- >>> gcc/combine.cc | 27 +++++++++++++++++++-------- >>> 1 file changed, 19 insertions(+), 8 deletions(-) >>> >>> diff --git a/gcc/combine.cc b/gcc/combine.cc >>> index 0664f8dd433..9eae1a0111e 100644 >>> --- a/gcc/combine.cc >>> +++ b/gcc/combine.cc >>> @@ -472,7 +472,7 @@ static void move_deaths (rtx, rtx, int, rtx_insn *, rtx >>> *); >>> static bool reg_bitfield_target_p (rtx, rtx); >>> static void distribute_notes (rtx, rtx_insn *, rtx_insn *, rtx_insn *, >>> rtx, rtx, rtx); >>> -static void distribute_links (struct insn_link *); >>> +static void distribute_links (struct insn_link *, rtx_insn * = nullptr); >>> static void mark_used_regs_combine (rtx); >>> static void record_promoted_value (rtx_insn *, rtx); >>> static bool unmentioned_reg_p (rtx, rtx); >>> @@ -4590,10 +4590,15 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn >>> *i1, rtx_insn *i0, >>> NULL_RTX, NULL_RTX, NULL_RTX); >>> } >>> >>> - distribute_links (i3links); >>> - distribute_links (i2links); >>> - distribute_links (i1links); >>> - distribute_links (i0links); >>> + if (only_i3_changed) >>> + distribute_links (i3links, i3); >>> + else >>> + { >>> + distribute_links (i3links); >>> + distribute_links (i2links, i2); >>> + distribute_links (i1links); >>> + distribute_links (i0links); >>> + } >>> >>> if (REG_P (i2dest)) >>> { >>> @@ -14984,10 +14989,13 @@ distribute_notes (rtx notes, rtx_insn *from_insn, >>> rtx_insn *i3, rtx_insn *i2, >>> >>> /* Similarly to above, distribute the LOG_LINKS that used to be present on >>> I3, I2, and I1 to new locations. This is also called to add a link >>> - pointing at I3 when I3's destination is changed. */ >>> + pointing at I3 when I3's destination is changed. >>> + >>> + If START is nonnull and an insn, we know that the next location for each >>> + link is no earlier than START. */ >>> >>> static void >>> -distribute_links (struct insn_link *links) >>> +distribute_links (struct insn_link *links, rtx_insn *start) >>> { >>> struct insn_link *link, *next_link; >>> >>> @@ -15053,7 +15061,10 @@ distribute_links (struct insn_link *links) >>> I3 to I2. Also note that not much searching is typically done here >>> since most links don't point very far away. */ >>> >>> - for (insn = NEXT_INSN (link->insn); >>> + insn = start; >>> + if (!insn || NOTE_P (insn)) >>> + insn = NEXT_INSN (link->insn); >>> + for (; >>> (insn && (this_basic_block->next_bb == EXIT_BLOCK_PTR_FOR_FN (cfun) >>> || BB_HEAD (this_basic_block->next_bb) != insn)); >>> insn = NEXT_INSN (insn)) >>>