Richard Biener <richard.guent...@gmail.com> writes:
> On Wed, Dec 6, 2023 at 7:44 PM Philipp Tomsich <philipp.toms...@vrull.eu> wrote:
>>
>> On Wed, 6 Dec 2023 at 23:32, Richard Biener <richard.guent...@gmail.com> wrote:
>> >
>> > On Wed, Dec 6, 2023 at 2:48 PM Manos Anagnostakis
>> > <manos.anagnosta...@vrull.eu> wrote:
>> > >
>> > > This is an RTL pass that detects store forwarding from stores to larger
>> > > loads (load pairs).
>> > >
>> > > This optimization is SPEC2017-driven and was found to be beneficial for
>> > > some benchmarks, through testing on ampere1/ampere1a machines.
>> > >
>> > > For example, it can transform cases like
>> > >
>> > > str  d5, [sp, #320]
>> > > fmul d5, d31, d29
>> > > ldp  d31, d17, [sp, #312] # Large load from small store
>> > >
>> > > to
>> > >
>> > > str  d5, [sp, #320]
>> > > fmul d5, d31, d29
>> > > ldr  d31, [sp, #312]
>> > > ldr  d17, [sp, #320]
>> > >
>> > > Currently, the pass is disabled by default on all architectures and
>> > > enabled by a target-specific option.
>> > >
>> > > If deemed beneficial enough for a default, it will be enabled on
>> > > ampere1/ampere1a, or other architectures as well, without needing to
>> > > be turned on by this option.
>> >
>> > What is aarch64-specific about the pass?
>> >
>> > I see an increasingly large number of target specific passes pop up
>> > (probably for the excuse we can generalize them if necessary).  But GCC
>> > isn't LLVM and this feels like getting out of hand?
>>
>> We had an OK from Richard Sandiford on the earlier (v5) version with
>> v6 just fixing an obvious bug... so I was about to merge this earlier
>> just when you commented.
>>
>> Given that this had months of test exposure on our end, I would prefer
>> to move this forward for GCC14 in its current form.
>> The project of replacing architecture-specific store-forwarding passes
>> with a generalized infrastructure could then be addressed in the GCC15
>> timeframe (or beyond)?
>
> It's up to target maintainers, I just picked this pass (randomly) to make this
> comment (of course also knowing that STLF fails are a common issue on
> pipelined uarchs).

I agree there's scope for making some of this target-independent.
One vague thing I've been wondering about is whether, for some passes
like these, we should use inheritance rather than target hooks.  So in
this case, the target-independent code would provide a framework for
iterating over the function and testing for forwarding, but the target
would ultimately decide what to do with that information.  This would
also make it easier for targets to add genuinely target-specific
information to the bookkeeping structures.

In case it sounds otherwise, that's supposed to be more than just
a structural C++-vs-C thing.  The idea is that we'd have a pass for
"resolving store forwarding-related problems", but the specific goals
would be mostly (or at least partially) target-specific rather than
target-independent.

I'd wondered the same thing about the early-ra pass that we're
adding for SME.  Some of the framework could be generalised and
made target-independent, but the main purpose of the pass (using
strided registers with certain patterns and constraints) is highly
target-specific.

Thanks,
Richard
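
For illustration only, the inheritance idea described above could be sketched roughly as follows. This is not code from the patch or from GCC: MemAccess, StoreForwardingPass, SplitLoadPairPass, window() and handle_candidate() are all hypothetical names, and memory accesses are modelled as plain offsets and sizes rather than RTL. The base class plays the role of the target-independent framework (iterate, test for a small store followed by a wider overlapping load), and the subclass plays the role of the target, which decides what to do with each candidate.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a memory access; not a GCC type.
struct MemAccess {
    bool is_store;
    long offset;  // byte offset from a common base register
    long size;    // access width in bytes
};

// Target-independent framework (hypothetical): walks the accesses and
// detects a store followed by a wider load that overlaps it.
class StoreForwardingPass {
public:
    virtual ~StoreForwardingPass() = default;

    // Returns the number of candidates the target chose to act on.
    int run(const std::vector<MemAccess>& insns) {
        int handled = 0;
        for (std::size_t i = 0; i < insns.size(); ++i) {
            if (!insns[i].is_store) continue;
            // Look ahead a target-chosen number of accesses.
            for (std::size_t j = i + 1;
                 j < insns.size() && j <= i + window(); ++j) {
                if (insns[j].is_store) continue;
                if (is_wider_overlapping_load(insns[i], insns[j])
                    && handle_candidate(insns[i], insns[j]))
                    ++handled;
            }
        }
        return handled;
    }

protected:
    // The target decides how far to look and what to do with a candidate.
    virtual std::size_t window() const { return 4; }
    virtual bool handle_candidate(const MemAccess& store,
                                  const MemAccess& load) = 0;

private:
    static bool is_wider_overlapping_load(const MemAccess& st,
                                          const MemAccess& ld) {
        assert(st.size > 0 && ld.size > 0);
        bool overlap = st.offset < ld.offset + ld.size
                       && ld.offset < st.offset + st.size;
        return overlap && ld.size > st.size;
    }
};

// Hypothetical target subclass: only acts on 16-byte loads (standing in
// for a load pair) and records that the load should be split.
class SplitLoadPairPass : public StoreForwardingPass {
public:
    std::vector<std::string> log;  // target-specific bookkeeping

protected:
    bool handle_candidate(const MemAccess&, const MemAccess& load) override {
        if (load.size != 16) return false;
        log.push_back("split load at offset " + std::to_string(load.offset));
        return true;
    }
};
```

Feeding this the str/ldp pair from the quoted example (an 8-byte store at offset 320 followed by a 16-byte load at offset 312) makes the base class report one candidate, and the subclass records a split. A different target could subclass the same framework and, say, only collect statistics or reorder the accesses, which is the kind of target-specific goal a fixed set of hooks would express less naturally.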