Richard Biener <richard.guent...@gmail.com> writes:
> On Wed, Dec 6, 2023 at 7:44 PM Philipp Tomsich <philipp.toms...@vrull.eu> wrote:
>>
>> On Wed, 6 Dec 2023 at 23:32, Richard Biener <richard.guent...@gmail.com> wrote:
>> >
>> > On Wed, Dec 6, 2023 at 2:48 PM Manos Anagnostakis
>> > <manos.anagnosta...@vrull.eu> wrote:
>> > >
>> > > This is an RTL pass that detects store forwarding from stores to
>> > > larger loads (load pairs).
>> > >
>> > > This optimization is SPEC2017-driven and was found to be beneficial
>> > > for some benchmarks, through testing on ampere1/ampere1a machines.
>> > >
>> > > For example, it can transform cases like
>> > >
>> > > str  d5, [sp, #320]
>> > > fmul d5, d31, d29
>> > > ldp  d31, d17, [sp, #312] # Large load from small store
>> > >
>> > > to
>> > >
>> > > str  d5, [sp, #320]
>> > > fmul d5, d31, d29
>> > > ldr  d31, [sp, #312]
>> > > ldr  d17, [sp, #320]
>> > >
>> > > Currently, the pass is disabled by default on all architectures and
>> > > enabled by a target-specific option.
>> > >
>> > > If deemed beneficial enough for a default, it will be enabled on
>> > > ampere1/ampere1a, or other architectures as well, without needing to
>> > > be turned on by this option.
>> >
>> > What is aarch64-specific about the pass?
>> >
>> > I see an increasingly large number of target specific passes pop up
>> > (probably for the excuse we can generalize them if necessary).  But GCC
>> > isn't LLVM and this feels like getting out of hand?
>>
>> We had an OK from Richard Sandiford on the earlier (v5) version with
>> v6 just fixing an obvious bug... so I was about to merge this earlier
>> just when you commented.
>>
>> Given that this had months of test exposure on our end, I would prefer
>> to move this forward for GCC14 in its current form.
>> The project of replacing architecture-specific store-forwarding passes
>> with a generalized infrastructure could then be addressed in the GCC15
>> timeframe (or beyond)?
>
> It's up to target maintainers, I just picked this pass (randomly) to make this
> comment (of course also knowing that STLF fails are a common issue on
> pipelined uarchs).

I agree there's scope for making some of this target-independent.

One vague thing I've been wondering about is whether, for some passes
like these, we should use inheritance rather than target hooks.  So in
this case, the target-independent code would provide a framework for
iterating over the function and testing for forwarding, but the target
would ultimately decide what to do with that information.  This would
also make it easier for targets to add genuinely target-specific
information to the bookkeeping structures.

In case it sounds otherwise, that's supposed to be more than
just a structural C++-vs-C thing.  The idea is that we'd have
a pass for "resolving store forwarding-related problems",
but the specific goals would be mostly (or at least partially)
target-specific rather than target-independent.
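
To make the inheritance idea above a bit more concrete, here is a rough,
self-contained sketch.  Every name in it (mem_access,
store_forwarding_framework, handle_forwarding, toy_aarch64_pass) is
hypothetical and invented for illustration; none of it is existing GCC
API, and a real pass would work on RTL insns and dataflow information
rather than on this toy model.  The base class owns the walk and the
overlap test, and the derived class only decides what to do with a
detected case:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a memory-accessing insn; NOT GCC's rtx_insn.
struct mem_access {
  bool is_store;
  std::int64_t offset;  // byte offset from a common base such as sp
  std::int64_t size;    // access width in bytes
};

// Hypothetical target-independent part: iterate and test for forwarding.
class store_forwarding_framework {
public:
  virtual ~store_forwarding_framework() = default;

  // Walk the accesses in program order.  For each load that overlaps an
  // earlier, narrower store, ask the target what to do.  Returns the
  // number of loads the target chose to act on.
  int run(const std::vector<mem_access> &insns) {
    std::vector<mem_access> stores;
    int handled = 0;
    for (const mem_access &insn : insns) {
      assert(insn.size > 0);
      if (insn.is_store) {
        stores.push_back(insn);
        continue;
      }
      for (const mem_access &st : stores)
        if (is_partial_forward(st, insn) && handle_forwarding(st, insn)) {
          ++handled;
          break;  // one decision per load is enough for this sketch
        }
    }
    return handled;
  }

protected:
  // Target decision point: return true if the load was (notionally)
  // rewritten.  A derived class could also keep its own bookkeeping.
  virtual bool handle_forwarding(const mem_access &store,
                                 const mem_access &load) = 0;

private:
  // A load wider than an overlapping earlier store cannot be satisfied
  // by forwarding that store alone.
  static bool is_partial_forward(const mem_access &store,
                                 const mem_access &load) {
    bool overlap = store.offset < load.offset + load.size
                   && load.offset < store.offset + store.size;
    return overlap && load.size > store.size;
  }
};

// Hypothetical target part: only act on 16-byte loads, standing in for
// an ldp of two 8-byte registers that would be split into two ldrs.
class toy_aarch64_pass : public store_forwarding_framework {
protected:
  bool handle_forwarding(const mem_access &,
                         const mem_access &load) override {
    return load.size == 16;
  }
};
```

Modelling the example from the patch description as an 8-byte store at
offset 320 followed by a 16-byte load at offset 312, run () on a
toy_aarch64_pass would report one handled load.  The point of the
shape is only that the policy lives in the override rather than in a
set of target hooks.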

I'd wondered the same thing about the early-ra pass that we're
adding for SME.  Some of the framework could be generalised and
made target-independent, but the main purpose of the pass (using
strided registers with certain patterns and constraints) is highly
target-specific.

Thanks,
Richard
