On Thu, Nov 28, 2024 at 8:37 AM Richard Biener <richard.guent...@gmail.com>
wrote:

> On Mon, Nov 25, 2024 at 3:28 AM Philipp Tomsich
> <philipp.toms...@vrull.eu> wrote:
> >
> > Pushed to master with the following fixups:
> >   - new timevar added
> >   - nits addressed
> >   - whitespace fixes
>
> The pass seems to be disabled by default everywhere - I thought we
> decided to avoid adding
> passes like this because they tend to bit-rot quickly and become a
> maintenance burden.
>
> What was the plan here?
>

A patch for enabling ASF for O2 or higher is on the list:
  https://gcc.gnu.org/pipermail/gcc-patches/2025-January/674692.html

BR
Christoph


>
> Richard.
>
> > Philipp.
> >
> >
> > On Mon, 25 Nov 2024 at 03:30, Jeff Law <jeffreya...@gmail.com> wrote:
> > >
> > >
> > >
> > > On 11/9/24 2:48 AM, Konstantinos Eleftheriou wrote:
> > > > From: kelefth <konstantinos.elefther...@vrull.eu>
> > > >
> > > > This pass detects cases of expensive store forwarding and tries to
> avoid them
> > > > by reordering the stores and using suitable bit insertion sequences.
> > > > For example it can transform this:
> > > >
> > > >       strb    w2, [x1, 1]
> > > >       ldr     x0, [x1]      # Expensive store forwarding to larger
> load.
> > > >
> > > > To:
> > > >
> > > >       ldr     x0, [x1]
> > > >       strb    w2, [x1]
> > > >       bfi     x0, x2, 0, 8
> > > >
> > > > Assembly like this can appear with bitfields or type punning /
> unions.
> > > > On stress-ng when running the cpu-union microbenchmark the following
> speedups
> > > > have been observed.
> > > >
> > > >    Neoverse-N1:      +29.4%
> > > >    Intel Coffeelake: +13.1%
> > > >    AMD 5950X:        +17.5%
> > > >
> > > > The transformation is rejected on cases that would cause
> store_bit_field
> > > > to generate subreg expressions on different register classes.
> > > > Files avoid-store-forwarding-4.c and avoid-store-forwarding-5.c
> contain
> > > > such cases and have been marked as XFAIL.
> > > >
> > > > There is a special handling for machines with BITS_BIG_ENDIAN !=
> > > > BYTES_BIG_ENDIAN. The need for this came up from an issue in H8
> > > > architecture, which uses big-endian ordering, but BITS_BIG_ENDIAN
> > > > is false. In that case, the START parameter of store_bit_field
> > > > needs to be calculated from the end of the destination register.
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >       * Makefile.in (OBJS): Add avoid-store-forwarding.o.
> > > >       * common.opt (favoid-store-forwarding): New option.
> > > >       * common.opt.urls: Regenerate.
> > > >       * doc/invoke.texi: New param store-forwarding-max-distance.
> > > >       * doc/passes.texi: Document new pass.
> > > >       * doc/tm.texi: Regenerate.
> > > >       * doc/tm.texi.in: Document new pass.
> > > >       * params.opt (store-forwarding-max-distance): New param.
> > > >       * passes.def: Add pass_rtl_avoid_store_forwarding before
> > > >       pass_early_remat.
> > > >       * target.def (avoid_store_forwarding_p): New DEFHOOK.
> > > >       * target.h (struct store_fwd_info): Declare.
> > > >       * targhooks.cc (default_avoid_store_forwarding_p): New
> function.
> > > >       * targhooks.h (default_avoid_store_forwarding_p): Declare.
> > > >       * tree-pass.h (make_pass_rtl_avoid_store_forwarding): Declare.
> > > >       * avoid-store-forwarding.cc: New file.
> > > >       * avoid-store-forwarding.h: New file.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > >       * gcc.target/aarch64/avoid-store-forwarding-1.c: New test.
> > > >       * gcc.target/aarch64/avoid-store-forwarding-2.c: New test.
> > > >       * gcc.target/aarch64/avoid-store-forwarding-3.c: New test.
> > > >       * gcc.target/aarch64/avoid-store-forwarding-4.c: New test.
> > > >       * gcc.target/aarch64/avoid-store-forwarding-5.c: New test.
> > > >       * gcc.target/x86_64/abi/callabi/avoid-store-forwarding-1.c:
> New test.
> > > >          * gcc.target/x86_64/abi/callabi/avoid-store-forwarding-2.c:
> New test.
> > > >
> > > > Signed-off-by: Philipp Tomsich <philipp.toms...@vrull.eu>
> > > > Signed-off-by: Konstantinos Eleftheriou <
> konstantinos.elefther...@vrull.eu>
> > > >
> > > > Series-version: 8
> > > >
> > > > Series-changes: 8
> > > >       - Fix store_bit_field call for big-endian targets, where
> > > >         BITS_BIG_ENDIAN is false.
> > > >       - Handle store_forwarding_max_distance = 0 as a special case
> that
> > > >         disables cost checks for avoid-store-forwarding.
> > > >       - Update testcases for AArch64 and add testcases for x86-64.
> > > >
> > > > Series-changes: 7
> > > >       - Fix bug when copying back the load register, in the case
> that the
> > > >         load is eliminated.
> > > >
> > > > Series-changes: 6
> > > >       - Reject the transformation on cases that would cause
> store_bit_field
> > > >         to generate subreg expressions on different register classes.
> > > >         Files avoid-store-forwarding-4.c and
> avoid-store-forwarding-5.c
> > > >            contain such cases and have been marked as XFAIL.
> > > >       - Use optimize_bb_for_speed_p instead of
> optimize_insn_for_speed_p.
> > > >       - Inline and remove get_load_mem.
> > > >       - New implementation for is_store_forwarding.
> > > >       - Refactor the main loop in avoid_store_forwarding.
> > > >       - Avoid using the word 'forwardings'.
> > > >       - Use lowpart_subreg instead of validate_subreg +
> gen_rtx_subreg.
> > > >       - Don't use df_insn_rescan where not needed.
> > > >       - Change order of emitting stores and bit insert instructions.
> > > >       - Check and reject loads for which the dest register overlaps
> with src.
> > > >       - Remove unused variable.
> > > >       - Change some gen_mov_insn function calls to gen_rtx_SET.
> > > >       - Subtract the cost of eliminated load, instead of 1, for the
> total cost.
> > > >       - Use delete_insn instead of set_insn_deleted.
> > > >       - Regenerate common.opt.urls.
> > > >       - Add some more comments.
> > > >
> > > > Series-changes: 5
> > > >       - Fix bug with BIG_ENDIAN targets.
> > > >       - Fix bug with unrecognized instructions.
> > > >       - Fix / simplify pass init/fini.
> > > >
> > > > Series-changes: 4
> > > >       - Change pass scheduling to run after sched1.
> > > >       - Add target hook to decide whether a store forwarding instance
> > > >         should be avoided or not.
> > > >       - Fix bugs.
> > > >
> > > > Series-changes: 3
> > > >       - Only emit SUBREG after calling validate_subreg.
> > > >       - Fix memory corruption due to vec self-reference.
> > > >       - Fix bitmap_bit_in_range_p ICE due to BLKMode.
> > > >       - Reject MEM to MEM sets.
> > > >       - Add get_load_mem comment.
> > > >       - Add new testcase.
> > > >
> > > > Series-changes: 2
> > > >       - Allow modes that are not scalar_int_mode.
> > > >       - Introduce simple costing to avoid unprofitable
> transformations.
> > > >       - Reject bit insert sequences that spill to memory.
> > > >       - Document new pass.
> > > >       - Fix and add testcases.
> > > > ---
> > >
> > > > +namespace {
> > > > +
> > > > +const pass_data pass_data_avoid_store_forwarding =
> > > > +{
> > > > +  RTL_PASS, /* type.  */
> > > > +  "avoid_store_forwarding", /* name.  */
> > > > +  OPTGROUP_NONE, /* optinfo_flags.  */
> > > > +  TV_NONE, /* tv_id.  */
> > > > +  0, /* properties_required.  */
> > > > +  0, /* properties_provided.  */
> > > > +  0, /* properties_destroyed.  */
> > > > +  0, /* todo_flags_start.  */
> > > > +  TODO_df_finish /* todo_flags_finish.  */
> > > > +};
> > > Probably want a TV entry for store forwarding.  While it shouldn't be a
> > > big time sink, we've seen other passes that should be efficient go nuts
> > > in some cases (extension elimination being the most recent example).
> > >
> > >
> > >
> > >
> > > > +
> > > > +/* Try to modify BB so that expensive store forwarding cases are
> avoided.  */
> > > > +
> > > > +void store_forwarding_analyzer::avoid_store_forwarding (basic_block
> bb)
> > >
> > > Formatting nit.  Put the return type on its own line so that the
> > > function name always starts on column 0 of its own line.
> > >
> > >
> > >
> > > > +
> > > > +/* Update pass statistics.  */
> > > > +
> > > > +void store_forwarding_analyzer::update_stats (function *fn)
> > > Similarly.
> > >
> > >
> > > OK with the new timevar and two formatting nits.  Thanks for your
> > > patience on this.
> > >
> > > Jeff
> > >
> > >
>

Reply via email to