On Wed, Jul 22, 2020 at 12:03 PM Andrea Corallo <andrea.cora...@arm.com> wrote:
>
> Hi all,
>
> I'd like to submit the following two patches implementing a new AArch64
> specific back-end pass that helps optimize branch-dense code, which can
> be a bottleneck for performance on some Arm cores.  This is achieved by
> padding out the branch-dense sections of the instruction stream with
> nops.
>
> The original patch was already posted some time ago:
>
> https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg200721.html
>
> This follow-up splits the work into two patches as suggested, rebases
> on master, and implements the suggestions from the first code review.
>
> This first patch adds a new RTX instruction class, FILLER_INSN, which
> has been whitelisted to allow placement of NOPs outside of a basic
> block.  This is to allow padding after unconditional branches.  Padding
> there is preferable so that any performance gained from diluting
> branches is not paid straight back by executing an excessive number of
> nops.
>
> It was deemed that a new RTX class was less invasive than modifying
> the behavior of the standard UNSPEC nops.
>
> 1/2 is a requirement for 2/2.  Please see the cover letter of the latter
> for more details on the pass itself.

I wonder if such an effect of instructions on the pipeline can be modeled
in the DFA, and thus whether the scheduler could issue (always ready)
NOPs?
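
To make the constraint under discussion concrete, here is a toy Python model
of diluting a branch-dense stream with nops.  It is only a sketch: the
window size, the branch limit, the mnemonic list and the names `is_branch`
and `dilute` are all assumptions for illustration, not taken from the patch
or from any scheduler description.

```python
from typing import List

# Hypothetical set of branch mnemonics, for illustration only.
BRANCHES = {"b", "bl", "br", "blr", "ret", "cbz", "cbnz", "tbz", "tbnz"}


def is_branch(insn: str) -> bool:
    """Return True if the instruction text starts with a branch mnemonic."""
    mnemonic = insn.split()[0]
    return mnemonic in BRANCHES or mnemonic.startswith("b.")


def dilute(insns: List[str], window: int = 4, max_branches: int = 2) -> List[str]:
    """Pad with nops so no aligned window holds more than max_branches branches.

    Windows are aligned groups of `window` instructions, counted from the
    start of the stream.  Both parameters are assumed values.
    """
    out: List[str] = []
    branches_in_window = 0
    for insn in insns:
        if len(out) % window == 0:
            # A new aligned window starts here.
            branches_in_window = 0
        if is_branch(insn) and branches_in_window == max_branches:
            # The window is full of branches: pad up to the next boundary.
            while len(out) % window != 0:
                out.append("nop")
            branches_in_window = 0
        if is_branch(insn):
            branches_in_window += 1
        out.append(insn)
    return out
```

With the assumed defaults, a third branch in a four-instruction window is
pushed to the start of the next window by two nops.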

I also wonder whether such optimization is better suited for the assembler
which should know instruction lengths and alignment in a more precise
way and also would know whether extra nops make immediates too large
for pc relative things like short branches or section anchor accesses
(or whatever else)?
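
The range concern can be sketched with the A64 branch encodings: TBZ/TBNZ
carry a 14-bit signed word offset, CBZ/CBNZ and B.cond a 19-bit one, and
B/BL a 26-bit one.  The Python below is a minimal sketch under those
encodings; the helper names `in_range` and `still_reachable` are
hypothetical, and it assumes every inserted nop lies between the branch and
its target.

```python
# Width in bits of the signed, word-scaled immediate for each branch form.
IMM_BITS = {
    "tbz": 14, "tbnz": 14,
    "cbz": 19, "cbnz": 19, "b.cond": 19,
    "b": 26, "bl": 26,
}


def in_range(mnemonic: str, offset_bytes: int) -> bool:
    """Check whether a pc-relative byte offset is encodable for the branch."""
    bits = IMM_BITS[mnemonic]
    if offset_bytes % 4 != 0:
        return False  # A64 instructions are 4 bytes; offsets are word-scaled
    words = offset_bytes // 4
    return -(1 << (bits - 1)) <= words < (1 << (bits - 1))


def still_reachable(mnemonic: str, offset_bytes: int, nops_between: int) -> bool:
    """Check the branch after nops_between nops are added before its target."""
    growth = 4 * nops_between
    # Forward branches get longer; backward branches get more negative.
    grown = offset_bytes + growth if offset_bytes >= 0 else offset_bytes - growth
    return in_range(mnemonic, grown)
```

For example, a TBZ whose target is 32764 bytes ahead is at the edge of its
range, so a single nop inserted in between makes it unencodable, while a B
over the same distance is unaffected.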

Richard.

> Regards
>
>   Andrea
>
> gcc/ChangeLog
>
> 2020-07-17  Andrea Corallo  <andrea.cora...@arm.com>
>             Carey Williams  <carey.willi...@arm.com>
>
>         * cfgbuild.c (inside_basic_block_p): Handle FILLER_INSN.
>         * cfgrtl.c (rtl_verify_bb_layout): Whitelist FILLER_INSN outside
>         basic blocks.
>         * coretypes.h: New rtx class.
>         * emit-rtl.c (emit_filler_after): New function.
>         * rtl.def (FILLER_INSN): New rtl define.
>         * rtl.h (rtx_filler_insn): Define new structure.
>         (FILLER_INSN_P): New macro.
>         (is_a_helper <rtx_filler_insn *>::test): New test helper for
>         rtx_filler_insn.
>         (emit_filler_after): New extern.
>         * target-insns.def: Add target insn definition.
