On 11/11/22 09:21, Richard Sandiford via Gcc-patches wrote:
Arm's SME adds a new processor mode called streaming mode.
This mode enables some new (matrix-oriented) instructions and
disables several existing groups of instructions, such as most
Advanced SIMD vector instructions and a much smaller set of SVE
instructions.  It can also change the current vector length.

There are instructions to switch in and out of streaming mode.
However, their effect on the ISA and vector length can't be represented
directly in RTL, so they need to be emitted late in the pass pipeline,
close to md_reorg.

It's sometimes the responsibility of the prologue and epilogue to
switch modes, which means we need to emit the prologue and epilogue
sequences late as well.  (This loses shrink-wrapping and scheduling
opportunities, but that's a price worth paying.)

This patch therefore adds a target hook for forcing prologue
and epilogue insertion to happen later in the pipeline.
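
As a rough illustration of how such a hook could choose between the existing pass and the new late pass, here is a minimal standalone sketch. It is a model, not the patch: `TargetHooks`, `hook_false`, `hook_true` and the two gate functions are hypothetical stand-ins, and the only name taken from the ChangeLog is the hook `use_late_prologue_epilogue`.

```cpp
// Hypothetical stand-in for the target hook vector; only the hook name
// use_late_prologue_epilogue comes from the patch description.
struct TargetHooks {
  bool (*use_late_prologue_epilogue)();
};

// Assumed default: targets that do not care keep the early insertion point.
static bool hook_false() { return false; }

// Assumed override for a target that needs late insertion.
static bool hook_true() { return true; }

// The existing (early) pass runs only when the target does not ask for
// late prologue/epilogue insertion.
bool early_prologue_epilogue_gate(const TargetHooks &t) {
  return !t.use_late_prologue_epilogue();
}

// The new (late) pass runs only when the target does ask for it.
bool late_prologue_epilogue_gate(const TargetHooks &t) {
  return t.use_late_prologue_epilogue();
}
```

With the gates written as exact complements, each function gets its prologue and epilogue from exactly one of the two passes, whichever way the hook answers.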

Tested on aarch64-linux-gnu (including with a follow-on patch)
and x86_64-linux-gnu.  OK to install?
  I'll ob
Richard


gcc/
        * target.def (use_late_prologue_epilogue): New hook.
        * doc/gccint/target-macros/miscellaneous-parameters.rst: Add
        TARGET_USE_LATE_PROLOGUE_EPILOGUE.
        * doc/gccint/target-macros/tm.rst.in: Regenerate.
        * passes.def (pass_late_thread_prologue_and_epilogue): New pass.
        * tree-pass.h (make_pass_late_thread_prologue_and_epilogue): Declare.
        * function.cc (pass_thread_prologue_and_epilogue::gate): New function.
        (pass_data_late_thread_prologue_and_epilogue): New pass variable.
        (pass_late_thread_prologue_and_epilogue): New pass class.
        (make_pass_late_thread_prologue_and_epilogue): New function.

I'm not sure how we'll enforce the "no target-independent code motion" limitation that this seems to need.  The exception made for reorg is hackish: it appears we just rely on the fact that reorg isn't run for the one target where this matters.  That does make me wonder if we should future-proof this ever so slightly.  Is there a reasonably easy way to fail if a target were to define delay slots and also request late prologue/epilogue insertion?  If so, that seems advisable.
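
To make the kind of check I have in mind concrete, here is a minimal standalone sketch. It is a model only: `TargetConfig` and `late_prologue_config_ok` are hypothetical names, and how a real target would expose "has delay slots" to such a check is an assumption left open here.

```cpp
// Hypothetical summary of the two target properties that must not be
// combined; neither field name is taken from GCC itself.
struct TargetConfig {
  bool has_delay_slots;             // target defines delay slots, so reorg runs
  bool use_late_prologue_epilogue;  // target asks for late prologue/epilogue
};

// Returns true when the configuration is safe.  The one rejected case is a
// target that both has delay slots and requests late insertion, since the
// late sequences would then be exposed to reorg.
inline bool late_prologue_config_ok(const TargetConfig &c) {
  return !(c.has_delay_slots && c.use_late_prologue_epilogue);
}
```

A failing result here would be the point at which to stop with an internal error rather than silently rely on reorg not running.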


No objection to the meat of the patch, just wondering a bit about the additional sanity checking we can do...


Jeff
