Alex Coplan <alex.cop...@arm.com> writes:
> Hi,
>
> This is a v3 patch which is rebased on top of the SME changes.
> Otherwise it is the same as v2, posted here:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639367.html
>
> Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?
>
> Thanks,
> Alex
>
> -- >8 --
>
> Thus far the writeback forms of ldp/stp have been exclusively used in
> prologue and epilogue code for saving/restoring of registers to/from the
> stack.
>
> As such, forms of ldp/stp that weren't needed for prologue/epilogue code
> weren't supported by the aarch64 backend.  This patch generalizes the
> load/store pair writeback patterns to allow:
>
>  - Base registers other than the stack pointer.
>  - Modes that weren't previously supported.
>  - Combinations of distinct modes provided they have the same size.
>  - Pre/post variants that weren't previously needed in prologue/epilogue
>    code.
>
> We make quite some effort to avoid a combinatorial explosion in the
> number of patterns generated (and those in the source) by making
> extensive use of special predicates.
>
> An updated version of the upcoming ldp/stp pass can generate the
> writeback forms, so this patch is motivated by that.
>
> This patch doesn't add zero-extending or sign-extending forms of the
> writeback patterns; that is left for future work.
>
> gcc/ChangeLog:
>
> 	* config/aarch64/aarch64-protos.h (aarch64_ldpstp_operand_mode_p):
> 	Declare.
> 	* config/aarch64/aarch64.cc (aarch64_gen_storewb_pair): Build RTL
> 	directly instead of invoking named pattern.
> 	(aarch64_gen_loadwb_pair): Likewise.
> 	(aarch64_ldpstp_operand_mode_p): New.
> 	* config/aarch64/aarch64.md (loadwb_pair<GPI:mode>_<P:mode>): Replace
> 	with ...
> 	(*loadwb_post_pair_<ldst_sz>): ... this.  Generalize as described
> 	in cover letter.
> 	(loadwb_pair<GPF:mode>_<P:mode>): Delete (superseded by the
> 	above).
> 	(*loadwb_post_pair_16): New.
> 	(*loadwb_pre_pair_<ldst_sz>): New.
> 	(loadwb_pair<TX:mode>_<P:mode>): Delete.
> 	(*loadwb_pre_pair_16): New.
> 	(storewb_pair<GPI:mode>_<P:mode>): Replace with ...
> 	(*storewb_pre_pair_<ldst_sz>): ... this.  Generalize as
> 	described in cover letter.
> 	(*storewb_pre_pair_16): New.
> 	(storewb_pair<GPF:mode>_<P:mode>): Delete.
> 	(*storewb_post_pair_<ldst_sz>): New.
> 	(storewb_pair<TX:mode>_<P:mode>): Delete.
> 	(*storewb_post_pair_16): New.
> 	* config/aarch64/predicates.md (aarch64_mem_pair_operator): New.
> 	(pmode_plus_operator): New.
> 	(aarch64_ldp_reg_operand): New.
> 	(aarch64_stp_reg_operand): New.

OK, thanks, although:

> +;; q-register variant of the above
> +(define_insn "*loadwb_pre_pair_16"
> +  [(set (match_operand 0 "pmode_register_operand" "=&rk")
> +        (match_operator 8 "pmode_plus_operator" [
> +          (match_operand 1 "pmode_register_operand" "0")
> +          (match_operand 4 "const_int_operand")]))
> +   (set (match_operand:TI 2 "aarch64_ldp_reg_operand" "=w")
> +        (match_operator 6 "memory_operand" [
> +          (match_operator 10 "pmode_plus_operator" [
> +            (match_dup 1)
> +            (match_dup 4)
> +          ])]))
> +   (set (match_operand:TI 3 "aarch64_ldp_reg_operand" "=w")
> +        (match_operator 7 "memory_operand" [
> +          (match_operator 9 "pmode_plus_operator" [
> +            (match_dup 1)
> +            (match_operand 5 "const_int_operand")
> +          ])]))]
> +  "TARGET_FLOAT
> +   && aarch64_mem_pair_offset (operands[4], TImode)
> +   && known_eq (INTVAL (operands[5]), INTVAL (operands[4]) + 16)"
> +  "ldp\t%q2, %q3, [%0, %4]!"
>   [(set_attr "type" "neon_ldp_q")]

...I think this reads more naturally with the numbering of 9 and 10
swapped.  OK either way.

Sorry for causing the rebase to be necessary.

Richard
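
For anyone reading the insn condition in the quoted *loadwb_pre_pair_16
pattern, here is a rough standalone model of what it requires of operands 4
and 5.  This is only a sketch: q_pair_offset_ok and loadwb_pre_pair_16_ok are
hypothetical names, not GCC functions, and q_pair_offset_ok merely assumes
that aarch64_mem_pair_offset (..., TImode) amounts to the architectural
Q-register LDP immediate range (a multiple of 16 in [-1024, 1008]).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the offset check: a Q-register LDP/STP encodes
   a signed 7-bit immediate scaled by 16 bytes.  */
static bool
q_pair_offset_ok (int64_t offset)
{
  return offset % 16 == 0 && offset >= -64 * 16 && offset <= 63 * 16;
}

/* Models the quoted condition.  OP4 is both the writeback amount and the
   address offset of the first load; OP5 is the address offset of the
   second load, which must sit exactly one TImode (16 bytes) further on.  */
static bool
loadwb_pre_pair_16_ok (int64_t op4, int64_t op5)
{
  return q_pair_offset_ok (op4) && op5 == op4 + 16;
}

/* Small usage example: offsets as in "ldp q0, q1, [x0, 32]!".  */
static void
loadwb_pre_pair_16_example (void)
{
  assert (loadwb_pre_pair_16_ok (32, 48));   /* adjacent Q slots */
  assert (!loadwb_pre_pair_16_ok (32, 64));  /* second slot not adjacent */
  assert (!loadwb_pre_pair_16_ok (8, 24));   /* not a multiple of 16 */
}
```

The model leaves out TARGET_FLOAT and the operand predicates; it only
covers the arithmetic relationship between the two constant offsets.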