Alex Coplan <alex.cop...@arm.com> writes:
> Hi,
>
> This is a v3 patch which is rebased on top of the SME changes.
> Otherwise it is the same as v2, posted here:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639367.html
>
> Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?
>
> Thanks,
> Alex
>
> -- >8 --
>
> Thus far the writeback forms of ldp/stp have been exclusively used in
> prologue and epilogue code for saving/restoring of registers to/from the
> stack.
>
> As such, forms of ldp/stp that weren't needed for prologue/epilogue code
> weren't supported by the aarch64 backend.  This patch generalizes the
> load/store pair writeback patterns to allow:
>
>  - Base registers other than the stack pointer.
>  - Modes that weren't previously supported.
>  - Combinations of distinct modes provided they have the same size.
>  - Pre/post variants that weren't previously needed in prologue/epilogue
>    code.
>
> We make quite some effort to avoid a combinatorial explosion in the
> number of patterns generated (and those in the source) by making
> extensive use of special predicates.
>
> An updated version of the upcoming ldp/stp pass can generate the
> writeback forms, so this patch is motivated by that.
>
> This patch doesn't add zero-extending or sign-extending forms of the
> writeback patterns; that is left for future work.
>
> gcc/ChangeLog:
>
>         * config/aarch64/aarch64-protos.h (aarch64_ldpstp_operand_mode_p): Declare.
>         * config/aarch64/aarch64.cc (aarch64_gen_storewb_pair): Build RTL
>         directly instead of invoking named pattern.
>         (aarch64_gen_loadwb_pair): Likewise.
>         (aarch64_ldpstp_operand_mode_p): New.
>         * config/aarch64/aarch64.md (loadwb_pair<GPI:mode>_<P:mode>): Replace with
>         ...
>         (*loadwb_post_pair_<ldst_sz>): ... this. Generalize as described
>         in cover letter.
>         (loadwb_pair<GPF:mode>_<P:mode>): Delete (superseded by the
>         above).
>         (*loadwb_post_pair_16): New.
>         (*loadwb_pre_pair_<ldst_sz>): New.
>         (loadwb_pair<TX:mode>_<P:mode>): Delete.
>         (*loadwb_pre_pair_16): New.
>         (storewb_pair<GPI:mode>_<P:mode>): Replace with ...
>         (*storewb_pre_pair_<ldst_sz>): ... this.  Generalize as
>         described in cover letter.
>         (*storewb_pre_pair_16): New.
>         (storewb_pair<GPF:mode>_<P:mode>): Delete.
>         (*storewb_post_pair_<ldst_sz>): New.
>         (storewb_pair<TX:mode>_<P:mode>): Delete.
>         (*storewb_post_pair_16): New.
>         * config/aarch64/predicates.md (aarch64_mem_pair_operator): New.
>         (pmode_plus_operator): New.
>         (aarch64_ldp_reg_operand): New.
>         (aarch64_stp_reg_operand): New.

OK, thanks, although:

> +;; q-register variant of the above
> +(define_insn "*loadwb_pre_pair_16"
> +  [(set (match_operand 0 "pmode_register_operand" "=&rk")
> +     (match_operator 8 "pmode_plus_operator" [
> +       (match_operand 1 "pmode_register_operand" "0")
> +       (match_operand 4 "const_int_operand")]))
> +   (set (match_operand:TI 2 "aarch64_ldp_reg_operand" "=w")
> +     (match_operator 6 "memory_operand" [
> +       (match_operator 10 "pmode_plus_operator" [
> +         (match_dup 1)
> +         (match_dup 4)
> +       ])]))
> +   (set (match_operand:TI 3 "aarch64_ldp_reg_operand" "=w")
> +     (match_operator 7 "memory_operand" [
> +       (match_operator 9 "pmode_plus_operator" [
> +          (match_dup 1)
> +          (match_operand 5 "const_int_operand")
> +       ])]))]
> +  "TARGET_FLOAT
> +   && aarch64_mem_pair_offset (operands[4], TImode)
> +   && known_eq (INTVAL (operands[5]), INTVAL (operands[4]) + 16)"
> +  "ldp\t%q2, %q3, [%0, %4]!"
>    [(set_attr "type" "neon_ldp_q")]

...I think this reads more naturally with the numbering of 9 and 10 swapped.
OK either way.
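
Concretely, the swap would give something like the following. This is only
a sketch and is untested: everything except the two operator numbers is
copied from the pattern as posted, the closing parenthesis is assumed from
the usual define_insn layout, and the output template does not refer to
operands 9 or 10, so nothing else should need to change.

```
;; q-register variant, with operators 9 and 10 swapped so that the
;; operator numbers increase in reading order (8, 9, 10).
(define_insn "*loadwb_pre_pair_16"
  [(set (match_operand 0 "pmode_register_operand" "=&rk")
     (match_operator 8 "pmode_plus_operator" [
       (match_operand 1 "pmode_register_operand" "0")
       (match_operand 4 "const_int_operand")]))
   (set (match_operand:TI 2 "aarch64_ldp_reg_operand" "=w")
     (match_operator 6 "memory_operand" [
       (match_operator 9 "pmode_plus_operator" [   ;; was 10
         (match_dup 1)
         (match_dup 4)
       ])]))
   (set (match_operand:TI 3 "aarch64_ldp_reg_operand" "=w")
     (match_operator 7 "memory_operand" [
       (match_operator 10 "pmode_plus_operator" [  ;; was 9
          (match_dup 1)
          (match_operand 5 "const_int_operand")
       ])]))]
  "TARGET_FLOAT
   && aarch64_mem_pair_offset (operands[4], TImode)
   && known_eq (INTVAL (operands[5]), INTVAL (operands[4]) + 16)"
  "ldp\t%q2, %q3, [%0, %4]!"
  [(set_attr "type" "neon_ldp_q")]
)
```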

Sorry for causing the rebase to be necessary.

Richard
