For some reason this patch never showed up on gcc-patches.

Aaron Sawdey, Ph.D.
saw...@linux.ibm.com
IBM Linux on POWER Toolchain
> Begin forwarded message:
>
> From: acsaw...@linux.ibm.com
> Subject: [PATCH,rs6000] Make MMA builtins use opaque modes [v2]
> Date: November 19, 2020 at 12:58:47 PM CST
> To: gcc-patches@gcc.gnu.org
> Cc: seg...@kernel.crashing.org, wschm...@linux.ibm.com,
>   berg...@linux.ibm.com, Aaron Sawdey <acsaw...@linux.ibm.com>
>
> From: Aaron Sawdey <acsaw...@linux.ibm.com>
>
> Segher & Bergner -
>   Thanks for the reviews; here's the updated patch after fixing those things.
> We now have an UNSPEC for xxsetaccz, and an accompanying change to
> rs6000_rtx_costs to make it cost 0 so that CSE doesn't try to replace it
> with a bunch of register moves.
>
> If bootstrap/regtest looks good, ok for trunk?
>
> Thanks,
>    Aaron
>
> gcc/
> 	* gcc/config/rs6000/mma.md (unspec): Add assemble/extract UNSPECs.
> 	(movoi): Change to movoo.
> 	(*movpoi): Change to *movoo.
> 	(movxi): Change to movxo.
> 	(*movpxi): Change to *movxo.
> 	(mma_assemble_pair): Change to OO mode.
> 	(*mma_assemble_pair): New define_insn_and_split.
> 	(mma_disassemble_pair): New define_expand.
> 	(*mma_disassemble_pair): New define_insn_and_split.
> 	(mma_assemble_acc): Change to XO mode.
> 	(*mma_assemble_acc): Change to XO mode.
> 	(mma_disassemble_acc): New define_expand.
> 	(*mma_disassemble_acc): New define_insn_and_split.
> 	(mma_<acc>): Change to XO mode.
> 	(mma_<vv>): Change to XO mode.
> 	(mma_<avv>): Change to XO mode.
> 	(mma_<pv>): Change to OO mode.
> 	(mma_<apv>): Change to XO/OO mode.
> 	(mma_<vvi4i4i8>): Change to XO mode.
> 	(mma_<avvi4i4i8>): Change to XO mode.
> 	(mma_<vvi4i4i2>): Change to XO mode.
> 	(mma_<avvi4i4i2>): Change to XO mode.
> 	(mma_<vvi4i4>): Change to XO mode.
> 	(mma_<avvi4i4>): Change to XO mode.
> 	(mma_<pvi4i2>): Change to XO/OO mode.
> 	(mma_<apvi4i2>): Change to XO/OO mode.
> 	(mma_<vvi4i4i4>): Change to XO mode.
> 	(mma_<avvi4i4i4>): Change to XO mode.
> 	* gcc/config/rs6000/predicates.md (input_operand): Allow opaque.
> 	(mma_disassemble_output_operand): New predicate.
> 	* gcc/config/rs6000/rs6000-builtin.def: Changes to disassemble builtins.
> 	* gcc/config/rs6000/rs6000-call.c (rs6000_return_in_memory):
> 	Disallow __vector_pair/__vector_quad as return types.
> 	(rs6000_promote_function_mode): Remove function return type
> 	check because we can't test it here any more.
> 	(rs6000_function_arg): Do not allow __vector_pair/__vector_quad
> 	as function arguments.
> 	(rs6000_gimple_fold_mma_builtin): Handle mma_disassemble_* builtins.
> 	(rs6000_init_builtins): Create types for XO/OO modes.
> 	* gcc/config/rs6000/rs6000-modes.def: Delete OI, XI,
> 	POI, and PXI modes, and create XO and OO modes.
> 	* gcc/config/rs6000/rs6000-string.c (expand_block_move):
> 	Update to OO mode.
> 	* gcc/config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok_uncached):
> 	Update for XO/OO modes.
> 	(rs6000_rtx_costs): Make UNSPEC_MMA_XXSETACCZ cost 0.
> 	(rs6000_modes_tieable_p): Update for XO/OO modes.
> 	(rs6000_debug_reg_global): Update for XO/OO modes.
> 	(rs6000_setup_reg_addr_masks): Update for XO/OO modes.
> 	(rs6000_init_hard_regno_mode_ok): Update for XO/OO modes.
> 	(reg_offset_addressing_ok_p): Update for XO/OO modes.
> 	(rs6000_emit_move): Update for XO/OO modes.
> 	(rs6000_preferred_reload_class): Update for XO/OO modes.
> 	(rs6000_split_multireg_move): Update for XO/OO modes.
> 	(rs6000_mangle_type): Update for opaque types.
> 	(rs6000_invalid_conversion): Update for XO/OO modes.
> 	* gcc/config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P):
> 	Update for XO/OO modes.
> 	* gcc/config/rs6000/rs6000.md (RELOAD): Update for XO/OO modes.
>
> gcc/testsuite/
> 	* gcc.target/powerpc/mma-double-test.c (main): Call abort for failure.
> 	* gcc.target/powerpc/mma-single-test.c (main): Call abort for failure.
> 	* gcc.target/powerpc/pr96506.c: Rename to pr96506-1.c.
> 	* gcc.target/powerpc/pr96506-2.c: New test.
> ---
>  gcc/config/rs6000/mma.md                      | 421 ++++++++++--------
>  gcc/config/rs6000/predicates.md               |  12 +
>  gcc/config/rs6000/rs6000-builtin.def          |  14 +-
>  gcc/config/rs6000/rs6000-call.c               | 142 +++---
>  gcc/config/rs6000/rs6000-modes.def            |  10 +-
>  gcc/config/rs6000/rs6000-string.c             |   6 +-
>  gcc/config/rs6000/rs6000.c                    | 193 ++++----
>  gcc/config/rs6000/rs6000.h                    |   3 +-
>  gcc/config/rs6000/rs6000.md                   |   2 +-
>  .../gcc.target/powerpc/mma-double-test.c      |   3 +
>  .../gcc.target/powerpc/mma-single-test.c      |   3 +
>  .../powerpc/{pr96506.c => pr96506-1.c}        |  24 -
>  gcc/testsuite/gcc.target/powerpc/pr96506-2.c  |  38 ++
>  13 files changed, 508 insertions(+), 363 deletions(-)
>  rename gcc/testsuite/gcc.target/powerpc/{pr96506.c => pr96506-1.c} (61%)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr96506-2.c
>
> diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
> index a3fd28bdd0a..63bb73a01e7 100644
> --- a/gcc/config/rs6000/mma.md
> +++ b/gcc/config/rs6000/mma.md
> @@ -19,24 +19,18 @@
>  ;; along with GCC; see the file COPYING3.  If not see
>  ;; <http://www.gnu.org/licenses/>.
>
>
> -;; The MMA patterns use the multi-register PXImode and POImode partial
> -;; integer modes to implement the target specific __vector_quad and
> -;; __vector_pair types that the MMA built-in functions reference.
> -;; To use these modes, we must define XImode and OImode move patterns
> -;; so the independent parts of the compiler can use our large partial
> -;; integer modes.  However, if we enable the XImode and OImode move
> -;; patterns, then the compiler will attempt to use them and this can
> -;; cause byte swapping issues on litte-endian systems.  We don't need
> -;; the XImode and OImode move patterns for actual code generation,
> -;; therefore, we define the XImode and OImode move patterns, but we
> -;; disable their use with a "false" condition flag.
> +;; The MMA patterns use the multi-register XOmode and OOmode opaque
> +;; modes to implement the target specific __vector_quad and
> +;; __vector_pair types that the MMA built-in functions reference.  We
> +;; use OPAQUE_MODE to prevent anything from trying to open them up.
>
>  (define_constants [(MAX_MMA_OPERANDS 7)])
>
>  ;; Constants for creating unspecs
>
>  (define_c_enum "unspec"
> -  [UNSPEC_MMA_ASSEMBLE_ACC
> +  [UNSPEC_MMA_ASSEMBLE
> +   UNSPEC_MMA_EXTRACT
>     UNSPEC_MMA_PMXVBF16GER2
>     UNSPEC_MMA_PMXVBF16GER2NN
>     UNSPEC_MMA_PMXVBF16GER2NP
> @@ -97,6 +91,7 @@ (define_c_enum "unspec"
>     UNSPEC_MMA_XVI8GER4SPP
>     UNSPEC_MMA_XXMFACC
>     UNSPEC_MMA_XXMTACC
> +   UNSPEC_MMA_XXSETACCZ
>    ])
>
>  ;; MMA instructions with 1 accumulator argument
> @@ -265,31 +260,22 @@ (define_int_attr avvi4i4i4
>  	[(UNSPEC_MMA_PMXVI8GER4PP	"pmxvi8ger4pp")
>  	 (UNSPEC_MMA_PMXVI8GER4SPP	"pmxvi8ger4spp")])
>
>
> -;; Define a disabled OImode move pattern, so we can use POImode.
> -(define_expand "movoi"
> -  [(set (match_operand:OI 0 "nonimmediate_operand")
> -	(match_operand:OI 1 "input_operand"))]
> -  "0"
> -{
> -  gcc_unreachable ();
> -})
> -
> -;; Vector pair support.  POImode can only live in VSRs.
> -(define_expand "movpoi"
> -  [(set (match_operand:POI 0 "nonimmediate_operand")
> -	(match_operand:POI 1 "input_operand"))]
> +;; Vector pair support.  OOmode can only live in VSRs.
> +(define_expand "movoo"
> +  [(set (match_operand:OO 0 "nonimmediate_operand")
> +	(match_operand:OO 1 "input_operand"))]
>    "TARGET_MMA"
>  {
> -  rs6000_emit_move (operands[0], operands[1], POImode);
> +  rs6000_emit_move (operands[0], operands[1], OOmode);
>    DONE;
>  })
>
> -(define_insn_and_split "*movpoi"
> -  [(set (match_operand:POI 0 "nonimmediate_operand" "=wa,m,wa")
> -	(match_operand:POI 1 "input_operand" "m,wa,wa"))]
> +(define_insn_and_split "*movoo"
> +  [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,m,wa")
> +	(match_operand:OO 1 "input_operand" "m,wa,wa"))]
>    "TARGET_MMA
> -   && (gpc_reg_operand (operands[0], POImode)
> -       || gpc_reg_operand (operands[1], POImode))"
> +   && (gpc_reg_operand (operands[0], OOmode)
> +       || gpc_reg_operand (operands[1], OOmode))"
>    "@
>     lxvp%X1 %x0,%1
>     stxvp%X0 %x1,%0
> @@ -305,287 +291,370 @@ (define_insn_and_split "*movpoi"
>     (set_attr "length" "*,*,8")])
>
>
> -;; Define a disabled XImode move pattern, so we can use PXImode.
> -(define_expand "movxi"
> -  [(set (match_operand:XI 0 "nonimmediate_operand")
> -	(match_operand:XI 1 "input_operand"))]
> -  "0"
> -{
> -  gcc_unreachable ();
> -})
> -
> -;; Vector quad support.  PXImode can only live in FPRs.
> -(define_expand "movpxi"
> -  [(set (match_operand:PXI 0 "nonimmediate_operand")
> -	(match_operand:PXI 1 "input_operand"))]
> +;; Vector quad support.  XOmode can only live in FPRs.
> +(define_expand "movxo"
> +  [(set (match_operand:XO 0 "nonimmediate_operand")
> +	(match_operand:XO 1 "input_operand"))]
>    "TARGET_MMA"
>  {
> -  rs6000_emit_move (operands[0], operands[1], PXImode);
> +  rs6000_emit_move (operands[0], operands[1], XOmode);
>    DONE;
>  })
>
> -(define_insn_and_split "*movpxi"
> -  [(set (match_operand:PXI 0 "nonimmediate_operand" "=d,m,d,d")
> -	(match_operand:PXI 1 "input_operand" "m,d,d,O"))]
> +(define_insn_and_split "*movxo"
> +  [(set (match_operand:XO 0 "nonimmediate_operand" "=d,m,d")
> +	(match_operand:XO 1 "input_operand" "m,d,d"))]
>    "TARGET_MMA
> -   && (gpc_reg_operand (operands[0], PXImode)
> -       || gpc_reg_operand (operands[1], PXImode))"
> +   && (gpc_reg_operand (operands[0], XOmode)
> +       || gpc_reg_operand (operands[1], XOmode))"
>    "@
>     #
>     #
> -   #
> -   xxsetaccz %A0"
> -  "&& reload_completed
> -   && !(fpr_reg_operand (operands[0], PXImode) && operands[1] == const0_rtx)"
> +   #"
> +  "&& reload_completed"
>    [(const_int 0)]
>  {
>    rs6000_split_multireg_move (operands[0], operands[1]);
>    DONE;
>  }
> -  [(set_attr "type" "vecload,vecstore,veclogical,mma")
> -   (set_attr "length" "8,8,16,*")
> -   (set_attr "max_prefixed_insns" "2,2,*,*")])
> +  [(set_attr "type" "vecload,vecstore,veclogical")
> +   (set_attr "length" "8,8,16")
> +   (set_attr "max_prefixed_insns" "2,2,*")])
>
>  (define_expand "mma_assemble_pair"
> -  [(match_operand:POI 0 "vsx_register_operand")
> -   (match_operand:V16QI 1 "input_operand")
> -   (match_operand:V16QI 2 "input_operand")]
> +  [(match_operand:OO 0 "vsx_register_operand")
> +   (match_operand:V16QI 1 "mma_assemble_input_operand")
> +   (match_operand:V16QI 2 "mma_assemble_input_operand")]
>    "TARGET_MMA"
>  {
> -  rtx dst;
> +  rtx src = gen_rtx_UNSPEC (OOmode,
> +			    gen_rtvec (2, operands[1], operands[2]),
> +			    UNSPEC_MMA_ASSEMBLE);
> +  emit_move_insn (operands[0], src);
> +  DONE;
> +})
>
> -  /* Let the compiler know the code below fully defines our output value.  */
> -  emit_clobber (operands[0]);
> +(define_insn_and_split "*mma_assemble_pair"
> +  [(set (match_operand:OO 0 "vsx_register_operand" "=wa")
> +	(unspec:OO [(match_operand:V16QI 1 "mma_assemble_input_operand" "mwa")
> +		    (match_operand:V16QI 2 "mma_assemble_input_operand" "mwa")]
> +		   UNSPEC_MMA_ASSEMBLE))]
> +  "TARGET_MMA"
> +  "#"
> +  "&& reload_completed"
> +  [(const_int 0)]
> +{
> +  rtx src = gen_rtx_UNSPEC (OOmode,
> +			    gen_rtvec (2, operands[1], operands[2]),
> +			    UNSPEC_MMA_ASSEMBLE);
> +  rs6000_split_multireg_move (operands[0], src);
> +  DONE;
> +})
> +
> +(define_expand "mma_disassemble_pair"
> +  [(match_operand:V16QI 0 "mma_disassemble_output_operand")
> +   (match_operand:OO 1 "input_operand")
> +   (match_operand 2 "const_0_to_1_operand")]
> +  "TARGET_MMA"
> +{
> +  rtx src;
> +  int regoff = INTVAL (operands[2]);
> +  src = gen_rtx_UNSPEC (V16QImode,
> +			gen_rtvec (2, operands[1], GEN_INT (regoff)),
> +			UNSPEC_MMA_EXTRACT);
> +  emit_move_insn (operands[0], src);
> +  DONE;
> +})
>
> -  dst = simplify_gen_subreg (V16QImode, operands[0], POImode, 0);
> -  emit_move_insn (dst, operands[1]);
> -  dst = simplify_gen_subreg (V16QImode, operands[0], POImode, 16);
> -  emit_move_insn (dst, operands[2]);
> +(define_insn_and_split "*mma_disassemble_pair"
> +  [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa")
> +	(unspec:V16QI [(match_operand:OO 1 "input_operand" "wa")
> +		       (match_operand 2 "const_0_to_1_operand")]
> +		      UNSPEC_MMA_EXTRACT))]
> +  "TARGET_MMA
> +   && fpr_reg_operand (operands[1], OOmode)"
> +  "#"
> +  "&& reload_completed"
> +  [(const_int 0)]
> +{
> +  int reg = REGNO (operands[1]);
> +  int regoff = INTVAL (operands[2]);
> +  rtx src = gen_rtx_REG (V16QImode, reg + regoff);
> +  emit_move_insn (operands[0], src);
>    DONE;
>  })
>
>  (define_expand "mma_assemble_acc"
> -  [(match_operand:PXI 0 "fpr_reg_operand")
> -   (match_operand:V16QI 1 "input_operand")
> -   (match_operand:V16QI 2 "input_operand")
> -   (match_operand:V16QI 3 "input_operand")
> -   (match_operand:V16QI 4 "input_operand")]
> +  [(match_operand:XO 0 "fpr_reg_operand")
> +   (match_operand:V16QI 1 "mma_assemble_input_operand")
> +   (match_operand:V16QI 2 "mma_assemble_input_operand")
> +   (match_operand:V16QI 3 "mma_assemble_input_operand")
> +   (match_operand:V16QI 4 "mma_assemble_input_operand")]
>    "TARGET_MMA"
>  {
> -  rtx src = gen_rtx_UNSPEC (PXImode,
> +  rtx src = gen_rtx_UNSPEC (XOmode,
>  			    gen_rtvec (4, operands[1], operands[2],
>  				       operands[3], operands[4]),
> -			    UNSPEC_MMA_ASSEMBLE_ACC);
> +			    UNSPEC_MMA_ASSEMBLE);
>    emit_move_insn (operands[0], src);
>    DONE;
>  })
>
>  (define_insn_and_split "*mma_assemble_acc"
> -  [(set (match_operand:PXI 0 "fpr_reg_operand" "=d")
> -	(unspec:PXI [(match_operand:V16QI 1 "mma_assemble_input_operand" "mwa")
> -		     (match_operand:V16QI 2 "mma_assemble_input_operand" "mwa")
> -		     (match_operand:V16QI 3 "mma_assemble_input_operand" "mwa")
> -		     (match_operand:V16QI 4 "mma_assemble_input_operand" "mwa")]
> -		    UNSPEC_MMA_ASSEMBLE_ACC))]
> +  [(set (match_operand:XO 0 "fpr_reg_operand" "=d")
> +	(unspec:XO [(match_operand:V16QI 1 "mma_assemble_input_operand" "mwa")
> +		    (match_operand:V16QI 2 "mma_assemble_input_operand" "mwa")
> +		    (match_operand:V16QI 3 "mma_assemble_input_operand" "mwa")
> +		    (match_operand:V16QI 4 "mma_assemble_input_operand" "mwa")]
> +		   UNSPEC_MMA_ASSEMBLE))]
>    "TARGET_MMA
> -   && fpr_reg_operand (operands[0], PXImode)"
> +   && fpr_reg_operand (operands[0], XOmode)"
>    "#"
>    "&& reload_completed"
>    [(const_int 0)]
>  {
> -  rtx src = gen_rtx_UNSPEC (PXImode,
> +  rtx src = gen_rtx_UNSPEC (XOmode,
>  			    gen_rtvec (4, operands[1], operands[2],
>  				       operands[3], operands[4]),
> -			    UNSPEC_MMA_ASSEMBLE_ACC);
> +			    UNSPEC_MMA_ASSEMBLE);
>    rs6000_split_multireg_move (operands[0], src);
>    DONE;
>  })
>
> +(define_expand "mma_disassemble_acc"
> +  [(match_operand:V16QI 0 "mma_disassemble_output_operand")
> +   (match_operand:XO 1 "input_operand")
> +   (match_operand 2 "const_0_to_3_operand")]
> +  "TARGET_MMA"
> +{
> +  rtx src;
> +  int regoff = INTVAL (operands[2]);
> +  src = gen_rtx_UNSPEC (V16QImode,
> +			gen_rtvec (2, operands[1], GEN_INT (regoff)),
> +			UNSPEC_MMA_EXTRACT);
> +  emit_move_insn (operands[0], src);
> +  DONE;
> +})
> +
> +(define_insn_and_split "*mma_disassemble_acc"
> +  [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa")
> +	(unspec:V16QI [(match_operand:XO 1 "input_operand" "d")
> +		       (match_operand 2 "const_0_to_3_operand")]
> +		      UNSPEC_MMA_EXTRACT))]
> +  "TARGET_MMA
> +   && fpr_reg_operand (operands[1], XOmode)"
> +  "#"
> +  "&& reload_completed"
> +  [(const_int 0)]
> +{
> +  int reg = REGNO (operands[1]);
> +  int regoff = INTVAL (operands[2]);
> +  rtx src = gen_rtx_REG (V16QImode, reg + regoff);
> +  emit_move_insn (operands[0], src);
> +  DONE;
> +})
> +
>  ;; MMA instructions that do not use their accumulators as an input, still
>  ;; must not allow their vector operands to overlap the registers used by
>  ;; the accumulator.  We enforce this by marking the output as early clobber.
>
>  (define_insn "mma_<acc>"
> -  [(set (match_operand:PXI 0 "fpr_reg_operand" "=&d")
> -	(unspec:PXI [(match_operand:PXI 1 "fpr_reg_operand" "0")]
> +  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
> +	(unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0")]
>  		    MMA_ACC))]
>    "TARGET_MMA"
>    "<acc> %A0"
>    [(set_attr "type" "mma")])
>
> +;; We can't have integer constants in XOmode so we wrap this in an UNSPEC.
> +
>  (define_expand "mma_xxsetaccz"
> -  [(set (match_operand:PXI 0 "fpr_reg_operand")
> +  [(set (match_operand:XO 0 "fpr_reg_operand")
>  	(const_int 0))]
>    "TARGET_MMA"
>  {
> -  emit_insn (gen_movpxi (operands[0], const0_rtx));
> +  rtx xo0 = gen_rtx_UNSPEC (XOmode, gen_rtvec (1, const0_rtx),
> +			    UNSPEC_MMA_XXSETACCZ);
> +  emit_insn (gen_rtx_SET (operands[0], xo0));
>    DONE;
>  })
>
> +(define_insn_and_split "*mma_xxsetaccz"
> +  [(set (match_operand:XO 0 "fpr_reg_operand" "=d")
> +	(unspec:XO [(match_operand 1 "const_0_to_1_operand" "O")]
> +		   UNSPEC_MMA_XXSETACCZ))]
> +  "TARGET_MMA"
> +  "xxsetaccz %A0"
> +  "&& reload_completed"
> +  [(set (match_dup 0) (unspec:XO [(match_dup 1)] UNSPEC_MMA_XXSETACCZ))]
> +  ""
> +  [(set_attr "type" "mma")
> +   (set_attr "length" "4")])
> +
>  (define_insn "mma_<vv>"
> -  [(set (match_operand:PXI 0 "fpr_reg_operand" "=&d")
> -	(unspec:PXI [(match_operand:V16QI 1 "vsx_register_operand" "wa")
> -		     (match_operand:V16QI 2 "vsx_register_operand" "wa")]
> -		    MMA_VV))]
> +  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
> +	(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "wa")
> +		    (match_operand:V16QI 2 "vsx_register_operand" "wa")]
> +		   MMA_VV))]
>    "TARGET_MMA"
>    "<vv> %A0,%x1,%x2"
>    [(set_attr "type" "mma")])
>
>  (define_insn "mma_<avv>"
> -  [(set (match_operand:PXI 0 "fpr_reg_operand" "=&d")
> -	(unspec:PXI [(match_operand:PXI 1 "fpr_reg_operand" "0")
> -		     (match_operand:V16QI 2 "vsx_register_operand" "wa")
> -		     (match_operand:V16QI 3 "vsx_register_operand" "wa")]
> -		    MMA_AVV))]
> +  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
> +	(unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0")
> +		    (match_operand:V16QI 2 "vsx_register_operand" "wa")
> +		    (match_operand:V16QI 3 "vsx_register_operand" "wa")]
> +		   MMA_AVV))]
>    "TARGET_MMA"
>    "<avv> %A0,%x2,%x3"
>    [(set_attr "type" "mma")])
>
>  (define_insn "mma_<pv>"
> -  [(set (match_operand:PXI 0 "fpr_reg_operand" "=&d")
> -	(unspec:PXI [(match_operand:POI 1 "vsx_register_operand" "wa")
> -		     (match_operand:V16QI 2 "vsx_register_operand" "wa")]
> -		    MMA_PV))]
> +  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
> +	(unspec:XO [(match_operand:OO 1 "vsx_register_operand" "wa")
> +		    (match_operand:V16QI 2 "vsx_register_operand" "wa")]
> +		   MMA_PV))]
>    "TARGET_MMA"
>    "<pv> %A0,%x1,%x2"
>    [(set_attr "type" "mma")])
>
>  (define_insn "mma_<apv>"
> -  [(set (match_operand:PXI 0 "fpr_reg_operand" "=&d")
> -	(unspec:PXI [(match_operand:PXI 1 "fpr_reg_operand" "0")
> -		     (match_operand:POI 2 "vsx_register_operand" "wa")
> -		     (match_operand:V16QI 3 "vsx_register_operand" "wa")]
> -		    MMA_APV))]
> +  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
> +	(unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0")
> +		    (match_operand:OO 2 "vsx_register_operand" "wa")
> +		    (match_operand:V16QI 3 "vsx_register_operand" "wa")]
> +		   MMA_APV))]
>    "TARGET_MMA"
>    "<apv> %A0,%x2,%x3"
>    [(set_attr "type" "mma")])
>
>  (define_insn "mma_<vvi4i4i8>"
> -  [(set (match_operand:PXI 0 "fpr_reg_operand" "=&d")
> -	(unspec:PXI [(match_operand:V16QI 1 "vsx_register_operand" "wa")
> -		     (match_operand:V16QI 2 "vsx_register_operand" "wa")
> -		     (match_operand:SI 3 "const_0_to_15_operand" "n")
> -		     (match_operand:SI 4 "const_0_to_15_operand" "n")
> -		     (match_operand:SI 5 "u8bit_cint_operand" "n")]
> -		    MMA_VVI4I4I8))]
> +  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
> +	(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "wa")
> +		    (match_operand:V16QI 2 "vsx_register_operand" "wa")
> +		    (match_operand:SI 3 "const_0_to_15_operand" "n")
> +		    (match_operand:SI 4 "const_0_to_15_operand" "n")
> +		    (match_operand:SI 5 "u8bit_cint_operand" "n")]
> +		   MMA_VVI4I4I8))]
>    "TARGET_MMA"
>    "<vvi4i4i8> %A0,%x1,%x2,%3,%4,%5"
>    [(set_attr "type" "mma")
>     (set_attr "length" "8")])
>
>  (define_insn "mma_<avvi4i4i8>"
> -  [(set (match_operand:PXI 0 "fpr_reg_operand" "=&d")
> -	(unspec:PXI [(match_operand:PXI 1 "fpr_reg_operand" "0")
> -		     (match_operand:V16QI 2 "vsx_register_operand" "wa")
> -		     (match_operand:V16QI 3 "vsx_register_operand" "wa")
> -		     (match_operand:SI 4 "const_0_to_15_operand" "n")
> -		     (match_operand:SI 5 "const_0_to_15_operand" "n")
> -		     (match_operand:SI 6 "u8bit_cint_operand" "n")]
> -		    MMA_AVVI4I4I8))]
> +  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
> +	(unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0")
> +		    (match_operand:V16QI 2 "vsx_register_operand" "wa")
> +		    (match_operand:V16QI 3 "vsx_register_operand" "wa")
> +		    (match_operand:SI 4 "const_0_to_15_operand" "n")
> +		    (match_operand:SI 5 "const_0_to_15_operand" "n")
> +		    (match_operand:SI 6 "u8bit_cint_operand" "n")]
> +		   MMA_AVVI4I4I8))]
>    "TARGET_MMA"
>    "<avvi4i4i8> %A0,%x2,%x3,%4,%5,%6"
>    [(set_attr "type" "mma")
>     (set_attr "length" "8")])
>
>  (define_insn "mma_<vvi4i4i2>"
> -  [(set (match_operand:PXI 0 "fpr_reg_operand" "=&d")
> -	(unspec:PXI [(match_operand:V16QI 1 "vsx_register_operand" "wa")
> -		     (match_operand:V16QI 2 "vsx_register_operand" "wa")
> -		     (match_operand:SI 3 "const_0_to_15_operand" "n")
> -		     (match_operand:SI 4 "const_0_to_15_operand" "n")
> -		     (match_operand:SI 5 "const_0_to_3_operand" "n")]
> -		    MMA_VVI4I4I2))]
> +  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
> +	(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "wa")
> +		    (match_operand:V16QI 2 "vsx_register_operand" "wa")
> +		    (match_operand:SI 3 "const_0_to_15_operand" "n")
> +		    (match_operand:SI 4 "const_0_to_15_operand" "n")
> +		    (match_operand:SI 5 "const_0_to_3_operand" "n")]
> +		   MMA_VVI4I4I2))]
>    "TARGET_MMA"
>    "<vvi4i4i2> %A0,%x1,%x2,%3,%4,%5"
>    [(set_attr "type" "mma")
>     (set_attr "length" "8")])
>
>  (define_insn "mma_<avvi4i4i2>"
> -  [(set (match_operand:PXI 0 "fpr_reg_operand" "=&d")
> -	(unspec:PXI [(match_operand:PXI 1 "fpr_reg_operand" "0")
> -		     (match_operand:V16QI 2 "vsx_register_operand" "wa")
> -		     (match_operand:V16QI 3 "vsx_register_operand" "wa")
> -		     (match_operand:SI 4 "const_0_to_15_operand" "n")
> -		     (match_operand:SI 5 "const_0_to_15_operand" "n")
> -		     (match_operand:SI 6 "const_0_to_3_operand" "n")]
> -		    MMA_AVVI4I4I2))]
> +  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
> +	(unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0")
> +		    (match_operand:V16QI 2 "vsx_register_operand" "wa")
> +		    (match_operand:V16QI 3 "vsx_register_operand" "wa")
> +		    (match_operand:SI 4 "const_0_to_15_operand" "n")
> +		    (match_operand:SI 5 "const_0_to_15_operand" "n")
> +		    (match_operand:SI 6 "const_0_to_3_operand" "n")]
> +		   MMA_AVVI4I4I2))]
>    "TARGET_MMA"
>    "<avvi4i4i2> %A0,%x2,%x3,%4,%5,%6"
>    [(set_attr "type" "mma")
>     (set_attr "length" "8")])
>
>  (define_insn "mma_<vvi4i4>"
> -  [(set (match_operand:PXI 0 "fpr_reg_operand" "=&d")
> -	(unspec:PXI [(match_operand:V16QI 1 "vsx_register_operand" "wa")
> -		     (match_operand:V16QI 2 "vsx_register_operand" "wa")
> -		     (match_operand:SI 3 "const_0_to_15_operand" "n")
> -		     (match_operand:SI 4 "const_0_to_15_operand" "n")]
> -		    MMA_VVI4I4))]
> +  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
> +	(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "wa")
> +		    (match_operand:V16QI 2 "vsx_register_operand" "wa")
> +		    (match_operand:SI 3 "const_0_to_15_operand" "n")
> +		    (match_operand:SI 4 "const_0_to_15_operand" "n")]
> +		   MMA_VVI4I4))]
>    "TARGET_MMA"
>    "<vvi4i4> %A0,%x1,%x2,%3,%4"
>    [(set_attr "type" "mma")
>     (set_attr "length" "8")])
>
>  (define_insn "mma_<avvi4i4>"
> -  [(set (match_operand:PXI 0 "fpr_reg_operand" "=&d")
> -	(unspec:PXI [(match_operand:PXI 1 "fpr_reg_operand" "0")
> -		     (match_operand:V16QI 2 "vsx_register_operand" "wa")
> -		     (match_operand:V16QI 3 "vsx_register_operand" "wa")
> -		     (match_operand:SI 4 "const_0_to_15_operand" "n")
> -		     (match_operand:SI 5 "const_0_to_15_operand" "n")]
> -		    MMA_AVVI4I4))]
> +  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
> +	(unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0")
> +		    (match_operand:V16QI 2 "vsx_register_operand" "wa")
> +		    (match_operand:V16QI 3 "vsx_register_operand" "wa")
> +		    (match_operand:SI 4 "const_0_to_15_operand" "n")
> +		    (match_operand:SI 5 "const_0_to_15_operand" "n")]
> +		   MMA_AVVI4I4))]
>    "TARGET_MMA"
>    "<avvi4i4> %A0,%x2,%x3,%4,%5"
>    [(set_attr "type" "mma")
>     (set_attr "length" "8")])
>
>  (define_insn "mma_<pvi4i2>"
> -  [(set (match_operand:PXI 0 "fpr_reg_operand" "=&d")
> -	(unspec:PXI [(match_operand:POI 1 "vsx_register_operand" "wa")
> -		     (match_operand:V16QI 2 "vsx_register_operand" "wa")
> -		     (match_operand:SI 3 "const_0_to_15_operand" "n")
> -		     (match_operand:SI 4 "const_0_to_3_operand" "n")]
> -		    MMA_PVI4I2))]
> +  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
> +	(unspec:XO [(match_operand:OO 1 "vsx_register_operand" "wa")
> +		    (match_operand:V16QI 2 "vsx_register_operand" "wa")
> +		    (match_operand:SI 3 "const_0_to_15_operand" "n")
> +		    (match_operand:SI 4 "const_0_to_3_operand" "n")]
> +		   MMA_PVI4I2))]
>    "TARGET_MMA"
>    "<pvi4i2> %A0,%x1,%x2,%3,%4"
>    [(set_attr "type" "mma")
>     (set_attr "length" "8")])
>
>  (define_insn "mma_<apvi4i2>"
> -  [(set (match_operand:PXI 0 "fpr_reg_operand" "=&d")
> -	(unspec:PXI [(match_operand:PXI 1 "fpr_reg_operand" "0")
> -		     (match_operand:POI 2 "vsx_register_operand" "wa")
> -		     (match_operand:V16QI 3 "vsx_register_operand" "wa")
> -		     (match_operand:SI 4 "const_0_to_15_operand" "n")
> -		     (match_operand:SI 5 "const_0_to_3_operand" "n")]
> -		    MMA_APVI4I2))]
> +  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
> +	(unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0")
> +		    (match_operand:OO 2 "vsx_register_operand" "wa")
> +		    (match_operand:V16QI 3 "vsx_register_operand" "wa")
> +		    (match_operand:SI 4 "const_0_to_15_operand" "n")
> +		    (match_operand:SI 5 "const_0_to_3_operand" "n")]
> +		   MMA_APVI4I2))]
>    "TARGET_MMA"
>    "<apvi4i2> %A0,%x2,%x3,%4,%5"
>    [(set_attr "type" "mma")
>     (set_attr "length" "8")])
>
>  (define_insn "mma_<vvi4i4i4>"
> -  [(set (match_operand:PXI 0 "fpr_reg_operand" "=&d")
> -	(unspec:PXI [(match_operand:V16QI 1 "vsx_register_operand" "wa")
> -		     (match_operand:V16QI 2 "vsx_register_operand" "wa")
> -		     (match_operand:SI 3 "const_0_to_15_operand" "n")
> -		     (match_operand:SI 4 "const_0_to_15_operand" "n")
> -		     (match_operand:SI 5 "const_0_to_15_operand" "n")]
> -		    MMA_VVI4I4I4))]
> +  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
> +	(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "wa")
> +		    (match_operand:V16QI 2 "vsx_register_operand" "wa")
> +		    (match_operand:SI 3 "const_0_to_15_operand" "n")
> +		    (match_operand:SI 4 "const_0_to_15_operand" "n")
> +		    (match_operand:SI 5 "const_0_to_15_operand" "n")]
> +		   MMA_VVI4I4I4))]
>    "TARGET_MMA"
>    "<vvi4i4i4> %A0,%x1,%x2,%3,%4,%5"
>    [(set_attr "type" "mma")
>     (set_attr "length" "8")])
>
>  (define_insn "mma_<avvi4i4i4>"
> -  [(set (match_operand:PXI 0 "fpr_reg_operand" "=&d")
> -	(unspec:PXI [(match_operand:PXI 1 "fpr_reg_operand" "0")
> -		     (match_operand:V16QI 2 "vsx_register_operand" "wa")
> -		     (match_operand:V16QI 3 "vsx_register_operand" "wa")
> -		     (match_operand:SI 4 "const_0_to_15_operand" "n")
> -		     (match_operand:SI 5 "const_0_to_15_operand" "n")
> -		     (match_operand:SI 6 "const_0_to_15_operand" "n")]
> -		    MMA_AVVI4I4I4))]
> +  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
> +	(unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0")
> +		    (match_operand:V16QI 2 "vsx_register_operand" "wa")
> +		    (match_operand:V16QI 3 "vsx_register_operand" "wa")
> +		    (match_operand:SI 4 "const_0_to_15_operand" "n")
> +		    (match_operand:SI 5 "const_0_to_15_operand" "n")
> +		    (match_operand:SI 6 "const_0_to_15_operand" "n")]
> +		   MMA_AVVI4I4I4))]
>    "TARGET_MMA"
>    "<avvi4i4i4> %A0,%x2,%x3,%4,%5,%6"
>    [(set_attr "type" "mma")
> diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
> index 4c2fe7fa312..9ad5ae67302 100644
> --- a/gcc/config/rs6000/predicates.md
> +++ b/gcc/config/rs6000/predicates.md
> @@ -1144,6 +1144,18 @@ (define_special_predicate "mma_assemble_input_operand"
>    (match_test "(mode == V16QImode
>  		&& (vsx_register_operand (op, mode) || MEM_P (op)))"))
>
> +;; Return 1 if this operand is valid for an MMA disassemble insn.
> +(define_predicate "mma_disassemble_output_operand"
> +  (match_code "reg,subreg,mem")
> +{
> +  if (SUBREG_P (op))
> +    op = SUBREG_REG (op);
> +  if (!REG_P (op))
> +    return true;
> +
> +  return vsx_register_operand (op, mode);
> +})
> +
>  ;; Return true if operand is an operator used in rotate-and-mask instructions.
>  (define_predicate "rotate_mask_operator"
>    (match_code "rotate,ashift,lshiftrt"))
> diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
> index a58102c3785..47b1f74e616 100644
> --- a/gcc/config/rs6000/rs6000-builtin.def
> +++ b/gcc/config/rs6000/rs6000-builtin.def
> @@ -352,7 +352,7 @@
>  		     | RS6000_BTC_UNARY),				\
>  		    CODE_FOR_ ## ICODE)			/* ICODE */
>
> -#define BU_MMA_V2(ENUM, NAME, ATTR, ICODE)				\
> +#define BU_MMA_2(ENUM, NAME, ATTR, ICODE)				\
>    RS6000_BUILTIN_M (MMA_BUILTIN_ ## ENUM,		/* ENUM */	\
>  		    "__builtin_mma_" NAME,		/* NAME */	\
>  		    RS6000_BTM_MMA,			/* MASK */	\
> @@ -360,7 +360,13 @@
>  		     | RS6000_BTC_BINARY				\
>  		     | RS6000_BTC_VOID					\
>  		     | RS6000_BTC_GIMPLE),				\
> -		    CODE_FOR_nothing)			/* ICODE */
> +		    CODE_FOR_nothing)			/* ICODE */	\
> +  RS6000_BUILTIN_M (MMA_BUILTIN_ ## ENUM ## _INTERNAL,	/* ENUM */	\
> +		    "__builtin_mma_" NAME "_internal",	/* NAME */	\
> +		    RS6000_BTM_MMA,			/* MASK */	\
> +		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
> +		     | RS6000_BTC_BINARY),				\
> +		    CODE_FOR_ ## ICODE)			/* ICODE */
>
>  #define BU_MMA_3(ENUM, NAME, ATTR, ICODE)				\
>    RS6000_BUILTIN_M (MMA_BUILTIN_ ## ENUM,		/* ENUM */	\
> @@ -3108,8 +3114,8 @@ BU_MMA_1 (XXMFACC,	"xxmfacc",	QUAD,	mma_xxmfacc)
>  BU_MMA_1 (XXMTACC,	"xxmtacc",	QUAD,	mma_xxmtacc)
>  BU_MMA_1 (XXSETACCZ,	"xxsetaccz",	MISC,	mma_xxsetaccz)
>
> -BU_MMA_V2 (DISASSEMBLE_ACC,	"disassemble_acc",	QUAD,	nothing)
> -BU_MMA_V2 (DISASSEMBLE_PAIR,	"disassemble_pair",	PAIR,	nothing)
> +BU_MMA_2 (DISASSEMBLE_ACC,	"disassemble_acc",	QUAD,	mma_disassemble_acc)
> +BU_MMA_2 (DISASSEMBLE_PAIR,	"disassemble_pair",	PAIR,	mma_disassemble_pair)
>
>  BU_MMA_3 (ASSEMBLE_PAIR,	"assemble_pair",	MISC,	mma_assemble_pair)
>  BU_MMA_3 (XVBF16GER2,		"xvbf16ger2",		MISC,	mma_xvbf16ger2)
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index 3bd89a79bad..ca0c75778a9 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -6325,6 +6325,22 @@ rs6000_discover_homogeneous_aggregate (machine_mode mode, const_tree type,
>  bool
>  rs6000_return_in_memory (const_tree type, const_tree fntype ATTRIBUTE_UNUSED)
>  {
> +  /* We do not allow MMA types being used as return values.  Only report
> +     the invalid return value usage the first time we encounter it.  */
> +  if (cfun
> +      && !cfun->machine->mma_return_type_error
> +      && TREE_TYPE (cfun->decl) == fntype
> +      && (TYPE_MODE (type) == OOmode || TYPE_MODE (type) == XOmode))
> +    {
> +      /* Record we have now handled function CFUN, so the next time we
> +	 are called, we do not re-report the same error.  */
> +      cfun->machine->mma_return_type_error = true;
> +      if (TYPE_CANONICAL (type) != NULL_TREE)
> +	type = TYPE_CANONICAL (type);
> +      error ("invalid use of MMA type %qs as a function return value",
> +	     IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (type))));
> +    }
> +
>    /* For the Darwin64 ABI, test if we can fit the return value in regs.  */
>    if (TARGET_MACHO
>        && rs6000_darwin64_abi
> @@ -6577,30 +6593,8 @@ machine_mode
>  rs6000_promote_function_mode (const_tree type ATTRIBUTE_UNUSED,
>  			      machine_mode mode,
>  			      int *punsignedp ATTRIBUTE_UNUSED,
> -			      const_tree, int for_return)
> +			      const_tree, int for_return ATTRIBUTE_UNUSED)
>  {
> -  /* Warning: this is a static local variable and not always NULL!
> -     This function is called multiple times for the same function
> -     and return value.  PREV_FUNC is used to keep track of the
> -     first time we encounter a function's return value in order
> -     to not report an error with that return value multiple times.  */
> -  static struct function *prev_func = NULL;
> -
> -  /* We do not allow MMA types being used as return values.  Only report
> -     the invalid return value usage the first time we encounter it.  */
> -  if (for_return
> -      && prev_func != cfun
> -      && (mode == POImode || mode == PXImode))
> -    {
> -      /* Record we have now handled function CFUN, so the next time we
> -	 are called, we do not re-report the same error.  */
> -      prev_func = cfun;
> -      if (TYPE_CANONICAL (type) != NULL_TREE)
> -	type = TYPE_CANONICAL (type);
> -      error ("invalid use of MMA type %qs as a function return value",
> -	     IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (type))));
> -    }
> -
>    PROMOTE_MODE (mode, *punsignedp, type);
>
>    return mode;
> @@ -7552,7 +7546,7 @@ rs6000_function_arg (cumulative_args_t cum_v, const function_arg_info &arg)
>        int n_elts;
>
>        /* We do not allow MMA types being used as function arguments.  */
> -      if (mode == POImode || mode == PXImode)
> +      if (mode == OOmode || mode == XOmode)
>  	{
>  	  if (TYPE_CANONICAL (type) != NULL_TREE)
>  	    type = TYPE_CANONICAL (type);
> @@ -10073,7 +10067,8 @@ mma_expand_builtin (tree exp, rtx target, bool *expandedp)
>      }
>
>    unsigned attr_args = attr & RS6000_BTC_OPND_MASK;
> -  if (attr & RS6000_BTC_QUAD)
> +  if (attr & RS6000_BTC_QUAD
> +      || fcode == MMA_BUILTIN_DISASSEMBLE_PAIR_INTERNAL)
>      attr_args++;
>
>    gcc_assert (nopnds == attr_args);
> @@ -11687,23 +11682,24 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi)
>    gimple *new_call;
>    tree new_decl;
>
> -  if (rs6000_builtin_info[fncode + 1].icode == CODE_FOR_nothing)
> +  if (fncode == MMA_BUILTIN_DISASSEMBLE_ACC
> +      || fncode == MMA_BUILTIN_DISASSEMBLE_PAIR)
>      {
>        /* This is an MMA disassemble built-in function.  */
> -      gcc_assert (fncode == MMA_BUILTIN_DISASSEMBLE_ACC
> -		  || fncode == MMA_BUILTIN_DISASSEMBLE_PAIR);
> -
>        push_gimplify_context (true);
> +      unsigned nvec = (fncode == MMA_BUILTIN_DISASSEMBLE_ACC) ? 4 : 2;
>        tree dst_ptr = gimple_call_arg (stmt, 0);
>        tree src_ptr = gimple_call_arg (stmt, 1);
>        tree src_type = TREE_TYPE (src_ptr);
>        tree src = make_ssa_name (TREE_TYPE (src_type));
>        gimplify_assign (src, build_simple_mem_ref (src_ptr), &new_seq);
>
> -      /* If we are not disassembling an accumulator or our destination is
> -	 another accumulator, then just copy the entire thing as is.  */
> -      if (fncode != MMA_BUILTIN_DISASSEMBLE_ACC
> -	  || TREE_TYPE (TREE_TYPE (dst_ptr)) == vector_quad_type_node)
> +      /* If we are not disassembling an accumulator/pair or our destination is
> +	 another accumulator/pair, then just copy the entire thing as is.  */
> +      if ((fncode == MMA_BUILTIN_DISASSEMBLE_ACC
> +	   && TREE_TYPE (TREE_TYPE (dst_ptr)) == vector_quad_type_node)
> +	  || (fncode == MMA_BUILTIN_DISASSEMBLE_PAIR
> +	      && TREE_TYPE (TREE_TYPE (dst_ptr)) == vector_pair_type_node))
>  	{
>  	  tree dst = build_simple_mem_ref (build1 (VIEW_CONVERT_EXPR,
>  						   src_type, dst_ptr));
> @@ -11713,29 +11709,33 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi)
>  	  return true;
>  	}
>
> -      /* We're disassembling an accumulator into a different type, so we need
> +      /* If we're disassembling an accumulator into a different type, we need
>  	 to emit a xxmfacc instruction now, since we cannot do it later.  */
> -      new_decl = rs6000_builtin_decls[MMA_BUILTIN_XXMFACC_INTERNAL];
> -      new_call = gimple_build_call (new_decl, 1, src);
> -      src = make_ssa_name (vector_quad_type_node);
> -      gimple_call_set_lhs (new_call, src);
> -      gimple_seq_add_stmt (&new_seq, new_call);
> +      if (fncode == MMA_BUILTIN_DISASSEMBLE_ACC)
> +	{
> +	  new_decl = rs6000_builtin_decls[MMA_BUILTIN_XXMFACC_INTERNAL];
> +	  new_call = gimple_build_call (new_decl, 1, src);
> +	  src = make_ssa_name (vector_quad_type_node);
> +	  gimple_call_set_lhs (new_call, src);
> +	  gimple_seq_add_stmt (&new_seq, new_call);
> +	}
>
> -      /* Copy the accumulator vector by vector.  */
> +      /* Copy the accumulator/pair vector by vector.  */
> +      new_decl = rs6000_builtin_decls[fncode + 1];
>        tree dst_type = build_pointer_type_for_mode (unsigned_V16QI_type_node,
>  						   ptr_mode, true);
>        tree dst_base = build1 (VIEW_CONVERT_EXPR, dst_type, dst_ptr);
> -      tree array_type = build_array_type_nelts (unsigned_V16QI_type_node, 4);
> -      tree src_array = build1 (VIEW_CONVERT_EXPR, array_type, src);
> -      for (unsigned i = 0; i < 4; i++)
> +      for (unsigned i = 0; i < nvec; i++)
>  	{
> -	  unsigned index = WORDS_BIG_ENDIAN ? i : 3 - i;
> -	  tree ref = build4 (ARRAY_REF, unsigned_V16QI_type_node, src_array,
> -			     build_int_cst (size_type_node, i),
> -			     NULL_TREE, NULL_TREE);
> +	  unsigned index = WORDS_BIG_ENDIAN ? i : nvec - 1 - i;
>  	  tree dst = build2 (MEM_REF, unsigned_V16QI_type_node, dst_base,
>  			     build_int_cst (dst_type, index * 16));
> -	  gimplify_assign (dst, ref, &new_seq);
> +	  tree dstssa = make_ssa_name (unsigned_V16QI_type_node);
> +	  new_call = gimple_build_call (new_decl, 2, src,
> +					build_int_cstu (uint16_type_node, i));
> +	  gimple_call_set_lhs (new_call, dstssa);
> +	  gimple_seq_add_stmt (&new_seq, new_call);
> +	  gimplify_assign (dst, dstssa, &new_seq);
> 	}
>        pop_gimplify_context (NULL);
>        gsi_replace_with_seq (gsi, new_seq, true);
> @@ -13190,17 +13190,23 @@ rs6000_init_builtins (void)
>    /* Vector pair and vector quad support.
*/ > if (TARGET_EXTRA_BUILTINS) > { > - vector_pair_type_node = make_unsigned_type (256); > + vector_pair_type_node = make_node (OPAQUE_TYPE); > + SET_TYPE_MODE (vector_pair_type_node, OOmode); > + TYPE_SIZE (vector_pair_type_node) = bitsize_int (GET_MODE_BITSIZE > (OOmode)); > + TYPE_PRECISION (vector_pair_type_node) = GET_MODE_BITSIZE (OOmode); > + TYPE_SIZE_UNIT (vector_pair_type_node) = size_int (GET_MODE_SIZE > (OOmode)); > SET_TYPE_ALIGN (vector_pair_type_node, 256); > - SET_TYPE_MODE (vector_pair_type_node, POImode); > - layout_type (vector_pair_type_node); > + TYPE_USER_ALIGN (vector_pair_type_node) = 0; > lang_hooks.types.register_builtin_type (vector_pair_type_node, > "__vector_pair"); > > - vector_quad_type_node = make_unsigned_type (512); > + vector_quad_type_node = make_node (OPAQUE_TYPE); > + SET_TYPE_MODE (vector_quad_type_node, XOmode); > + TYPE_SIZE (vector_quad_type_node) = bitsize_int (GET_MODE_BITSIZE > (XOmode)); > + TYPE_PRECISION (vector_quad_type_node) = GET_MODE_BITSIZE (XOmode); > + TYPE_SIZE_UNIT (vector_quad_type_node) = size_int (GET_MODE_SIZE > (XOmode)); > SET_TYPE_ALIGN (vector_quad_type_node, 512); > - SET_TYPE_MODE (vector_quad_type_node, PXImode); > - layout_type (vector_quad_type_node); > + TYPE_USER_ALIGN (vector_quad_type_node) = 0; > lang_hooks.types.register_builtin_type (vector_quad_type_node, > "__vector_quad"); > } > @@ -13236,8 +13242,8 @@ rs6000_init_builtins (void) > builtin_mode_to_type[V8HImode][1] = unsigned_V8HI_type_node; > builtin_mode_to_type[V16QImode][0] = V16QI_type_node; > builtin_mode_to_type[V16QImode][1] = unsigned_V16QI_type_node; > - builtin_mode_to_type[POImode][1] = vector_pair_type_node; > - builtin_mode_to_type[PXImode][1] = vector_quad_type_node; > + builtin_mode_to_type[OOmode][1] = vector_pair_type_node; > + builtin_mode_to_type[XOmode][1] = vector_quad_type_node; > > tdecl = add_builtin_type ("__bool char", bool_char_type_node); > TYPE_NAME (bool_char_type_node) = tdecl; > @@ -14049,21 +14055,21 
@@ mma_init_builtins (void) > } > else > { > - if ((attr & RS6000_BTC_QUAD) == 0) > + if ( !(d->code == MMA_BUILTIN_DISASSEMBLE_ACC_INTERNAL > + || d->code == MMA_BUILTIN_DISASSEMBLE_PAIR_INTERNAL) > + && (attr & RS6000_BTC_QUAD) == 0) > attr_args--; > > /* Ensure we have the correct number and type of operands. */ > gcc_assert (attr_args == insn_data[icode].n_operands - 1); > } > > - if (icode == CODE_FOR_nothing) > + /* This is a disassemble pair/acc function. */ > + if (d->code == MMA_BUILTIN_DISASSEMBLE_ACC > + || d->code == MMA_BUILTIN_DISASSEMBLE_PAIR) > { > - /* This is a disassemble MMA built-in function. */ > - gcc_assert (attr_args == RS6000_BTC_BINARY > - && (d->code == MMA_BUILTIN_DISASSEMBLE_ACC > - || d->code == MMA_BUILTIN_DISASSEMBLE_PAIR)); > op[nopnds++] = build_pointer_type (void_type_node); > - if (attr & RS6000_BTC_QUAD) > + if (d->code == MMA_BUILTIN_DISASSEMBLE_ACC) > op[nopnds++] = build_pointer_type (vector_quad_type_node); > else > op[nopnds++] = build_pointer_type (vector_pair_type_node); > @@ -14071,13 +14077,17 @@ mma_init_builtins (void) > else > { > /* This is a normal MMA built-in function. */ > - unsigned j = (attr & RS6000_BTC_QUAD) ? 
1 : 0; > + unsigned j = 0; > + if (attr & RS6000_BTC_QUAD > + && d->code != MMA_BUILTIN_DISASSEMBLE_ACC_INTERNAL > + && d->code != MMA_BUILTIN_DISASSEMBLE_PAIR_INTERNAL) > + j = 1; > for (; j < (unsigned) insn_data[icode].n_operands; j++) > { > machine_mode mode = insn_data[icode].operand[j].mode; > - if (gimple_func && mode == PXImode) > + if (gimple_func && mode == XOmode) > op[nopnds++] = build_pointer_type (vector_quad_type_node); > - else if (gimple_func && mode == POImode > + else if (gimple_func && mode == OOmode > && d->code == MMA_BUILTIN_ASSEMBLE_PAIR) > op[nopnds++] = build_pointer_type (vector_pair_type_node); > else > diff --git a/gcc/config/rs6000/rs6000-modes.def > b/gcc/config/rs6000/rs6000-modes.def > index ddb218b3fba..e81a32c8c36 100644 > --- a/gcc/config/rs6000/rs6000-modes.def > +++ b/gcc/config/rs6000/rs6000-modes.def > @@ -83,12 +83,6 @@ VECTOR_MODE (INT, SI, 2); /* V2SI */ > combination. */ > PARTIAL_INT_MODE (TI, 128, PTI); > > -/* Define, but don't use the larger integer modes. We need an integer mode > - defined that is the same size as the vector pair and vector quad modes. > */ > - > -INT_MODE (OI, 32); > -INT_MODE (XI, 64); > - > /* Modes used by __vector_pair and __vector_quad. */ > -PARTIAL_INT_MODE (OI, 256, POI); /* __vector_pair. */ > -PARTIAL_INT_MODE (XI, 512, PXI); /* __vector_quad. */ > +OPAQUE_MODE (OO, 32); > +OPAQUE_MODE (XO, 64); > diff --git a/gcc/config/rs6000/rs6000-string.c > b/gcc/config/rs6000/rs6000-string.c > index 82cc24ecdda..a2e6821d353 100644 > --- a/gcc/config/rs6000/rs6000-string.c > +++ b/gcc/config/rs6000/rs6000-string.c > @@ -2787,7 +2787,7 @@ expand_block_move (rtx operands[], bool might_overlap) > rtx src, dest; > bool move_with_length = false; > > - /* Use POImode for paired vsx load/store. Use V2DI for single > + /* Use OOmode for paired vsx load/store. Use V2DI for single > unaligned vsx load/store, for consistency with what other > expansions (compare) already do, and so we can use lxvd2x on > p8. 
Order is VSX pair unaligned, VSX unaligned, Altivec, VSX > @@ -2799,8 +2799,8 @@ expand_block_move (rtx operands[], bool might_overlap) > && (align >= 256 || !STRICT_ALIGNMENT)) > { > move_bytes = 32; > - mode = POImode; > - gen_func.mov = gen_movpoi; > + mode = OOmode; > + gen_func.mov = gen_movoo; > } > else if (TARGET_POWERPC64 && TARGET_BLOCK_OPS_UNALIGNED_VSX > && VECTOR_MEM_VSX_P (V2DImode) > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > index d7dcd93f088..bd8205c87f7 100644 > --- a/gcc/config/rs6000/rs6000.c > +++ b/gcc/config/rs6000/rs6000.c > @@ -1826,15 +1826,12 @@ rs6000_hard_regno_mode_ok_uncached (int regno, > machine_mode mode) > mode = GET_MODE_INNER (mode); > > /* Vector pair modes need even/odd VSX register pairs. Only allow vector > - registers. We need to allow OImode to have the same registers as > POImode, > - even though we do not enable the move pattern for OImode. */ > - if (mode == POImode || mode == OImode) > + registers. */ > + if (mode == OOmode) > return (TARGET_MMA && VSX_REGNO_P (regno) && (regno & 1) == 0); > > - /* MMA accumulator modes need FPR registers divisible by 4. We need to > allow > - XImode to have the same registers as PXImode, even though we do not > enable > - the move pattern for XImode. */ > - if (mode == PXImode || mode == XImode) > + /* MMA accumulator modes need FPR registers divisible by 4. */ > + if (mode == XOmode) > return (TARGET_MMA && FP_REGNO_P (regno) && (regno & 3) == 0); > > /* PTImode can only go in GPRs. Quad word memory operations require > even/odd > @@ -1941,8 +1938,8 @@ rs6000_hard_regno_mode_ok (unsigned int regno, > machine_mode mode) > GPR registers, and TImode can go in any GPR as well as VSX registers (PR > 57744). 
> > - Similarly, don't allow POImode (vector pair, restricted to even VSX > - registers) or PXImode (vector quad, restricted to FPR registers divisible > + Similarly, don't allow OOmode (vector pair, restricted to even VSX > + registers) or XOmode (vector quad, restricted to FPR registers divisible > by 4) to tie with other modes. > > Altivec/VSX vector tests were moved ahead of scalar float mode, so that > IEEE > @@ -1951,8 +1948,8 @@ rs6000_hard_regno_mode_ok (unsigned int regno, > machine_mode mode) > static bool > rs6000_modes_tieable_p (machine_mode mode1, machine_mode mode2) > { > - if (mode1 == PTImode || mode1 == POImode || mode1 == PXImode > - || mode2 == PTImode || mode2 == POImode || mode2 == PXImode) > + if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode > + || mode2 == PTImode || mode2 == OOmode || mode2 == XOmode) > return mode1 == mode2; > > if (ALTIVEC_OR_VSX_VECTOR_MODE (mode1)) > @@ -2241,10 +2238,8 @@ rs6000_debug_reg_global (void) > V2DFmode, > V8SFmode, > V4DFmode, > - OImode, > - XImode, > - POImode, > - PXImode, > + OOmode, > + XOmode, > CCmode, > CCUNSmode, > CCEQmode, > @@ -2706,13 +2701,13 @@ rs6000_setup_reg_addr_masks (void) > since it will be broken into two vector moves. Vector quads can > only do offset loads. */ > else if ((addr_mask != 0) && TARGET_MMA > - && (m2 == POImode || m2 == PXImode)) > + && (m2 == OOmode || m2 == XOmode)) > { > addr_mask |= RELOAD_REG_OFFSET; > if (rc == RELOAD_REG_FPR || rc == RELOAD_REG_VMX) > { > addr_mask |= RELOAD_REG_QUAD_OFFSET; > - if (m2 == POImode) > + if (m2 == OOmode) > addr_mask |= RELOAD_REG_INDEXED; > } > } > @@ -2921,13 +2916,13 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p) > /* Add support for vector pairs and vector quad registers. 
*/ > if (TARGET_MMA) > { > - rs6000_vector_unit[POImode] = VECTOR_NONE; > - rs6000_vector_mem[POImode] = VECTOR_VSX; > - rs6000_vector_align[POImode] = 256; > + rs6000_vector_unit[OOmode] = VECTOR_NONE; > + rs6000_vector_mem[OOmode] = VECTOR_VSX; > + rs6000_vector_align[OOmode] = 256; > > - rs6000_vector_unit[PXImode] = VECTOR_NONE; > - rs6000_vector_mem[PXImode] = VECTOR_VSX; > - rs6000_vector_align[PXImode] = 512; > + rs6000_vector_unit[XOmode] = VECTOR_NONE; > + rs6000_vector_mem[XOmode] = VECTOR_VSX; > + rs6000_vector_align[XOmode] = 512; > } > > /* Register class constraints for the constraints that depend on compile > @@ -3064,10 +3059,10 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p) > > if (TARGET_MMA) > { > - reg_addr[POImode].reload_store = CODE_FOR_reload_poi_di_store; > - reg_addr[POImode].reload_load = CODE_FOR_reload_poi_di_load; > - reg_addr[PXImode].reload_store = CODE_FOR_reload_pxi_di_store; > - reg_addr[PXImode].reload_load = CODE_FOR_reload_pxi_di_load; > + reg_addr[OOmode].reload_store = CODE_FOR_reload_oo_di_store; > + reg_addr[OOmode].reload_load = CODE_FOR_reload_oo_di_load; > + reg_addr[XOmode].reload_store = CODE_FOR_reload_xo_di_store; > + reg_addr[XOmode].reload_load = CODE_FOR_reload_xo_di_load; > } > } > } > @@ -8129,8 +8124,8 @@ reg_offset_addressing_ok_p (machine_mode mode) > > /* The vector pair/quad types support offset addressing if the > underlying vectors support offset addressing. */ > - case E_POImode: > - case E_PXImode: > + case E_OOmode: > + case E_XOmode: > return TARGET_MMA; > > case E_SDmode: > @@ -10323,11 +10318,11 @@ rs6000_emit_move (rtx dest, rtx source, > machine_mode mode) > operands[1] = force_const_mem (mode, operands[1]); > break; > > - case E_POImode: > - case E_PXImode: > + case E_OOmode: > + case E_XOmode: > if (CONST_INT_P (operands[1]) && INTVAL (operands[1]) != 0) > error ("%qs is an opaque type, and you can't set it to other values.", > - (mode == POImode) ? 
"__vector_pair" : "__vector_quad"); > + (mode == OOmode) ? "__vector_pair" : "__vector_quad"); > break; > > case E_SImode: > @@ -12596,10 +12591,10 @@ rs6000_preferred_reload_class (rtx x, enum > reg_class rclass) > the GPR registers. */ > if (rclass == GEN_OR_FLOAT_REGS) > { > - if (mode == POImode) > + if (mode == OOmode) > return VSX_REGS; > > - if (mode == PXImode) > + if (mode == XOmode) > return FLOAT_REGS; > > if (GET_MODE_CLASS (mode) == MODE_INT) > @@ -16323,15 +16318,15 @@ rs6000_split_multireg_move (rtx dst, rtx src) > > /* If we have a vector quad register for MMA, and this is a load or store, > see if we can use vector paired load/stores. */ > - if (mode == PXImode && TARGET_MMA > + if (mode == XOmode && TARGET_MMA > && (MEM_P (dst) || MEM_P (src))) > { > - reg_mode = POImode; > + reg_mode = OOmode; > nregs /= 2; > } > /* If we have a vector pair/quad mode, split it into two/four separate > vectors. */ > - else if (mode == POImode || mode == PXImode) > + else if (mode == OOmode || mode == XOmode) > reg_mode = V1TImode; > else if (FP_REGNO_P (reg)) > reg_mode = DECIMAL_FLOAT_MODE_P (mode) ? DDmode : > @@ -16377,12 +16372,16 @@ rs6000_split_multireg_move (rtx dst, rtx src) > return; > } > > - /* The __vector_pair and __vector_quad modes are multi-register modes, > - so if have to load or store the registers, we have to be careful to > - properly swap them if we're in little endian mode below. This means > - the last register gets the first memory location. */ > - if (mode == POImode || mode == PXImode) > + /* The __vector_pair and __vector_quad modes are multi-register > + modes, so if we have to load or store the registers, we have to be > + careful to properly swap them if we're in little endian mode > + below. This means the last register gets the first memory > + location. We also need to be careful of using the right register > + numbers if we are splitting XO to OO. 
*/ > + if (mode == OOmode || mode == XOmode) > { > + nregs = hard_regno_nregs (reg, mode); > + int reg_mode_nregs = hard_regno_nregs (reg, reg_mode); > if (MEM_P (dst)) > { > unsigned offset = 0; > @@ -16391,15 +16390,15 @@ rs6000_split_multireg_move (rtx dst, rtx src) > /* If we are reading an accumulator register, we have to > deprime it before we can access it. */ > if (TARGET_MMA > - && GET_MODE (src) == PXImode && FP_REGNO_P (REGNO (src))) > + && GET_MODE (src) == XOmode && FP_REGNO_P (REGNO (src))) > emit_insn (gen_mma_xxmfacc (src, src)); > > - for (int i = 0; i < nregs; i++) > + for (int i = 0; i < nregs; i += reg_mode_nregs) > { > - unsigned subreg = (WORDS_BIG_ENDIAN) > - ? i * size : (nregs - 1 - i) * size; > + unsigned subreg = > + (WORDS_BIG_ENDIAN) ? i : (nregs - reg_mode_nregs - i); > rtx dst2 = adjust_address (dst, reg_mode, offset); > - rtx src2 = simplify_gen_subreg (reg_mode, src, mode, subreg); > + rtx src2 = gen_rtx_REG (reg_mode, reg + subreg); > offset += size; > emit_insn (gen_rtx_SET (dst2, src2)); > } > @@ -16412,11 +16411,11 @@ rs6000_split_multireg_move (rtx dst, rtx src) > unsigned offset = 0; > unsigned size = GET_MODE_SIZE (reg_mode); > > - for (int i = 0; i < nregs; i++) > + for (int i = 0; i < nregs; i += reg_mode_nregs) > { > - unsigned subreg = (WORDS_BIG_ENDIAN) > - ? i * size : (nregs - 1 - i) * size; > - rtx dst2 = simplify_gen_subreg (reg_mode, dst, mode, subreg); > + unsigned subreg = > + (WORDS_BIG_ENDIAN) ? i : (nregs - reg_mode_nregs - i); > + rtx dst2 = gen_rtx_REG (reg_mode, reg + subreg); > rtx src2 = adjust_address (src, reg_mode, offset); > offset += size; > emit_insn (gen_rtx_SET (dst2, src2)); > @@ -16425,7 +16424,7 @@ rs6000_split_multireg_move (rtx dst, rtx src) > /* If we are writing an accumulator register, we have to > prime it after we've written it. 
*/ > if (TARGET_MMA > - && GET_MODE (dst) == PXImode && FP_REGNO_P (REGNO (dst))) > + && GET_MODE (dst) == XOmode && FP_REGNO_P (REGNO (dst))) > emit_insn (gen_mma_xxmtacc (dst, dst)); > > return; > @@ -16433,9 +16432,12 @@ rs6000_split_multireg_move (rtx dst, rtx src) > > if (GET_CODE (src) == UNSPEC) > { > - gcc_assert (REG_P (dst) > - && FP_REGNO_P (REGNO (dst)) > - && XINT (src, 1) == UNSPEC_MMA_ASSEMBLE_ACC); > + gcc_assert (XINT (src, 1) == UNSPEC_MMA_ASSEMBLE); > + gcc_assert (REG_P (dst)); > + if (GET_MODE (src) == XOmode) > + gcc_assert (FP_REGNO_P (REGNO (dst))); > + if (GET_MODE (src) == OOmode) > + gcc_assert (VSX_REGNO_P (REGNO (dst))); > > reg_mode = GET_MODE (XVECEXP (src, 0, 0)); > for (int i = 0; i < XVECLEN (src, 0); i++) > @@ -16446,7 +16448,8 @@ rs6000_split_multireg_move (rtx dst, rtx src) > > /* We are writing an accumulator register, so we have to > prime it after we've written it. */ > - emit_insn (gen_mma_xxmtacc (dst, dst)); > + if (GET_MODE (src) == XOmode) > + emit_insn (gen_mma_xxmtacc (dst, dst)); > > return; > } > @@ -16459,22 +16462,35 @@ rs6000_split_multireg_move (rtx dst, rtx src) > /* If we are reading an accumulator register, we have to > deprime it before we can access it. */ > if (TARGET_MMA > - && GET_MODE (src) == PXImode && FP_REGNO_P (REGNO (src))) > + && GET_MODE (src) == XOmode && FP_REGNO_P (REGNO (src))) > emit_insn (gen_mma_xxmfacc (src, src)); > > /* Move register range backwards, if we might have destructive > overlap. */ > int i; > - for (i = nregs - 1; i >= 0; i--) > - emit_insn (gen_rtx_SET (simplify_gen_subreg (reg_mode, dst, mode, > - i * reg_mode_size), > - simplify_gen_subreg (reg_mode, src, mode, > - i * reg_mode_size))); > + /* XO/OO are opaque so cannot use subregs. 
*/ > + if (mode == OOmode || mode == XOmode ) > + { > + for (i = nregs - 1; i >= 0; i--) > + { > + rtx dst_i = gen_rtx_REG (reg_mode, REGNO (dst) + i); > + rtx src_i = gen_rtx_REG (reg_mode, REGNO (src) + i); > + emit_insn (gen_rtx_SET (dst_i, src_i)); > + } > + } > + else > + { > + for (i = nregs - 1; i >= 0; i--) > + emit_insn (gen_rtx_SET (simplify_gen_subreg (reg_mode, dst, mode, > + i * reg_mode_size), > + simplify_gen_subreg (reg_mode, src, mode, > + i * reg_mode_size))); > + } > > /* If we are writing an accumulator register, we have to > prime it after we've written it. */ > if (TARGET_MMA > - && GET_MODE (dst) == PXImode && FP_REGNO_P (REGNO (dst))) > + && GET_MODE (dst) == XOmode && FP_REGNO_P (REGNO (dst))) > emit_insn (gen_mma_xxmtacc (dst, dst)); > } > else > @@ -16611,7 +16627,7 @@ rs6000_split_multireg_move (rtx dst, rtx src) > /* If we are reading an accumulator register, we have to > deprime it before we can access it. */ > if (TARGET_MMA && REG_P (src) > - && GET_MODE (src) == PXImode && FP_REGNO_P (REGNO (src))) > + && GET_MODE (src) == XOmode && FP_REGNO_P (REGNO (src))) > emit_insn (gen_mma_xxmfacc (src, src)); > > for (i = 0; i < nregs; i++) > @@ -16626,16 +16642,24 @@ rs6000_split_multireg_move (rtx dst, rtx src) > if (j == 0 && used_update) > continue; > > - emit_insn (gen_rtx_SET (simplify_gen_subreg (reg_mode, dst, mode, > - j * reg_mode_size), > - simplify_gen_subreg (reg_mode, src, mode, > - j * reg_mode_size))); > + /* XO/OO are opaque so cannot use subregs. 
*/ > + if (mode == OOmode || mode == XOmode ) > + { > + rtx dst_i = gen_rtx_REG (reg_mode, REGNO (dst) + j); > + rtx src_i = gen_rtx_REG (reg_mode, REGNO (src) + j); > + emit_insn (gen_rtx_SET (dst_i, src_i)); > + } > + else > + emit_insn (gen_rtx_SET (simplify_gen_subreg (reg_mode, dst, mode, > + j * reg_mode_size), > + simplify_gen_subreg (reg_mode, src, mode, > + j * reg_mode_size))); > } > > /* If we are writing an accumulator register, we have to > prime it after we've written it. */ > if (TARGET_MMA && REG_P (dst) > - && GET_MODE (dst) == PXImode && FP_REGNO_P (REGNO (dst))) > + && GET_MODE (dst) == XOmode && FP_REGNO_P (REGNO (dst))) > emit_insn (gen_mma_xxmtacc (dst, dst)); > > if (restore_basereg != NULL_RTX) > @@ -19865,7 +19889,8 @@ rs6000_mangle_type (const_tree type) > type = TYPE_MAIN_VARIANT (type); > > if (TREE_CODE (type) != VOID_TYPE && TREE_CODE (type) != BOOLEAN_TYPE > - && TREE_CODE (type) != INTEGER_TYPE && TREE_CODE (type) != REAL_TYPE) > + && TREE_CODE (type) != INTEGER_TYPE && TREE_CODE (type) != REAL_TYPE > + && TREE_CODE (type) != OPAQUE_TYPE) > return NULL; > > if (type == bool_char_type_node) return "U6__boolc"; > @@ -21753,6 +21778,14 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int > outer_code, > } > break; > > + case UNSPEC: > + if (XINT (x, 1) == UNSPEC_MMA_XXSETACCZ) > + { > + *total = 0; > + return true; > + } > + break; > + > default: > break; > } > @@ -27186,14 +27219,14 @@ rs6000_invalid_conversion (const_tree fromtype, > const_tree totype) > > if (frommode != tomode) > { > - /* Do not allow conversions to/from PXImode and POImode types. */ > - if (frommode == PXImode) > + /* Do not allow conversions to/from XOmode and OOmode types. 
*/ > + if (frommode == XOmode) > return N_("invalid conversion from type %<__vector_quad%>"); > - if (tomode == PXImode) > + if (tomode == XOmode) > return N_("invalid conversion to type %<__vector_quad%>"); > - if (frommode == POImode) > + if (frommode == OOmode) > return N_("invalid conversion from type %<__vector_pair%>"); > - if (tomode == POImode) > + if (tomode == OOmode) > return N_("invalid conversion to type %<__vector_pair%>"); > } > else if (POINTER_TYPE_P (fromtype) && POINTER_TYPE_P (totype)) > @@ -27202,19 +27235,19 @@ rs6000_invalid_conversion (const_tree fromtype, > const_tree totype) > frommode = TYPE_MODE (TREE_TYPE (fromtype)); > tomode = TYPE_MODE (TREE_TYPE (totype)); > > - /* Do not allow conversions to/from PXImode and POImode pointer > + /* Do not allow conversions to/from XOmode and OOmode pointer > types, except to/from void pointers. */ > if (frommode != tomode > && frommode != VOIDmode > && tomode != VOIDmode) > { > - if (frommode == PXImode) > + if (frommode == XOmode) > return N_("invalid conversion from type %<* __vector_quad%>"); > - if (tomode == PXImode) > + if (tomode == XOmode) > return N_("invalid conversion to type %<* __vector_quad%>"); > - if (frommode == POImode) > + if (frommode == OOmode) > return N_("invalid conversion from type %<* __vector_pair%>"); > - if (tomode == POImode) > + if (tomode == OOmode) > return N_("invalid conversion to type %<* __vector_pair%>"); > } > } > diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h > index 5a47aa14722..f35aaf4ffd1 100644 > --- a/gcc/config/rs6000/rs6000.h > +++ b/gcc/config/rs6000/rs6000.h > @@ -1041,7 +1041,7 @@ enum data_align { align_abi, align_opt, align_both }; > /* Modes that are not vectors, but require vector alignment. Treat these like > vectors in terms of loads and stores. 
*/ > #define VECTOR_ALIGNMENT_P(MODE) \ > - (FLOAT128_VECTOR_P (MODE) || (MODE) == POImode || (MODE) == PXImode) > + (FLOAT128_VECTOR_P (MODE) || (MODE) == OOmode || (MODE) == XOmode) > > #define ALTIVEC_VECTOR_MODE(MODE) \ > ((MODE) == V16QImode > \ > @@ -2556,6 +2556,7 @@ typedef struct GTY(()) machine_function > bool fpr_is_wrapped_separately[32]; > bool lr_is_wrapped_separately; > bool toc_is_wrapped_separately; > + bool mma_return_type_error; > } machine_function; > #endif > > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index 5e5ad9f7c3d..b3f77ec665c 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -778,7 +778,7 @@ (define_mode_attr BOOL_REGS_UNARY [(TI "r,0,0,wa,v") > ;; supplement addressing modes. > (define_mode_iterator RELOAD [V16QI V8HI V4SI V2DI V4SF V2DF V1TI > SF SD SI DF DD DI TI PTI KF IF TF > - POI PXI]) > + OO XO]) > > ;; Iterate over smin, smax > (define_code_iterator fp_minmax [smin smax]) > diff --git a/gcc/testsuite/gcc.target/powerpc/mma-double-test.c > b/gcc/testsuite/gcc.target/powerpc/mma-double-test.c > index 53843794a95..254af7f8f79 100755 > --- a/gcc/testsuite/gcc.target/powerpc/mma-double-test.c > +++ b/gcc/testsuite/gcc.target/powerpc/mma-double-test.c > @@ -181,6 +181,9 @@ main (int argc, char *argv[]) > printf ("MMA double test fail: %d errors\n",ret); > else > printf ("MMA single test success: 0 MMA errors\n"); > +#else > + if (ret) > + abort(); > #endif > > return ret; > diff --git a/gcc/testsuite/gcc.target/powerpc/mma-single-test.c > b/gcc/testsuite/gcc.target/powerpc/mma-single-test.c > index ac4125ba329..ebbc5ae2e1b 100755 > --- a/gcc/testsuite/gcc.target/powerpc/mma-single-test.c > +++ b/gcc/testsuite/gcc.target/powerpc/mma-single-test.c > @@ -189,6 +189,9 @@ main (int argc, char *argv[]) > printf ("MMA single test fail: %d errors\n",ret); > else > printf ("MMA single test success: 0 MMA errors\n"); > +#else > + if (ret) > + abort(); > #endif > > return ret; > 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr96506.c b/gcc/testsuite/gcc.target/powerpc/pr96506-1.c > similarity index 61% > rename from gcc/testsuite/gcc.target/powerpc/pr96506.c > rename to gcc/testsuite/gcc.target/powerpc/pr96506-1.c > index b1b40c5a5c8..91835cec30c 100644 > --- a/gcc/testsuite/gcc.target/powerpc/pr96506.c > +++ b/gcc/testsuite/gcc.target/powerpc/pr96506-1.c > @@ -40,27 +40,3 @@ foo3 (void) > vquad_t v; > bar3 (v); /* { dg-error "invalid use of MMA operand of type .__vector_quad. as a function parameter" } */ > } > - > -__vector_pair > -foo4 (__vector_pair *src) /* { dg-error "invalid use of MMA type .__vector_pair. as a function return value" } */ > -{ > - return *src; > -} > - > -vpair_t > -foo5 (vpair_t *src) /* { dg-error "invalid use of MMA type .__vector_pair. as a function return value" } */ > -{ > - return *src; > -} > - > -__vector_quad > -foo6 (__vector_quad *src) /* { dg-error "invalid use of MMA type .__vector_quad. as a function return value" } */ > -{ > - return *src; > -} > - > -vquad_t > -foo7 (vquad_t *src) /* { dg-error "invalid use of MMA type .__vector_quad. as a function return value" } */ > -{ > - return *src; > -} > diff --git a/gcc/testsuite/gcc.target/powerpc/pr96506-2.c b/gcc/testsuite/gcc.target/powerpc/pr96506-2.c > new file mode 100644 > index 00000000000..9cffd2576c9 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr96506-2.c > @@ -0,0 +1,38 @@ > +/* PR target/96506 */ > +/* { dg-do compile } */ > +/* { dg-require-effective-target power10_ok } */ > +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ > + > +extern void bar0(); > +extern void bar1(); > +extern void bar2(); > +extern void bar3(); > + > +typedef __vector_pair vpair_t; > +typedef __vector_quad vquad_t; > + > +/* Verify we flag errors on the following. */ > + > +__vector_pair > +foo4 (__vector_pair *src) > +{ /* { dg-error "invalid use of MMA type .__vector_pair.
as a function return value" } */ > + return *src; > +} > + > +vpair_t > +foo5 (vpair_t *src) > +{ /* { dg-error "invalid use of MMA type .__vector_pair. as a function return value" } */ > + return *src; > +} > + > +__vector_quad > +foo6 (__vector_quad *src) > +{ /* { dg-error "invalid use of MMA type .__vector_quad. as a function return value" } */ > + return *src; > +} > + > +vquad_t > +foo7 (vquad_t *src) > +{ /* { dg-error "invalid use of MMA type .__vector_quad. as a function return value" } */ > + return *src; > +} > -- > 2.18.4 >