Hi! As Peter remarked, the input_operand's in the disassemble patterns are better as something more specific (input_operand has everything a "mov" pattern can handle, but pretty much nothing else can on a load/store architecture like Power -- it will likely still work, but only after reloads, so not optimal).
On Thu, Nov 19, 2020 at 12:58:47PM -0600, acsaw...@linux.ibm.com wrote: > Thanks for the reviews, here's the updated patch after fixing those things. > We now have an UNSPEC for xxsetaccz, and an accompanying change to > rs6000_rtx_costs to make it be cost 0 so that CSE doesn't try to replace it > with a bunch of register moves. > +;; The MMA patterns use the multi-register XOmode and OOmode opaque > +;; modes to implement the target specific __vector_quad and > +;; __vector_pair types that the MMA built-in functions reference. We > +;; use OPAQUE_MODE to prevent anything from trying to open them up. Great comment :-) > +(define_expand "mma_disassemble_pair" > + [(match_operand:V16QI 0 "mma_disassemble_output_operand") > + (match_operand:OO 1 "input_operand") > + (match_operand 2 "const_0_to_1_operand")] > + "TARGET_MMA" > +{ > + rtx src; > + int regoff = INTVAL (operands[2]); > + src = gen_rtx_UNSPEC (V16QImode, > + gen_rtvec (2, operands[1], GEN_INT (regoff)), > + UNSPEC_MMA_EXTRACT); > + emit_move_insn (operands[0], src); > + DONE; > +}) Please use tabs for every leading 8 spaces. > +(define_insn_and_split "*mma_disassemble_pair" > + [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa") > + (unspec:V16QI [(match_operand:OO 1 "input_operand" "wa") > + (match_operand 2 "const_0_to_1_operand")] > + UNSPEC_MMA_EXTRACT))] You do it for the last line here, but not the preceding two. There are no lines in mma.md that get this wrong right now :-) > @@ -14049,21 +14055,21 @@ mma_init_builtins (void) > } > else > { > - if ((attr & RS6000_BTC_QUAD) == 0) > + if ( !(d->code == MMA_BUILTIN_DISASSEMBLE_ACC_INTERNAL > + || d->code == MMA_BUILTIN_DISASSEMBLE_PAIR_INTERNAL) > + && (attr & RS6000_BTC_QUAD) == 0) > attr_args--; (surplus space right before the !) Okay for trunk with those nits fixed. Thanks! (The input_operand thing can be a follow-up patch; it *does* generate correct code like this, just not always optimal). Segher