Hi, Comments inlined. The updated patch will be sent in another email as this message is already long.
> Hi Robert, > > Next batch of comments. This set covers the rest of mips-msa.md. > > >+++ b/gcc/config/mips/mips-msa.md > >+(define_expand "vec_perm<mode>" > >+ [(match_operand:MSA 0 "register_operand") > >+ (match_operand:MSA 1 "register_operand") > >+ (match_operand:MSA 2 "register_operand") > >+ (match_operand:<VIMODE> 3 "register_operand")] > >+ "ISA_HAS_MSA" > >+{ > >+ /* The optab semantics are that index 0 selects the first element > >+ of operands[1] and the highest index selects the last element > >+ of operands[2]. This is the oppossite order from "vshf.df wd,rs,wt" > >+ where index 0 selects the first element of wt and the highest index > >+ selects the last element of ws. We therefore swap the operands here. > */ > >+ emit_insn (gen_msa_vshf<mode> (operands[0], operands[3], operands[2], > >+ operands[1])); > >+ DONE; > >+}) > > Can you make this the real instruction instead of msa_vshf and give it a > proper pattern (vec_select, vec_concat) etc. Swap the builtin to target > this pattern and swap the operands for the builtin expansion in C code > like you have done for some other patterns. Done. > > >+(define_expand "neg<mode>2" > >+ [(match_operand:IMSA 0 "register_operand") > >+ (match_operand:IMSA 1 "register_operand")] > >+ "ISA_HAS_MSA" > >+{ > >+ rtx reg = gen_reg_rtx (<MODE>mode); > >+ emit_insn (gen_msa_ldi<mode> (reg, const0_rtx)); > >+ emit_insn (gen_sub<mode>3 (operands[0], reg, operands[1])); > >+ DONE; > >+}) > >+ > >+(define_expand "neg<mode>2" > >+ [(match_operand:FMSA 0 "register_operand") > >+ (match_operand:FMSA 1 "register_operand")] > >+ "ISA_HAS_MSA" > >+{ > >+ rtx reg = gen_reg_rtx (<MODE>mode); > >+ emit_move_insn (reg, CONST0_RTX (<MODE>mode)); > >+ emit_insn (gen_sub<mode>3 (operands[0], reg, operands[1])); > >+ DONE; > >+}) > > Can't these two collapse into one like this? > > (define_expand "neg<mode>2" > [(set (match_operand:MSA 0 "register_operand") > (minus:MSA (match_dup 2) > (match_operand:MSA 1 "register_operand")))] > "ISA_HAS_MSA" > { > operands[2] = CONST0_RTX (<MODE>mode); > }) > > I'd hope the const_vector then gets emitted as an LDI? I haven't > checked that there is a pattern for using LDI for FP const_vector > moves. Done with minor changes. The above is close but we still need the CONST0_RTX to be placed in a register as we won't match the sub pattern. > > >+(define_expand "msa_ldi<mode>" > >+ [(match_operand:IMSA 0 "register_operand") > >+ (match_operand 1 "const_imm10_operand")] > >+ "ISA_HAS_MSA" > >+{ > >+ unsigned n_elts = GET_MODE_NUNITS (<MODE>mode); > >+ rtvec v = rtvec_alloc (n_elts); > >+ HOST_WIDE_INT val = INTVAL (operands[1]); > >+ unsigned int i; > >+ > >+ if (<MODE>mode != V16QImode) > >+ { > >+ unsigned shift = HOST_BITS_PER_WIDE_INT - 10; > >+ val = trunc_int_for_mode ((val << shift) >> shift, <UNITMODE>mode); > >+ } > >+ else > >+ val = trunc_int_for_mode (val, <UNITMODE>mode); > >+ > >+ for (i = 0; i < n_elts; i++) > >+ RTVEC_ELT (v, i) = GEN_INT (val); > >+ emit_move_insn (operands[0], > >+ gen_rtx_CONST_VECTOR (<MODE>mode, v)); > >+ DONE; > >+}) > > This is really weird. We shouldn't be simply discarding bits that don't fit. > This needs to accept all immediates and generate the correct code to > get a replicated constant of that value into a register. I think it is > probably OK to trunc_int_for_mode on the original 'val' for the > <UNIT>mode but anything out of range for V*HI/SI/DI needs to be expanded > properly. > > Please do not gen_msa_ldi anywhere other than from MSA builtins. There is > no need just emit a move directly. AFAICS, the truncation for everything except V16QImode is not needed since we have the predicate here. Truncating the immediate for bytes may make life easier for users when debugging. Although the extra bits are ignored by the hardware, it doesn't stop us from encoding numbers out of range. The RTL doesn't seem to have validation of ranges of constants and modes. I did a small test and could output any number within the allowable range in the predicate. > > >+(define_insn "msa_vshf<mode>" > >+ [(set (match_operand:MSA 0 "register_operand" "=f") > >+ (unspec:MSA [(match_operand:<VIMODE> 1 "register_operand" "0") > >+ (match_operand:MSA 2 "register_operand" "f") > >+ (match_operand:MSA 3 "register_operand" "f")] > >+ UNSPEC_MSA_VSHF))] > >+ "ISA_HAS_MSA" > >+ "vshf.<msafmt>\t%w0,%w2,%w3" > >+ [(set_attr "type" "simd_sld") > >+ (set_attr "mode" "<MODE>")]) > > Delete this and switch to using vec_perm directly instead for the builtin. Done. > > >+;; 128bit MSA modes only in msa registers or memory. An exception is > allowing > > 128-bit MSA modes can only exist in MSA registers or memory. ... Done. > > >+;; Offset load > >+(define_expand "msa_ld_<msafmt_f>" > >+ [(match_operand:MSA 0 "register_operand") > >+ (match_operand 1 "pmode_register_operand") > >+ (match_operand 2 "aq10<msafmt>_operand")] > >+ "ISA_HAS_MSA" > >+{ > >+ rtx addr = plus_constant (GET_MODE (operands[1]), operands[1], > >+ INTVAL (operands[2])); > >+ mips_emit_move (operands[0], gen_rtx_MEM (<MODE>mode, addr)); > >+ DONE; > >+}) > >+ > >+;; Offset store > >+(define_expand "msa_st_<msafmt_f>" > >+ [(match_operand:MSA 0 "register_operand") > >+ (match_operand 1 "pmode_register_operand") > >+ (match_operand 2 "aq10<msafmt>_operand")] > >+ "ISA_HAS_MSA" > >+{ > >+ rtx addr = plus_constant (GET_MODE (operands[1]), operands[1], > >+ INTVAL (operands[2])); > >+ mips_emit_move (gen_rtx_MEM (<MODE>mode, addr), operands[0]); > >+ DONE; > >+}) > > There's no real need to expand these in C code. The patterns can be used > to create the RTL. As an aside, I don't really see the point in intrinsics > to load and store data the same thing can be done from straight C. > The patterns also can't be used for const or volatile data as their > builtin prototypes are neither. I suspect they should at least support > pointers to const data. I was going to remove the expansion in C code but the problem is that we need the Pmode. I could duplicate the patterns and override the icode e.g for TARGET_64BIT when expanding the built-ins but this doesn't look cleaner. Unless I'm missing something here? Since the intrinsics have been present for a while it would probably confuse if we removed them now. Indeed, the intrinsics are not really needed as we can do loads and stores from C. I changed the type in the prototype to CVPOINTER so qualifiers like const/volatile are not discarded. > > >+;; Integer operations > >+(define_insn "add<mode>3" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f,f,f") > >+ (plus:IMSA > >+ (match_operand:IMSA 1 "register_operand" "f,f,f") > >+ (match_operand:IMSA 2 "reg_or_vector_same_ximm5_operand" > "f,Unv5,Uuv5")))] > >+ "ISA_HAS_MSA" > >+{ > >+ switch (which_alternative) > >+ { > >+ case 0: > >+ return "addv.<msafmt>\t%w0,%w1,%w2"; > >+ case 1: > >+ { > >+ HOST_WIDE_INT val = INTVAL (CONST_VECTOR_ELT (operands[2], 0)); > >+ > >+ operands[2] = GEN_INT (-val); > >+ return "subvi.<msafmt>\t%w0,%w1,%d2"; > >+ } > >+ case 2: > >+ { > >+ HOST_WIDE_INT val = INTVAL (CONST_VECTOR_ELT (operands[2], 0)); > >+ > >+ operands[2] = GEN_INT (val); > >+ return "addvi.<msafmt>\t%w0,%w1,%d2"; > > This can use operands[2] unchanged, just use %E2 instead. Changed. > > >+ } > >+ default: > >+ gcc_unreachable (); > >+ } > >+} > >+ [(set_attr "alu_type" "simd_add") > >+ (set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "sub<mode>3" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f,f,f") > >+ (minus:IMSA > >+ (match_operand:IMSA 1 "register_operand" "f,f,f") > >+ (match_operand:IMSA 2 "reg_or_vector_same_ximm5_operand" > "f,Unv5,Uuv5")))] > >+ "ISA_HAS_MSA" > >+{ > >+ switch (which_alternative) > >+ { > >+ case 0: > >+ return "subv.<msafmt>\t%w0,%w1,%w2"; > >+ case 1: > >+ { > >+ HOST_WIDE_INT val = INTVAL (CONST_VECTOR_ELT (operands[2], 0)); > >+ > >+ operands[2] = GEN_INT (-val); > >+ return "addvi.<msafmt>\t%w0,%w1,%d2"; > >+ } > >+ case 2: > >+ { > >+ HOST_WIDE_INT val = INTVAL (CONST_VECTOR_ELT (operands[2], 0)); > >+ > >+ operands[2] = GEN_INT (val); > >+ return "subvi.<msafmt>\t%w0,%w1,%d2"; > >+ } > >+ default: > >+ gcc_unreachable (); > >+ } > >+} > >+ [(set_attr "alu_type" "simd_add") > >+ (set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > > I don't believe we need to handle constants for the sub<mode>3 pattern if > we have them covered by add<mode>3. I can't see any canonicalisation rule > in rtl.texi to say this though but simplify_plus_minus seems to show this > to be true. If this is the case then we don't need the alternatives. Removed. > > >+(define_insn "xorv16qi3" > >+ [(set (match_operand:V16QI 0 "register_operand" "=f,f") > >+ (xor:V16QI > >+ (match_operand:V16QI 1 "register_operand" "f,f") > >+ (match_operand:V16QI 2 "reg_or_vector_same_byte_operand" "f,Ubv8")))] > >+ "ISA_HAS_MSA" > >+{ > >+ if (which_alternative == 1) > >+ { > >+ operands[2] = CONST_VECTOR_ELT (operands[2], 0); > >+ return "xori.b\t%w0,%w1,%B2"; > > I don't think you need to use %B2 it is already validated as a replicated > constant vector and in range so %E should be OK? This also means that you > don't need to do the output patterns in C code and can just use Changed. > "@ > alt0 > alt1" > > xori.b seems like it could be useful for other modes too when the immediate > has replicated bit patterns across all bytes of the element size. Done. As a result, I removed a set of predicates since they appeared to be redundant and added a new constraint that checks if all bytes and all the elements are the same. > > >+ } > >+ else > >+ return "xor.v\t%w0,%w1,%w2"; > >+} > >+ [(set_attr "type" "simd_logic") > >+ (set_attr "mode" "TI,V16QI")]) > >+ > >+(define_insn "xor<mode>3" > >+ [(set (match_operand:IMSA_DWH 0 "register_operand" "=f,f") > >+ (xor:IMSA_DWH > >+ (match_operand:IMSA_DWH 1 "register_operand" "f,f") > >+ (match_operand:IMSA_DWH 2 "reg_or_vector_same_<mode>_set_operand" > "f,YC")))] > >+ "ISA_HAS_MSA" > >+{ > >+ if (which_alternative == 1) > >+ { > >+ HOST_WIDE_INT val = INTVAL (CONST_VECTOR_ELT (operands[2], 0)); > >+ int vlog2 = exact_log2 (val); > >+ gcc_assert (vlog2 != -1); > >+ operands[2] = GEN_INT (vlog2); > >+ return "bnegi.%v0\t%w0,%w1,%2"; > > If this pattern occurs frequently then this should switch to a formatter and > be > handled in print_operand. Added 'V' operand. The helper functions for constraints YC, YZ had bugs when checking if a bit is set/cleared i.e. incorrect handling of MSB bit, GET_MODE_UNIT_SIZE used instead of GET_MODE_UNIT_BITSIZE. Fix included. > > >+ } > >+ else > >+ return "xor.v\t%w0,%w1,%w2"; > >+} > >+ [(set_attr "type" "simd_logic,simd_bit") > >+ (set_attr "mode" "TI,<MODE>")]) > >+ > >+(define_insn "iorv16qi3" > >+ [(set (match_operand:V16QI 0 "register_operand" "=f,f") > >+ (ior:V16QI > >+ (match_operand:V16QI 1 "register_operand" "f,f") > >+ (match_operand:V16QI 2 "reg_or_vector_same_byte_operand" "f,Ubv8")))] > >+ "ISA_HAS_MSA" > >+{ > >+ if (which_alternative == 1) > >+ { > >+ operands[2] = CONST_VECTOR_ELT (operands[2], 0); > >+ return "ori.b\t%w0,%w1,%B2"; > > Switch to %E and "@ output pattern syntax. Since it the pattern was combined with "ior<mode>3" and reworked, 'B' was repurposed to handle replicated bytes. > > >+ } > >+ else > >+ return "or.v\t%w0,%w1,%w2"; > >+} > >+ [(set_attr "type" "simd_logic") > >+ (set_attr "mode" "TI,V16QI")]) > >+ > >+(define_insn "andv16qi3" > >+ [(set (match_operand:V16QI 0 "register_operand" "=f,f") > >+ (and:V16QI > >+ (match_operand:V16QI 1 "register_operand" "f,f") > >+ (match_operand:V16QI 2 "reg_or_vector_same_byte_operand" "f,Ubv8")))] > >+ "ISA_HAS_MSA" > >+{ > >+ if (which_alternative == 1) > >+ { > >+ operands[2] = CONST_VECTOR_ELT (operands[2], 0); > >+ return "andi.b\t%w0,%w1,%B2"; > > Likewise. As above but had to keep C code to flip bits for bit clearing insn. > > >+ } > >+ else > >+ return "and.v\t%w0,%w1,%w2"; > >+} > >+ [(set_attr "type" "simd_logic") > >+ (set_attr "mode" "TI,V16QI")]) > >+ > >+(define_insn "vlshr<mode>3" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f,f") > >+ (lshiftrt:IMSA > >+ (match_operand:IMSA 1 "register_operand" "f,f") > >+ (match_operand:IMSA 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))] > >+ "ISA_HAS_MSA" > >+{ > >+ if (which_alternative == 0) > >+ return "srl.<msafmt>\t%w0,%w1,%w2"; > >+ > >+ operands[2] = GEN_INT (INTVAL (CONST_VECTOR_ELT (operands[2], 0)) > >+ & <shift_mask>); > > Is this here because this pattern is used to expand builtins? It would be > nice if something other than the output pattern had cleaned this up already > and issued a warning about shift overflow. Does this match what happens when > shifting by wider than a scalar type. It appears to be pointless to have this. There is an error for out-of-range values and an overflow/truncate warning if the immediate is larger than a byte. Switched to "@ output pattern with shift_mask mode attribute removed. > > >+ return "srli.<msafmt>\t%w0,%w1,%2"; > >+} > >+ [(set_attr "type" "simd_shift") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "vashr<mode>3" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f,f") > >+ (ashiftrt:IMSA > >+ (match_operand:IMSA 1 "register_operand" "f,f") > >+ (match_operand:IMSA 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))] > >+ "ISA_HAS_MSA" > >+{ > >+ if (which_alternative == 0) > >+ return "sra.<msafmt>\t%w0,%w1,%w2"; > >+ > >+ operands[2] = GEN_INT (INTVAL (CONST_VECTOR_ELT (operands[2], 0)) > >+ & <shift_mask>); > > likewise. As above. > > >+ return "srai.<msafmt>\t%w0,%w1,%2"; > >+} > >+ [(set_attr "type" "simd_shift") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "vashl<mode>3" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f,f") > >+ (ashift:IMSA > >+ (match_operand:IMSA 1 "register_operand" "f,f") > >+ (match_operand:IMSA 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))] > >+ "ISA_HAS_MSA" > >+{ > >+ if (which_alternative == 0) > >+ return "sll.<msafmt>\t%w0,%w1,%w2"; > >+ > >+ operands[2] = GEN_INT (INTVAL (CONST_VECTOR_ELT (operands[2], 0)) > >+ & <shift_mask>); > > likewise. As above. > > >+ return "slli.<msafmt>\t%w0,%w1,%2"; > >+} > >+ [(set_attr "type" "simd_shift") > >+ (set_attr "mode" "<MODE>")]) > >+ > > ... > > >+(define_insn "msa_fmadd_<msafmt>" > >+ [(set (match_operand:FMSA 0 "register_operand" "=f") > >+ (plus:FMSA (mult:FMSA (match_operand:FMSA 2 "register_operand" "f") > >+ (match_operand:FMSA 3 "register_operand" "f")) > >+ (match_operand:FMSA 1 "register_operand" "0")))] > >+ "ISA_HAS_MSA" > >+ "fmadd.<msafmt>\t%w0,%w2,%w3" > >+ [(set_attr "type" "simd_fmadd") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_fmsub_<msafmt>" > >+ [(set (match_operand:FMSA 0 "register_operand" "=f") > >+ (minus:FMSA (match_operand:FMSA 1 "register_operand" "0") > >+ (mult:FMSA (match_operand:FMSA 2 "register_operand" "f") > >+ (match_operand:FMSA 3 "register_operand" "f"))))] > >+ "ISA_HAS_MSA" > >+ "fmsub.<msafmt>\t%w0,%w2,%w3" > >+ [(set_attr "type" "simd_fmadd") > >+ (set_attr "mode" "<MODE>")]) > > These need to be usable from the fma*4 pattern(s) as well. Changed to use SPN. > > >+;; Built-in functions > >+(define_insn "msa_add_a_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (plus:IMSA (abs:IMSA (match_operand:IMSA 1 "register_operand" "f")) > >+ (abs:IMSA (match_operand:IMSA 2 "register_operand" "f"))))] > >+ "ISA_HAS_MSA" > >+ "add_a.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_adds_a_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (ss_plus:IMSA > >+ (abs:IMSA (match_operand:IMSA 1 "register_operand" "f")) > >+ (abs:IMSA (match_operand:IMSA 2 "register_operand" "f"))))] > >+ "ISA_HAS_MSA" > >+ "adds_a.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > > This can be follow on work but we should be including fixed point vector > types for these I think. This applies to the following patterns > too. OK. > > >+ > >+(define_insn "ssadd<mode>3" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (ss_plus:IMSA (match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "register_operand" "f")))] > >+ "ISA_HAS_MSA" > >+ "adds_s.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "usadd<mode>3" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (us_plus:IMSA (match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "register_operand" "f")))] > >+ "ISA_HAS_MSA" > >+ "adds_u.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > >+ > > >+(define_expand "msa_addvi_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand") > >+ (plus:IMSA (match_operand:IMSA 1 "register_operand") > >+ (match_operand 2 "const_uimm5_operand")))] > >+ "ISA_HAS_MSA" > >+{ > >+ unsigned n_elts = GET_MODE_NUNITS (<MODE>mode); > >+ rtvec v = rtvec_alloc (n_elts); > >+ HOST_WIDE_INT val = INTVAL (operands[2]); > >+ unsigned int i; > >+ > >+ for (i = 0; i < n_elts; i++) > >+ RTVEC_ELT (v, i) = GEN_INT (val); > >+ > >+ emit_insn (gen_msa_addvi_<msafmt>_insn (operands[0], operands[1], > >+ gen_rtx_CONST_VECTOR (<MODE>mode, > >v))); > >+ DONE; > >+}) > > This pattern is too common. Please expand this in the builtin expand > C code. Done. > > >+ > >+(define_expand "msa_andi_b" > >+ [(set (match_operand:V16QI 0 "register_operand") > >+ (and:V16QI (match_operand:V16QI 1 "register_operand") > >+ (match_operand:QI 2 "const_uimm8_operand")))] > >+ "ISA_HAS_MSA" > >+{ > >+ rtvec v = rtvec_alloc (16); > >+ HOST_WIDE_INT val = INTVAL (operands[2]); > >+ unsigned int i; > >+ > >+ for (i = 0; i < 16; i++) > >+ RTVEC_ELT (v, i) = GEN_INT (val); > >+ > >+ emit_insn (gen_msa_andi_b_insn (operands[0], operands[1], > >+ gen_rtx_CONST_VECTOR (V16QImode, v))); > >+ DONE; > >+}) > > likewise. Removed as it is now handled by and<mode>3 pattern. > > >+ > >+(define_insn "msa_addvi_<msafmt>_insn" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (plus:IMSA > >+ (match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "const_vector_same_uimm5_operand" "")))] > >+ "ISA_HAS_MSA" > >+ "addvi.<msafmt>\t%w0,%w1,%E2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > > Why isn't this pattern part of the general msa add pattern which should > support both constants and registers as operand 2? It is also missing > a constraint for the immediate. Removed. > > >+ > >+(define_insn "msa_andi_b_insn" > >+ [(set (match_operand:V16QI 0 "register_operand" "=f") > >+ (and:V16QI > >+ (match_operand:V16QI 1 "register_operand" "f") > >+ (match_operand:V16QI 2 "const_vector_same_uimm8_operand" "")))] > > Likewise. Same here. > > >+ "ISA_HAS_MSA" > >+ "andi.b\t%w0,%w1,%E2" > >+ [(set_attr "type" "simd_logic") > >+ (set_attr "mode" "V16QI")]) > >+ > > END USEFUL commnets > > >+(define_insn "msa_bclr_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "register_operand" "f")] > >+ UNSPEC_MSA_BCLR))] > >+ "ISA_HAS_MSA" > >+ "bclr.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_bit") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_bclri_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand 2 "const_<bitimm>_operand" "")] > >+ UNSPEC_MSA_BCLRI))] > >+ "ISA_HAS_MSA" > >+ "bclri.<msafmt>\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_bit") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_binsl_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "0") > >+ (match_operand:IMSA 2 "register_operand" "f") > >+ (match_operand:IMSA 3 "register_operand" "f")] > >+ UNSPEC_MSA_BINSL))] > >+ "ISA_HAS_MSA" > >+ "binsl.<msafmt>\t%w0,%w2,%w3" > >+ [(set_attr "type" "simd_bitins") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_binsli_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "0") > >+ (match_operand:IMSA 2 "register_operand" "f") > >+ (match_operand 3 "const_<bitimm>_operand" "")] > >+ UNSPEC_MSA_BINSLI))] > >+ "ISA_HAS_MSA" > >+ "binsli.<msafmt>\t%w0,%w2,%3" > >+ [(set_attr "type" "simd_bitins") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_binsr_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "0") > >+ (match_operand:IMSA 2 "register_operand" "f") > >+ (match_operand:IMSA 3 "register_operand" "f")] > >+ UNSPEC_MSA_BINSR))] > >+ "ISA_HAS_MSA" > >+ "binsr.<msafmt>\t%w0,%w2,%w3" > >+ [(set_attr "type" "simd_bitins") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_binsri_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "0") > >+ (match_operand:IMSA 2 "register_operand" "f") > >+ (match_operand 3 "const_<bitimm>_operand" "")] > >+ UNSPEC_MSA_BINSRI))] > >+ "ISA_HAS_MSA" > >+ "binsri.<msafmt>\t%w0,%w2,%3" > >+ [(set_attr "type" "simd_bitins") > >+ (set_attr "mode" "<MODE>")]) > > As a follow up these instructions may be represenatble with standard > RTL. The "insvm" appears to be a good candidate but I'm not sure if this is going to work out of the box for vector modes. This would apply only for instructions with a replicated immediate. > > >+(define_insn "msa_bmnz_v_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "0") > >+ (match_operand:IMSA 2 "register_operand" "f") > >+ (match_operand:IMSA 3 "register_operand" "f")] > >+ UNSPEC_MSA_BMNZ_V))] > >+ "ISA_HAS_MSA" > >+ "bmnz.v\t%w0,%w2,%w3" > >+ [(set_attr "type" "simd_bitmov") > >+ (set_attr "mode" "TI")]) > > This is representable in RTL. AND, IOR, NOT etc. Done. > > >+ > >+(define_insn "msa_bmnzi_b" > >+ [(set (match_operand:V16QI 0 "register_operand" "=f") > >+ (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "0") > >+ (match_operand:V16QI 2 "register_operand" "f") > >+ (match_operand 3 "const_uimm8_operand" "")] > >+ UNSPEC_MSA_BMNZI_B))] > >+ "ISA_HAS_MSA" > >+ "bmnzi.b\t%w0,%w2,%3" > >+ [(set_attr "type" "simd_bitmov") > >+ (set_attr "mode" "V16QI")]) > >+ > >+(define_insn "msa_bmz_v_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "0") > >+ (match_operand:IMSA 2 "register_operand" "f") > >+ (match_operand:IMSA 3 "register_operand" "f")] > >+ UNSPEC_MSA_BMZ_V))] > >+ "ISA_HAS_MSA" > >+ "bmz.v\t%w0,%w2,%w3" > >+ [(set_attr "type" "simd_bitmov") > >+ (set_attr "mode" "TI")]) > >+ > >+(define_insn "msa_bmzi_b" > >+ [(set (match_operand:V16QI 0 "register_operand" "=f") > >+ (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "0") > >+ (match_operand:V16QI 2 "register_operand" "f") > >+ (match_operand 3 "const_uimm8_operand" "")] > >+ UNSPEC_MSA_BMZI_B))] > >+ "ISA_HAS_MSA" > >+ "bmzi.b\t%w0,%w2,%3" > >+ [(set_attr "type" "simd_bitmov") > >+ (set_attr "mode" "V16QI")]) > > Likewise to here. Done as well. > > >+(define_insn "msa_bneg_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "register_operand" "f")] > >+ UNSPEC_MSA_BNEG))] > >+ "ISA_HAS_MSA" > >+ "bneg.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_bit") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_bnegi_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand 2 "const_msa_branch_operand" "")] > >+ UNSPEC_MSA_BNEGI))] > >+ "ISA_HAS_MSA" > >+ "bnegi.<msafmt>\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_bit") > >+ (set_attr "mode" "<MODE>")]) > > Can't see any way to represent these at the moment but like all the > unspecs they should be revisited to see if they can be targetted. I was going to remove bseti/bclri/bnegi as this is the same as or/and-not/xor operation with a mask with a single bit but realized that it would be impossible to distinguish the built-ins using the same insn code. I'll leave them for the time being. > > >+ > >+(define_insn "msa_bsel_v_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "0") > >+ (match_operand:IMSA 2 "register_operand" "f") > >+ (match_operand:IMSA 3 "register_operand" "f")] > >+ UNSPEC_MSA_BSEL_V))] > >+ "ISA_HAS_MSA" > >+ "bsel.v\t%w0,%w2,%w3" > >+ [(set_attr "type" "simd_bitmov") > >+ (set_attr "mode" "TI")]) > > This can also be standard RTL. Done. > > >+ > >+(define_insn "msa_bseli_b" > >+ [(set (match_operand:V16QI 0 "register_operand" "=f") > >+ (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "0") > >+ (match_operand:V16QI 2 "register_operand" "f") > >+ (match_operand 3 "const_uimm8_operand" "")] > >+ UNSPEC_MSA_BSELI_B))] > >+ "ISA_HAS_MSA" > >+ "bseli.b\t%w0,%w2,%3" > >+ [(set_attr "type" "simd_bitmov") > >+ (set_attr "mode" "V16QI")]) > > Likewise. Done. > > >+(define_insn "msa_bset_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "register_operand" "f")] > >+ UNSPEC_MSA_BSET))] > >+ "ISA_HAS_MSA" > >+ "bset.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_bit") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_bseti_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand 2 "const_<bitimm>_operand" "")] > >+ UNSPEC_MSA_BSETI))] > >+ "ISA_HAS_MSA" > >+ "bseti.<msafmt>\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_bit") > >+ (set_attr "mode" "<MODE>")]) > > Can't see how to do these in standard RTL for now. > > For the builtins that accept immediates what happens when a user > provides an out-of-range constant? It would be good to have > warnings rather than ICEs. For the above we get a generic "invalid argument to built-in function" error. For other builtins with immediates we should get at least an overflow warning. > > >+(define_code_iterator ICC [eq le leu lt ltu]) > >+ > >+(define_code_attr icc > >+ [(eq "eq") > >+ (le "le_s") > >+ (leu "le_u") > >+ (lt "lt_s") > >+ (ltu "lt_u")]) > >+ > >+(define_code_attr icci > >+ [(eq "eqi") > >+ (le "lei_s") > >+ (leu "lei_u") > >+ (lt "lti_s") > >+ (ltu "lti_u")]) > >+ > >+(define_code_attr cmpi > >+ [(eq "s") > >+ (le "s") > >+ (leu "u") > >+ (lt "s") > >+ (ltu "u")]) > >+ > >+(define_insn "msa_c<ICC:icc>_<IMSA:msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (ICC:IMSA (match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "register_operand" "f")))] > >+ "ISA_HAS_MSA" > >+ "c<ICC:icc>.<IMSA:msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_c<ICC:icci>i_<IMSA:msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (ICC:IMSA > >+ (match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "const_vector_same_cmp<ICC:cmpi>imm4_operand" > "")))] > >+ "ISA_HAS_MSA" > >+ "c<ICC:icci>.<IMSA:msafmt>\t%w0,%w1,%E2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > > Can't these be combined together with two alternatives? Done with some additional changes: a new constraint, predicates and constant vector generated from an immediate for the builtins. > > >+ > >+(define_insn "msa_c<ICC:icci>_<IMSA:msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(ICC:IMSA (match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand 2 "const_imm5_operand" ""))] > >+ UNSPEC_MSA_CMPI))] > >+ "ISA_HAS_MSA" > >+ "c<ICC:icci>.<IMSA:msafmt>\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > > Do we need this separate instruction? It should be expanded to have a > replicated constant vector for operand 2 and just use the pattern above. AFAICS, we don't need this instruction and again was likely added for the convenience of mapping to builtins. This insn is now removed. > > >+(define_insn "msa_dotp_s_<msafmt>" > >+ [(set (match_operand:IMSA_DWH 0 "register_operand" "=f") > >+ (unspec:IMSA_DWH [(match_operand:<VHMODE> 1 "register_operand" "f") > >+ (match_operand:<VHMODE> 2 "register_operand" "f")] > >+ UNSPEC_MSA_DOTP_S))] > >+ "ISA_HAS_MSA" > >+ "dotp_s.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_mul") > >+ (set_attr "mode" "<MODE>")]) > > We 'could' do this in standard RTL. Whether it would ever match is > questionable. Changed. > > >+(define_insn "msa_dotp_u_<msafmt>" > >+ [(set (match_operand:IMSA_DWH 0 "register_operand" "=f") > >+ (unspec:IMSA_DWH [(match_operand:<VHMODE> 1 "register_operand" "f") > >+ (match_operand:<VHMODE> 2 "register_operand" "f")] > >+ UNSPEC_MSA_DOTP_U))] > >+ "ISA_HAS_MSA" > >+ "dotp_u.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_mul") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_dpadd_s_<msafmt>" > >+ [(set (match_operand:IMSA_DWH 0 "register_operand" "=f") > >+ (unspec:IMSA_DWH [(match_operand:IMSA_DWH 1 "register_operand" "0") > >+ (match_operand:<VHMODE> 2 "register_operand" "f") > >+ (match_operand:<VHMODE> 3 "register_operand" "f")] > >+ UNSPEC_MSA_DPADD_S))] > >+ "ISA_HAS_MSA" > >+ "dpadd_s.<msafmt>\t%w0,%w2,%w3" > >+ [(set_attr "type" "simd_mul") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_dpadd_u_<msafmt>" > >+ [(set (match_operand:IMSA_DWH 0 "register_operand" "=f") > >+ (unspec:IMSA_DWH [(match_operand:IMSA_DWH 1 "register_operand" "0") > >+ (match_operand:<VHMODE> 2 "register_operand" "f") > >+ (match_operand:<VHMODE> 3 "register_operand" "f")] > >+ UNSPEC_MSA_DPADD_U))] > >+ "ISA_HAS_MSA" > >+ "dpadd_u.<msafmt>\t%w0,%w2,%w3" > >+ [(set_attr "type" "simd_mul") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_dpsub_s_<msafmt>" > >+ [(set (match_operand:IMSA_DWH 0 "register_operand" "=f") > >+ (unspec:IMSA_DWH [(match_operand:IMSA_DWH 1 "register_operand" "0") > >+ (match_operand:<VHMODE> 2 "register_operand" "f") > >+ (match_operand:<VHMODE> 3 "register_operand" "f")] > >+ UNSPEC_MSA_DPSUB_S))] > >+ "ISA_HAS_MSA" > >+ "dpsub_s.<msafmt>\t%w0,%w2,%w3" > >+ [(set_attr "type" "simd_mul") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_dpsub_u_<msafmt>" > >+ [(set (match_operand:IMSA_DWH 0 "register_operand" "=f") > >+ (unspec:IMSA_DWH [(match_operand:IMSA_DWH 1 "register_operand" "0") > >+ (match_operand:<VHMODE> 2 "register_operand" "f") > >+ (match_operand:<VHMODE> 3 "register_operand" "f")] > >+ UNSPEC_MSA_DPSUB_U))] > >+ "ISA_HAS_MSA" > >+ "dpsub_u.<msafmt>\t%w0,%w2,%w3" > >+ [(set_attr "type" "simd_mul") > >+ (set_attr "mode" "<MODE>")]) > > Likewise to here. Done. > > >+ > >+(define_insn "msa_fclass_<msafmt>" > >+ [(set (match_operand:<VIMODE> 0 "register_operand" "=f") > >+ (unspec:<VIMODE> [(match_operand:FMSA 1 "register_operand" "f")] > >+ UNSPEC_MSA_FCLASS))] > >+ "ISA_HAS_MSA" > >+ "fclass.<msafmt>\t%w0,%w1" > >+ [(set_attr "type" "simd_fclass") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_fcaf_<msafmt>" > >+ [(set (match_operand:<VIMODE> 0 "register_operand" "=f") > >+ (unspec:<VIMODE> [(match_operand:FMSA 1 "register_operand" "f") > >+ (match_operand:FMSA 2 "register_operand" "f")] > >+ UNSPEC_MSA_FCAF))] > >+ "ISA_HAS_MSA" > >+ "fcaf.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_fcmp") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_fcune_<FMSA:msafmt>" > >+ [(set (match_operand:<VIMODE> 0 "register_operand" "=f") > >+ (unspec:<VIMODE> [(match_operand:FMSA 1 "register_operand" "f") > >+ (match_operand:FMSA 2 "register_operand" "f")] > >+ UNSPEC_MSA_FCUNE))] > >+ "ISA_HAS_MSA" > >+ "fcune.<FMSA:msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_fcmp") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_code_iterator FCC [unordered ordered eq ne le lt uneq unle unlt]) > >+ > >+(define_code_attr fcc > >+ [(unordered "fcun") > >+ (ordered "fcor") > >+ (eq "fceq") > >+ (ne "fcne") > >+ (uneq "fcueq") > >+ (unle "fcule") > >+ (unlt "fcult") > >+ (le "fcle") > >+ (lt "fclt")]) > >+ > >+(define_int_iterator FSC_UNS [UNSPEC_MSA_FSAF UNSPEC_MSA_FSUN > >UNSPEC_MSA_FSOR > >+ UNSPEC_MSA_FSEQ UNSPEC_MSA_FSNE UNSPEC_MSA_FSUEQ > >+ UNSPEC_MSA_FSUNE UNSPEC_MSA_FSULE UNSPEC_MSA_FSULT > >+ UNSPEC_MSA_FSLE UNSPEC_MSA_FSLT]) > >+ > >+(define_int_attr fsc > >+ [(UNSPEC_MSA_FSAF "fsaf") > >+ (UNSPEC_MSA_FSUN "fsun") > >+ (UNSPEC_MSA_FSOR "fsor") > >+ (UNSPEC_MSA_FSEQ "fseq") > >+ (UNSPEC_MSA_FSNE "fsne") > >+ (UNSPEC_MSA_FSUEQ "fsueq") > >+ (UNSPEC_MSA_FSUNE "fsune") > >+ (UNSPEC_MSA_FSULE "fsule") > >+ (UNSPEC_MSA_FSULT "fsult") > >+ (UNSPEC_MSA_FSLE "fsle") > >+ (UNSPEC_MSA_FSLT "fslt")]) > >+ > >+(define_insn "msa_<FCC:fcc>_<FMSA:msafmt>" > >+ [(set (match_operand:<VIMODE> 0 "register_operand" "=f") > >+ (FCC:<VIMODE> (match_operand:FMSA 1 "register_operand" "f") > >+ (match_operand:FMSA 2 "register_operand" "f")))] > >+ "ISA_HAS_MSA" > >+ "<FCC:fcc>.<FMSA:msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_fcmp") > >+ (set_attr "mode" "<MODE>")]) > > Can't the msa builtins target these patterns directly? Follow on work should > implement "mov<mode>cc" for vector modes. > > >+ > >+(define_insn "msa_<fsc>_<FMSA:msafmt>" > >+ [(set (match_operand:<VIMODE> 0 "register_operand" "=f") > >+ (unspec:<VIMODE> [(match_operand:FMSA 1 "register_operand" "f") > >+ (match_operand:FMSA 2 "register_operand" "f")] > >+ FSC_UNS))] > >+ "ISA_HAS_MSA" > >+ "<fsc>.<FMSA:msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_fcmp") > >+ (set_attr "mode" "<MODE>")]) > > i.e. this is not necessary. The difference between this and above is the signalling vs quiet NaNs floating point comparison instructions. The quiet NaNs are represented as standard RTL and signalling as UNSPEC. "mov<mode>cc" for vector modes is questionable as "vcondmn" is meant to be used for conditional vector moves unless I misunderstood something. > > >+(define_mode_attr FINT > >+ [(V4SF "V4SI") > >+ (V2DF "V2DI")]) > >+ > >+(define_mode_attr fint > >+ [(V4SF "v4si") > >+ (V2DF "v2di")]) > >+ > >+(define_mode_attr FQ > >+ [(V4SF "V8HI") > >+ (V2DF "V4SI")]) > >+ > >+(define_mode_attr FINTCNV > >+ [(V4SF "I2S") > >+ (V2DF "I2D")]) > >+ > >+(define_mode_attr FINTCNV_2 > >+ [(V4SF "S2I") > >+ (V2DF "D2I")]) > >+ > >+(define_insn "float<fint><FMSA:mode>2" > >+ [(set (match_operand:FMSA 0 "register_operand" "=f") > >+ (float:FMSA (match_operand:<FINT> 1 "register_operand" "f")))] > >+ "ISA_HAS_MSA" > >+ "ffint_s.<msafmt>\t%w0,%w1" > >+ [(set_attr "type" "simd_fcvt") > >+ (set_attr "cnv_mode" "<FINTCNV>") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "floatuns<fint><FMSA:mode>2" > >+ [(set (match_operand:FMSA 0 "register_operand" "=f") > >+ (unsigned_float:FMSA (match_operand:<FINT> 1 "register_operand" "f")))] > >+ "ISA_HAS_MSA" > >+ "ffint_u.<msafmt>\t%w0,%w1" > >+ [(set_attr "type" "simd_fcvt") > >+ (set_attr "cnv_mode" "<FINTCNV>") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_mode_attr FFQ > >+ [(V4SF "V8HI") > >+ (V2DF "V4SI")]) > >+ > >+(define_insn "msa_ffql_<msafmt>" > >+ [(set (match_operand:FMSA 0 "register_operand" "=f") > >+ (unspec:FMSA [(match_operand:<FQ> 1 "register_operand" "f")] > >+ UNSPEC_MSA_FFQL))] > >+ "ISA_HAS_MSA" > >+ "ffql.<msafmt>\t%w0,%w1" > >+ [(set_attr "type" "simd_fcvt") > >+ (set_attr "cnv_mode" "<FINTCNV>") > >+ (set_attr "mode" "<MODE>")]) > > There are fixed point vector modes in GCC. Perhaps improve fixed point support > in follow on work. Ok. > > >+(define_insn "msa_ffqr_<msafmt>" > >+ [(set (match_operand:FMSA 0 "register_operand" "=f") > >+ (unspec:FMSA [(match_operand:<FQ> 1 "register_operand" "f")] > >+ UNSPEC_MSA_FFQR))] > >+ "ISA_HAS_MSA" > >+ "ffqr.<msafmt>\t%w0,%w1" > >+ [(set_attr "type" "simd_fcvt") > >+ (set_attr "cnv_mode" "<FINTCNV>") > >+ (set_attr "mode" "<MODE>")]) > > Likewise. > > >+ > >+;; Note used directly by builtins but via the following define_expand. > >+(define_insn "msa_fill_<msafmt>_insn" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (vec_duplicate:IMSA > >+ (match_operand:<UNITMODE> 1 "reg_or_0_operand" "dJ")))] > >+ "ISA_HAS_MSA" > >+ "fill.<msafmt>\t%w0,%z1" > >+ [(set_attr "type" "simd_fill") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+;; Expand builtin for HImode and QImode which takes SImode. > >+(define_expand "msa_fill_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand") > >+ (vec_duplicate:IMSA > >+ (match_operand:<RES> 1 "reg_or_0_operand")))] > >+ "ISA_HAS_MSA" > >+{ > >+ if ((GET_MODE_SIZE (<UNITMODE>mode) < GET_MODE_SIZE (<RES>mode)) > >+ && (REG_P (operands[1]) || (GET_CODE (operands[1]) == SUBREG > >+ && REG_P (SUBREG_REG (operands[1]))))) > >+ operands[1] = lowpart_subreg (<UNITMODE>mode, operands[1], <RES>mode); > >+ emit_insn (gen_msa_fill_<msafmt>_insn (operands[0], operands[1])); > >+ DONE; > >+}) > > Let's not do this. Just change the builtin prototypes instead and have > them match naturally. Done. > > >+(define_insn "msa_fill_<msafmt_f>" > >+ [(set (match_operand:FMSA 0 "register_operand" "=f") > >+ (vec_duplicate:FMSA > >+ (match_operand:<UNITMODE> 1 "reg_or_0_operand" "dJ")))] > >+ "ISA_HAS_MSA" > >+ "fill.<msafmt>\t%w0,%z1" > >+ [(set_attr "type" "simd_fill") > >+ (set_attr "mode" "<MODE>")]) > > Separate out the zero alternative and use LDI. I'm not certain but I > think we should be emitting '#' for the V2DI and V2DF mode non-zero > cases, for safety if nothing else. It's probably more consistent to have the explicit split. Changed. > > >+;; Note that fill.d and fill.d_f will be split later if !TARGET_64BIT. > >+(define_split > >+ [(set (match_operand:V2DI 0 "register_operand") > >+ (vec_duplicate:V2DI > >+ (match_operand:DI 1 "reg_or_0_operand")))] > >+ "reload_completed && TARGET_MSA && !TARGET_64BIT" > >+ [(const_int 0)] > >+{ > >+ mips_split_msa_fill_d (operands[0], operands[1]); > >+ DONE; > >+}) > > The 'or_0' should be unnecessary here. Indeed. Updated. I also realized that TARGET_MSA is used inconsistently and changed a few cases to ISA_HAS_MSA. > > >+(define_insn "smax<mode>3" > >+ [(set (match_operand:FMSA 0 "register_operand" "=f") > >+ (smax:FMSA (match_operand:FMSA 1 "register_operand" "f") > >+ (match_operand:FMSA 2 "register_operand" "f")))] > >+ "ISA_HAS_MSA" > >+ "fmax.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_fminmax") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "umax<mode>3" > >+ [(set (match_operand:FMSA 0 "register_operand" "=f") > >+ (umax:FMSA (match_operand:FMSA 1 "register_operand" "f") > >+ (match_operand:FMSA 2 "register_operand" "f")))] > >+ "ISA_HAS_MSA" > >+ "fmax_a.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_fminmax") > >+ (set_attr "mode" "<MODE>")]) > > Are you sure the semantics of fmax_a and umax match? I.e. is umax > supposed to use the magitude of an FP value and ignore the sign > bit. It seems a bit odd to me. Hmm. This is plain wrong and not sure how this got here. I changed it to standard RTL using if_then_else. I'm not sure if we can ever match it though. As it stands the auto-vectorizer does not seem to pick this up (for integers as well) using a simple loop with abs(es) and a ternary operator. Further investigation and improvements will follow. > > >+(define_insn "smin<mode>3" > >+ [(set (match_operand:FMSA 0 "register_operand" "=f") > >+ (smin:FMSA (match_operand:FMSA 1 "register_operand" "f") > >+ (match_operand:FMSA 2 "register_operand" "f")))] > >+ "ISA_HAS_MSA" > >+ "fmin.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_fminmax") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "umin<mode>3" > >+ [(set (match_operand:FMSA 0 "register_operand" "=f") > >+ (umin:FMSA (match_operand:FMSA 1 "register_operand" "f") > >+ (match_operand:FMSA 2 "register_operand" "f")))] > >+ "ISA_HAS_MSA" > >+ "fmin_a.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_fminmax") > >+ (set_attr "mode" "<MODE>")]) > > Likewise. As above. > > >+ > >+(define_insn "msa_frcp_<msafmt>" > >+ [(set (match_operand:FMSA 0 "register_operand" "=f") > >+ (unspec:FMSA [(match_operand:FMSA 1 "register_operand" "f")] > >+ UNSPEC_MSA_FRCP))] > >+ "ISA_HAS_MSA" > >+ "frcp.<msafmt>\t%w0,%w1" > >+ [(set_attr "type" "simd_fdiv") > >+ (set_attr "mode" "<MODE>")]) > > This can be standard RTL with unsafe math opts I believe. Can be left > as a follow on. Will do. > > >+(define_insn "msa_frsqrt_<msafmt>" > >+ [(set (match_operand:FMSA 0 "register_operand" "=f") > >+ (unspec:FMSA [(match_operand:FMSA 1 "register_operand" "f")] > >+ UNSPEC_MSA_FRSQRT))] > >+ "ISA_HAS_MSA" > >+ "frsqrt.<msafmt>\t%w0,%w1" > >+ [(set_attr "type" "simd_fdiv") > >+ (set_attr "mode" "<MODE>")]) > > Likewise. > > >+(define_insn "msa_ftq_h" > >+ [(set (match_operand:V8HI 0 "register_operand" "=f") > >+ (unspec:V8HI [(match_operand:V4SF 1 "register_operand" "f") > >+ (match_operand:V4SF 2 "register_operand" "f")] > >+ UNSPEC_MSA_FTQ))] > >+ "ISA_HAS_MSA" > >+ "ftq.h\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_fcvt") > >+ (set_attr "cnv_mode" "S2I") > >+ (set_attr "mode" "V4SF")]) > > These should be made into standard RTL as follow on work to add fixed > point modes. > > >+(define_insn "msa_ftq_w" > >+ [(set (match_operand:V4SI 0 "register_operand" "=f") > >+ (unspec:V4SI [(match_operand:V2DF 1 "register_operand" "f") > >+ (match_operand:V2DF 2 "register_operand" "f")] > >+ UNSPEC_MSA_FTQ))] > >+ "ISA_HAS_MSA" > >+ "ftq.w\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_fcvt") > >+ (set_attr "cnv_mode" "D2I") > >+ (set_attr "mode" "V2DF")]) > > Likewise. Will do as well. > > >+ > >+(define_insn "msa_hadd_s_<msafmt>" > >+ [(set (match_operand:IMSA_DWH 0 "register_operand" "=f") > >+ (unspec:IMSA_DWH [(match_operand:<VHMODE> 1 "register_operand" "f") > >+ (match_operand:<VHMODE> 2 "register_operand" "f")] > >+ UNSPEC_MSA_HADD_S))] > >+ "ISA_HAS_MSA" > >+ "hadd_s.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > > This can be standard RTL. > > >+ > >+(define_insn "msa_hadd_u_<msafmt>" > >+ [(set (match_operand:IMSA_DWH 0 "register_operand" "=f") > >+ (unspec:IMSA_DWH [(match_operand:<VHMODE> 1 "register_operand" "f") > >+ (match_operand:<VHMODE> 2 "register_operand" "f")] > >+ UNSPEC_MSA_HADD_U))] > >+ "ISA_HAS_MSA" > >+ "hadd_u.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_hsub_s_<msafmt>" > >+ [(set (match_operand:IMSA_DWH 0 "register_operand" "=f") > >+ (unspec:IMSA_DWH [(match_operand:<VHMODE> 1 "register_operand" "f") > >+ (match_operand:<VHMODE> 2 "register_operand" "f")] > >+ UNSPEC_MSA_HSUB_S))] > >+ "ISA_HAS_MSA" > >+ "hsub_s.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_hsub_u_<msafmt>" > >+ [(set (match_operand:IMSA_DWH 0 "register_operand" "=f") > >+ (unspec:IMSA_DWH [(match_operand:<VHMODE> 1 "register_operand" "f") > >+ (match_operand:<VHMODE> 2 "register_operand" "f")] > >+ UNSPEC_MSA_HSUB_U))] > >+ "ISA_HAS_MSA" > >+ "hsub_u.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > > Likewise to here. Updated. I had to add a new 'addsub' iterator in mips.md to simplify it. > > I have not reviewed interleave patterns in detail as they all look OK and > I trust they 'do the right thing'. > > >+(define_insn "msa_madd_q_<msafmt>" > >+ [(set (match_operand:IMSA_WH 0 "register_operand" "=f") > >+ (unspec:IMSA_WH [(match_operand:IMSA_WH 1 "register_operand" "0") > >+ (match_operand:IMSA_WH 2 "register_operand" "f") > >+ (match_operand:IMSA_WH 3 "register_operand" "f")] > >+ UNSPEC_MSA_MADD_Q))] > >+ "ISA_HAS_MSA" > >+ "madd_q.<msafmt>\t%w0,%w2,%w3" > >+ [(set_attr "type" "simd_mul") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_maddr_q_<msafmt>" > >+ [(set (match_operand:IMSA_WH 0 "register_operand" "=f") > >+ (unspec:IMSA_WH [(match_operand:IMSA_WH 1 "register_operand" "0") > >+ (match_operand:IMSA_WH 2 "register_operand" "f") > >+ (match_operand:IMSA_WH 3 "register_operand" "f")] > >+ UNSPEC_MSA_MADDR_Q))] > >+ "ISA_HAS_MSA" > >+ "maddr_q.<msafmt>\t%w0,%w2,%w3" > >+ [(set_attr "type" "simd_mul") > >+ (set_attr "mode" "<MODE>")]) > > Follow on work for fixed point. Ok. > > >+(define_insn "msa_max_a_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "register_operand" "f")] > >+ UNSPEC_MSA_MAX_A))] > >+ "ISA_HAS_MSA" > >+ "max_a.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > > This could be used as part of mov<mode>cc support. AFAUI, mov<mode>cc is used for non-vector modes and the support should go into vcond[u]mn SPNs. However, for the above and floating-point version I'm not sure if we can add it there as we don't have enough information as it appears to be simplified to a basic comparison. I haven't confirmed it yet but it would appear that the support for these operations would have to be explicitly added to the auto-vectorizer to recognize these patterns. > > >+(define_insn "smax<mode>3" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (smax:IMSA (match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "register_operand" "f")))] > >+ "ISA_HAS_MSA" > >+ "max_s.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "umax<mode>3" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (umax:IMSA (match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "register_operand" "f")))] > >+ "ISA_HAS_MSA" > >+ "max_u.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_maxi_s_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand 2 "const_imm5_operand" "")] > >+ UNSPEC_MSA_MAXI_S))] > >+ "ISA_HAS_MSA" > >+ "maxi_s.<msafmt>\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > > This should be part of the instructions above with a second alternative. > > >+ > >+(define_insn "msa_maxi_u_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand 2 "const_uimm5_operand" "")] > >+ UNSPEC_MSA_MAXI_U))] > >+ "ISA_HAS_MSA" > >+ "maxi_u.<msafmt>\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > > Likewise. All updated. > > >+(define_insn "msa_min_a_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "register_operand" "f")] > >+ UNSPEC_MSA_MIN_A))] > >+ "ISA_HAS_MSA" > >+ "min_a.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > > This could be used as part of mov<mode>cc support. > > >+(define_insn "smin<mode>3" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (smin:IMSA (match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "register_operand" "f")))] > >+ "ISA_HAS_MSA" > >+ "min_s.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "umin<mode>3" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (umin:IMSA (match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "register_operand" "f")))] > >+ "ISA_HAS_MSA" > >+ "min_u.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_mini_s_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand 2 "const_imm5_operand" "")] > >+ UNSPEC_MSA_MINI_S))] > >+ "ISA_HAS_MSA" > >+ "mini_s.<msafmt>\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > > This should be merged with the instructions above. Likewise. > > >+ > >+(define_insn "msa_mini_u_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand 2 "const_uimm5_operand" "")] > >+ UNSPEC_MSA_MINI_U))] > >+ "ISA_HAS_MSA" > >+ "mini_u.<msafmt>\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > > Likewise. > > >+ > >+(define_insn "msa_msub_q_<msafmt>" > >+ [(set (match_operand:IMSA_WH 0 "register_operand" "=f") > >+ (unspec:IMSA_WH [(match_operand:IMSA_WH 1 "register_operand" "0") > >+ (match_operand:IMSA_WH 2 "register_operand" "f") > >+ (match_operand:IMSA_WH 3 "register_operand" "f")] > >+ UNSPEC_MSA_MSUB_Q))] > >+ "ISA_HAS_MSA" > >+ "msub_q.<msafmt>\t%w0,%w2,%w3" > >+ [(set_attr "type" "simd_mul") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_msubr_q_<msafmt>" > >+ [(set (match_operand:IMSA_WH 0 "register_operand" "=f") > >+ (unspec:IMSA_WH [(match_operand:IMSA_WH 1 "register_operand" "0") > >+ (match_operand:IMSA_WH 2 "register_operand" "f") > >+ (match_operand:IMSA_WH 3 "register_operand" "f")] > >+ UNSPEC_MSA_MSUBR_Q))] > >+ "ISA_HAS_MSA" > >+ "msubr_q.<msafmt>\t%w0,%w2,%w3" > >+ [(set_attr "type" "simd_mul") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_mul_q_<msafmt>" > >+ [(set (match_operand:IMSA_WH 0 "register_operand" "=f") > >+ (unspec:IMSA_WH [(match_operand:IMSA_WH 1 "register_operand" "f") > >+ (match_operand:IMSA_WH 2 "register_operand" "f")] > >+ UNSPEC_MSA_MUL_Q))] > >+ "ISA_HAS_MSA" > >+ "mul_q.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_mul") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_mulr_q_<msafmt>" > >+ [(set (match_operand:IMSA_WH 0 "register_operand" "=f") > >+ (unspec:IMSA_WH [(match_operand:IMSA_WH 1 "register_operand" "f") > >+ (match_operand:IMSA_WH 2 "register_operand" "f")] > >+ UNSPEC_MSA_MULR_Q))] > >+ "ISA_HAS_MSA" > >+ "mulr_q.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_mul") > >+ (set_attr "mode" "<MODE>")]) > > Rework when adding proper fixed point mode support. Ok. > > >+(define_insn "msa_nloc_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f")] > >+ UNSPEC_MSA_NLOC))] > >+ "ISA_HAS_MSA" > >+ "nloc.<msafmt>\t%w0,%w1" > >+ [(set_attr "type" "simd_bit") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "clz<mode>2" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (clz:IMSA (match_operand:IMSA 1 "register_operand" "f")))] > >+ "ISA_HAS_MSA" > >+ "nlzc.<msafmt>\t%w0,%w1" > >+ [(set_attr "type" "simd_bit") > >+ (set_attr "mode" "<MODE>")]) > > Can you confirm that nlzc has a natural value when given an operand of zero? > I.e. 8 for B 16 for H 32 for W and 64 for D? > > Also CLZ_DEFINED_VALUE_AT_ZERO looks like it needs updating to know the > bitsize of elements in a vector rather than the whole vector. I.e. I think > it will say the result is 128 for any vector mode. I'm not so sure about this one. The above pattern as it stands doesn't accept the zero operand, would the macro would still matter? The built-in also rejects the zero argument. It appears that the code around, where the macro is used, doesn't handle vector modes and the modification of this macro may potentially lead to strange errors. AFAICS, it's very unlikely that we ever hit those paths so it appears we are safe. For the sake of consistency, I think that we still should consider vector elements so I updated the macro to use GET_MODE_UNIT_BITSIZE. > > >+ > >+(define_insn "msa_nor_v_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (and:IMSA (not:IMSA (match_operand:IMSA 1 "register_operand" "f")) > >+ (not:IMSA (match_operand:IMSA 2 "register_operand" "f"))))] > >+ "ISA_HAS_MSA" > >+ "nor.v\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_logic") > >+ (set_attr "mode" "TI")]) > >+ > >+(define_insn "msa_nori_b" > >+ [(set (match_operand:V16QI 0 "register_operand" "=f") > >+ (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "f") > >+ (match_operand 2 "const_uimm8_operand" "")] > >+ UNSPEC_MSA_NORI_B))] > >+ "ISA_HAS_MSA" > >+ "nori.b\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_logic") > >+ (set_attr "mode" "V16QI")]) > > This can be standard rtl and joined with the previous insn. The immediate > for HI/SI/DI patterns simply must have replicates byte values as well as > elements. Done. > > >+ > >+(define_insn "msa_ori_b" > >+ [(set (match_operand:V16QI 0 "register_operand" "=f") > >+ (ior:V16QI (match_operand:V16QI 1 "register_operand" "f") > >+ (match_operand 2 "const_uimm8_operand" "")))] > >+ "ISA_HAS_MSA" > >+ "ori.b\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_logic") > >+ (set_attr "mode" "V16QI")]) > > This should be part of register based OR with the same rules as above. Done. > > >+(define_insn "msa_shf_<msafmt>" > >+ [(set (match_operand:IMSA_WHB 0 "register_operand" "=f") > >+ (unspec:IMSA_WHB [(match_operand:IMSA_WHB 1 "register_operand" "f") > >+ (match_operand 2 "const_uimm8_operand" "")] > >+ UNSPEC_MSA_SHF))] > >+ "ISA_HAS_MSA" > >+ "shf.<msafmt>\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_shf") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_shf_w_f" > >+ [(set (match_operand:V4SF 0 "register_operand" "=f") > >+ (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "f") > >+ (match_operand 2 "const_uimm8_operand" "")] > >+ UNSPEC_MSA_SHF))] > >+ "ISA_HAS_MSA" > >+ "shf.w\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_shf") > >+ (set_attr "mode" "V4SF")]) > > These seem representable in RTL albeit ugly. Perhaps look at this in a > follow up. Done. I added some helpers to plug this into vec_perm_const pattern. > > >+(define_insn "msa_slli_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand 2 "const_<bitimm>_operand" "")] > >+ UNSPEC_MSA_SLLI))] > >+ "ISA_HAS_MSA" > >+ "slli.<msafmt>\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_shift") > >+ (set_attr "mode" "<MODE>")]) > > vashl, vashr, vlshr SPNs cover these. At least some if not all are > already defined in this file so place switch the builtins to target > them instead of these unspecs. Done. > > >+(define_insn "msa_srai_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand 2 "const_<bitimm>_operand" "")] > >+ UNSPEC_MSA_SRAI))] > >+ "ISA_HAS_MSA" > >+ "srai.<msafmt>\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_shift") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_srar_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "register_operand" "f")] > >+ UNSPEC_MSA_SRAR))] > >+ "ISA_HAS_MSA" > >+ "srar.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_shift") > >+ (set_attr "mode" "<MODE>")]) > > The shifts with rounding can stay of course. > > >+ > >+(define_insn "msa_srari_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand 2 "const_<bitimm>_operand" "")] > >+ UNSPEC_MSA_SRARI))] > >+ "ISA_HAS_MSA" > >+ "srari.<msafmt>\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_shift") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_srli_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand 2 "const_<bitimm>_operand" "")] > >+ UNSPEC_MSA_SRLI))] > >+ "ISA_HAS_MSA" > >+ "srli.<msafmt>\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_shift") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_srlr_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "register_operand" "f")] > >+ UNSPEC_MSA_SRLR))] > >+ "ISA_HAS_MSA" > >+ "srlr.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_shift") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_srlri_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand 2 "const_<bitimm>_operand" "")] > >+ UNSPEC_MSA_SRLRI))] > >+ "ISA_HAS_MSA" > >+ "srlri.<msafmt>\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_shift") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_subs_s_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "register_operand" "f")] > >+ UNSPEC_MSA_SUBS_S))] > >+ "ISA_HAS_MSA" > >+ "subs_s.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > > Due for update in a follow up adding fixed point mode support. Ok. > > >+ > >+(define_insn "msa_subs_u_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "register_operand" "f")] > >+ UNSPEC_MSA_SUBS_U))] > >+ "ISA_HAS_MSA" > >+ "subs_u.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_subsuu_s_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "register_operand" "f")] > >+ UNSPEC_MSA_SUBSUU_S))] > >+ "ISA_HAS_MSA" > >+ "subsuu_s.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_subsus_u_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand:IMSA 2 "register_operand" "f")] > >+ UNSPEC_MSA_SUBSUS_U))] > >+ "ISA_HAS_MSA" > >+ "subsus_u.<msafmt>\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > > likewise to here. > > >+ > >+(define_insn "msa_subvi_<msafmt>" > >+ [(set (match_operand:IMSA 0 "register_operand" "=f") > >+ (unspec:IMSA [(match_operand:IMSA 1 "register_operand" "f") > >+ (match_operand 2 "const_uimm5_operand" "")] > >+ UNSPEC_MSA_SUBVI))] > >+ "ISA_HAS_MSA" > >+ "subvi.<msafmt>\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_int_arith") > >+ (set_attr "mode" "<MODE>")]) > > This should be part of the simple sub<mode>3 pattern and that pattern > should be used by the builtin. Done. > > >+ > >+(define_insn "msa_xori_b" > >+ [(set (match_operand:V16QI 0 "register_operand" "=f") > >+ (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "f") > >+ (match_operand 2 "const_uimm8_operand" "")] > >+ UNSPEC_MSA_XORI_B))] > >+ "ISA_HAS_MSA" > >+ "xori.b\t%w0,%w1,%2" > >+ [(set_attr "type" "simd_logic") > >+ (set_attr "mode" "V16QI")]) > > Similar issues as ANDI and ORI. Likewise. > > >+(define_insn "msa_sld_<msafmt_f>" > >+ [(set (match_operand:MSA 0 "register_operand" "=f") > >+ (unspec:MSA [(match_operand:MSA 1 "register_operand" "0") > >+ (match_operand:MSA 2 "register_operand" "f") > >+ (match_operand:SI 3 "reg_or_0_operand" "dJ")] > >+ UNSPEC_MSA_SLD))] > >+ "ISA_HAS_MSA" > >+ "sld.<msafmt>\t%w0,%w2[%z3]" > >+ [(set_attr "type" "simd_sld") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_sldi_<msafmt_f>" > >+ [(set (match_operand:MSA 0 "register_operand" "=f") > >+ (unspec:MSA [(match_operand:MSA 1 "register_operand" "0") > >+ (match_operand:MSA 2 "register_operand" "f") > >+ (match_operand 3 "const_<indeximm>_operand" "")] > >+ UNSPEC_MSA_SLDI))] > >+ "ISA_HAS_MSA" > >+ "sldi.<msafmt>\t%w0,%w2[%3]" > >+ [(set_attr "type" "simd_sld") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_insn "msa_splat_<msafmt_f>" > >+ [(set (match_operand:MSA 0 "register_operand" "=f") > >+ (unspec:MSA [(match_operand:MSA 1 "register_operand" "f") > >+ (match_operand:SI 2 "reg_or_0_operand" "dJ")] > >+ UNSPEC_MSA_SPLAT))] > >+ "ISA_HAS_MSA" > >+ "splat.<msafmt>\t%w0,%w1[%z2]" > >+ [(set_attr "type" "simd_splat") > >+ (set_attr "mode" "<MODE>")]) > > This doesn't need 'or_0' support as the splati below covers it. I think > this is targetable but seem to recall an off-list discussion that says > it can never be generated even if represented with an appropriate > pattern. That's correct. Another patch for the middle-end is needed but this will be part of the follow up work. > > >+ > >+(define_insn "msa_splati_<msafmt_f>" > >+ [(set (match_operand:MSA 0 "register_operand" "=f") > >+ (unspec:MSA [(match_operand:MSA 1 "register_operand" "f") > >+ (match_operand 2 "const_<indeximm>_operand" "")] > >+ UNSPEC_MSA_SPLATI))] > >+ "ISA_HAS_MSA" > >+ "splati.<msafmt>\t%w0,%w1[%2]" > >+ [(set_attr "type" "simd_splat") > >+ (set_attr "mode" "<MODE>")]) > > This seems targettable with standard RTL. I changed this but I don't expect to be working for the same reason as above. > > >+ > >+;; Operand 1 is a scalar. > >+(define_insn "msa_splati_<msafmt_f>_s" > >+ [(set (match_operand:FMSA 0 "register_operand" "=f") > >+ (unspec:FMSA [(match_operand:<UNITMODE> 1 "register_operand" "f") > >+ (match_operand 2 "const_<indeximm>_operand" "")] > >+ UNSPEC_MSA_SPLATI))] > >+ "ISA_HAS_MSA" > >+ "splati.<msafmt>\t%w0,%w1[%2]" > >+ [(set_attr "type" "simd_splat") > >+ (set_attr "mode" "<MODE>")]) > > I don't understand this pattern. Why is there an element selector for > a scalar input. Isn't it just element 0 hard-coded? It appears so. I renamed to *_scalar and removed the need for selector. Now it's hardcoded to 0. > > >+(define_insn "msa_cfcmsa" > >+ [(set (match_operand:SI 0 "register_operand" "=d") > >+ (unspec_volatile:SI [(match_operand 1 "const_uimm5_operand" "")] > >+ UNSPEC_MSA_CFCMSA))] > >+ "ISA_HAS_MSA" > >+ "cfcmsa\t%0,$%1" > >+ [(set_attr "type" "simd_cmsa") > >+ (set_attr "mode" "SI")]) > >+ > >+(define_insn "msa_ctcmsa" > >+ [(unspec_volatile [(match_operand 0 "const_uimm5_operand" "") > >+ (match_operand:SI 1 "register_operand" "d")] > >+ UNSPEC_MSA_CTCMSA)] > >+ "ISA_HAS_MSA" > >+ "ctcmsa\t$%0,%1" > >+ [(set_attr "type" "simd_cmsa") > >+ (set_attr "mode" "SI")]) > > Just noting that the CTCMSA instruction is defined with arguments backwards > to other ctc* instructions in the base arch. The copro register always goes > second apart from CTCMSA. > > >+ > >+(define_insn "msa_fexdo_h" > >+ [(set (match_operand:V8HI 0 "register_operand" "=f") > >+ (unspec:V8HI [(match_operand:V4SF 1 "register_operand" "f") > >+ (match_operand:V4SF 2 "register_operand" "f")] > >+ UNSPEC_MSA_FEXDO))] > >+ "ISA_HAS_MSA" > >+ "fexdo.h\t%w0,%w1,%w2" > >+ [(set_attr "type" "simd_fcvt") > >+ (set_attr "mode" "V8HI")]) > >+ > >+(define_insn "msa_fexdo_w" > >+ [(set (match_operand:V4SF 0 "register_operand" "=f") > >+ (vec_concat:V4SF > >+ (float_truncate:V2SF (match_operand:V2DF 1 "register_operand" "f")) > >+ (float_truncate:V2SF (match_operand:V2DF 2 "register_operand" "f"))))] > >+ "ISA_HAS_MSA" > >+ "fexdo.w\t%w0,%w2,%w1" > >+ [(set_attr "type" "simd_fcvt") > >+ (set_attr "mode" "V4SF")]) > >+ > >+(define_insn "msa_fexupl_w" > >+ [(set (match_operand:V4SF 0 "register_operand" "=f") > >+ (unspec:V4SF [(match_operand:V8HI 1 "register_operand" "f")] > >+ UNSPEC_MSA_FEXUPL))] > >+ "ISA_HAS_MSA" > >+ "fexupl.w\t%w0,%w1" > >+ [(set_attr "type" "simd_fcvt") > >+ (set_attr "mode" "V4SF")]) > >+ > >+(define_insn "msa_fexupl_d" > >+ [(set (match_operand:V2DF 0 "register_operand" "=f") > >+ (unspec:V2DF [(match_operand:V4SF 1 "register_operand" "f")] > >+ UNSPEC_MSA_FEXUPL))] > >+ "ISA_HAS_MSA" > >+ "fexupl.d\t%w0,%w1" > >+ [(set_attr "type" "simd_fcvt") > >+ (set_attr "mode" "V2DF")]) > > This should be possible with float_extend similar to fexdo_w Done. > > >+ > >+(define_insn "msa_fexupr_w" > >+ [(set (match_operand:V4SF 0 "register_operand" "=f") > >+ (unspec:V4SF [(match_operand:V8HI 1 "register_operand" "f")] > >+ UNSPEC_MSA_FEXUPR))] > >+ "ISA_HAS_MSA" > >+ "fexupr.w\t%w0,%w1" > >+ [(set_attr "type" "simd_fcvt") > >+ (set_attr "mode" "V4SF")]) > > Likewise. > > >+(define_insn "msa_fexupr_d" > >+ [(set (match_operand:V2DF 0 "register_operand" "=f") > >+ (unspec:V2DF [(match_operand:V4SF 1 "register_operand" "f")] > >+ UNSPEC_MSA_FEXUPR))] > >+ "ISA_HAS_MSA" > >+ "fexupr.d\t%w0,%w1" > >+ [(set_attr "type" "simd_fcvt") > >+ (set_attr "mode" "V2DF")]) > >+ > >+(define_insn "msa_branch_nz_v_<msafmt_f>" > >+ [(set (pc) (if_then_else > >+ (ne (unspec:SI [(match_operand:MSA 1 "register_operand" "f")] > >+ UNSPEC_MSA_BNZ_V) > >+ (match_operand:SI 2 "const_0_operand")) > >+ (label_ref (match_operand 0)) > >+ (pc)))] > >+ "ISA_HAS_MSA" > >+{ > >+ return mips_output_conditional_branch (insn, operands, > >+ MIPS_BRANCH ("bnz.v", "%w1,%0"), > >+ MIPS_BRANCH ("bz.v", "%w1,%0")); > >+} > >+ [(set_attr "type" "simd_branch") > >+ (set_attr "mode" "TI")]) > > This needs updating for compact branch logic. See the floating point branches > for reference, it must be attribute compact_form==never. > This can be a proper RTL pattern quite easily with a comparison against > a const_vector with all zeros. The eq and ne cases should be merged > together with a template (see other branches). > > >+ > >+(define_expand "msa_bnz_v_<msafmt_f>" > >+ [(set (match_operand:SI 0 "register_operand" "=d") > >+ (unspec:SI [(match_operand:MSA 1 "register_operand" "f")] > >+ UNSPEC_MSA_TSTNZ_V))] > >+ "ISA_HAS_MSA" > >+{ > >+ mips_expand_msa_branch (operands, gen_msa_branch_nz_v_<MSA:msafmt_f>); > >+ DONE; > >+}) > > I find these instruction definitions very weird. Why is it a good thing to > expose branches as intrinsics that end up with a GPR value that has to then > be used in another branch to create control flow? > Iff there is some value to these then the function name for expanding them > is a bit misleading as it is not really branching anywhere it is just setting > a GPR to 1 or 0 which happens to include a set of if-then-else branchs. It > would be nice to avoid having a define_expand at all here and use a custom > builtin type to expand it entirely in C. AFAICS, there is no way to have control flow intrinsics. With this approach, indeed, it looks odd to set a GP register with 0/1, however, if this is combined with passes like -fgcse and -ftree-loop-vectorize this seemingly silly double branching are optimized quite well e.g. for "if (__builtin_msa_bnz_b (...)) {} else {}". That's what I observed from test cases. I removed all define_expands and reworked it to have more specialized built-ins that set a GP reg to 0/1. A new MIPS_BUILTIN_MSA_TEST_BRANCH built-in type has been introduced. > > >+(define_insn "msa_branchz_v_<msafmt_f>" > >+ [(set (pc) (if_then_else > >+ (eq (unspec:SI [(match_operand:MSA 1 "register_operand" "f")] > >+ UNSPEC_MSA_BZ_V) > >+ (match_operand:SI 2 "const_0_operand")) > >+ (label_ref (match_operand 0)) > >+ (pc)))] > >+ "ISA_HAS_MSA" > >+{ > >+ return mips_output_conditional_branch (insn, operands, > >+ MIPS_BRANCH ("bz.v", "%w1,%0"), > >+ MIPS_BRANCH ("bnz.v", "%w1,%0")); > >+} > >+ [(set_attr "type" "simd_branch") > >+ (set_attr "mode" "TI")]) > > Merge with msa_branch_nz_v_<msafmt_f>. Done. > > >+(define_expand "msa_bz_v_<msafmt_f>" > >+ [(set (match_operand:SI 0 "register_operand" "=d") > >+ (unspec:SI [(match_operand:MSA 1 "register_operand" "f")] > >+ UNSPEC_MSA_TSTZ_V))] > >+ "ISA_HAS_MSA" > >+{ > >+ mips_expand_msa_branch (operands, gen_msa_branchz_v_<MSA:msafmt_f>); > >+ DONE; > >+}) > > Similar to msa_bnz_v_<msafmt_f>; > > >+(define_insn "msa_branchnz_<msafmt_f>" > >+ [(set (pc) (if_then_else > >+ (ne (unspec:SI [(match_operand:MSA 1 "register_operand" "f")] > >+ UNSPEC_MSA_BNZ) > >+ (match_operand:SI 2 "const_0_operand")) > >+ (label_ref (match_operand 0)) > >+ (pc)))] > >+ "ISA_HAS_MSA" > >+{ > >+ return mips_output_conditional_branch (insn, operands, > >+ MIPS_BRANCH ("bnz.<msafmt>", "%w1,%0"), > >+ MIPS_BRANCH ("bz.<msafmt>", "%w1,%0")); > >+ > >+} > >+ > > whitespace. > > >+ [(set_attr "type" "simd_branch") > >+ (set_attr "mode" "<MODE>")]) > > As above I find the branch instructions a bit pointless unless we can use them > as actual branches. This branch could be represented as an 'IOR' of all > elements > and a test against zero. I don't know if that can be generated. These need > some more thought but it can be done as a follow on. These patterns do need > compact_form=never applying. As in the off-list discussion, the branch instructions should be done as part of a follow-on work. > > >+ > >+(define_expand "msa_bnz_<msafmt>" > >+ [(set (match_operand:SI 0 "register_operand" "=d") > >+ (unspec:SI [(match_operand:IMSA 1 "register_operand" "f")] > >+ UNSPEC_MSA_TSTNZ))] > >+ "ISA_HAS_MSA" > >+{ > >+ mips_expand_msa_branch (operands, gen_msa_branchnz_<IMSA:msafmt>); > >+ DONE; > >+}) > >+ > >+(define_insn "msa_branchz_<msafmt>" > >+ [(set (pc) (if_then_else > >+ (eq (unspec:SI [(match_operand:IMSA 1 "register_operand" "f")] > >+ UNSPEC_MSA_BZ) > >+ (match_operand:IMSA 2 "const_0_operand")) > >+ (label_ref (match_operand 0)) > >+ (pc)))] > >+ "ISA_HAS_MSA" > >+{ > >+ return mips_output_conditional_branch (insn, operands, > >+ MIPS_BRANCH ("bz.<msafmt>", "%w1,%0"), > >+ MIPS_BRANCH ("bnz.<msafmt>","%w1,%0")); > >+} > >+ [(set_attr "type" "simd_branch") > >+ (set_attr "mode" "<MODE>")]) > >+ > >+(define_expand "msa_bz_<msafmt>" > >+ [(set (match_operand:SI 0 "register_operand" "=d") > >+ (unspec:SI [(match_operand:IMSA 1 "register_operand" "f")] > >+ UNSPEC_MSA_TSTZ))] > >+ "ISA_HAS_MSA" > >+{ > >+ mips_expand_msa_branch (operands, gen_msa_branchz_<IMSA:msafmt>); > >+ DONE; > >+}) > >+ > >+;; Note that this instruction treats scalar as vector registers freely. > >+(define_insn "msa_cast_to_vector_<msafmt_f>" > >+ [(set (match_operand:FMSA 0 "register_operand" "=f") > >+ (unspec:FMSA [(match_operand:<UNITMODE> 1 "register_operand" "f")] > >+ UNSPEC_MSA_CAST_TO_VECTOR))] > >+ "ISA_HAS_MSA" > >+{ > >+ if (REGNO (operands[0]) == REGNO (operands[1])) > >+ return "nop\t# Cast %1 to %w0"; > >+ else > >+ return "mov.<unitfmt>\t%0,%1\t# Cast %1 to %w0"; > >+} > >+ [(set_attr "type" "arith") > >+ (set_attr "mode" "TI")]) > > This is simply INSVE but implemented in a weird way. Unless you can explain > their value please delete them :-) > > >+ > >+;; Note that this instruction treats vector as scalar registers freely. > >+(define_insn "msa_cast_to_scalar_<msafmt_f>" > >+ [(set (match_operand:<UNITMODE> 0 "register_operand" "=f") > >+ (unspec:<UNITMODE> [(match_operand:FMSA 1 "register_operand" "f")] > >+ UNSPEC_MSA_CAST_TO_SCALAR))] > >+ "ISA_HAS_MSA" > >+{ > >+ if (REGNO (operands[0]) == REGNO (operands[1])) > >+ return "nop\t# Cast %w1 to %0"; > >+ else > >+ return "mov.<unitfmt>\t%0,%1\t# Cast %w1 to %0"; > >+} > >+ [(set_attr "type" "arith") > >+ (set_attr "mode" "TI")]) > > Likewise. Update vec_extract to use a simple move with a subreg instead > and leave the other patterns to sort it out. There may be a bit of complexity > to deal with here but this certainly looks like the wrong way to solve the > problem to me. I presume these were added to extract floating point result. At the time, the standard RTL (vec_select and others) wasn't used a lot or whatsoever. I removed the two above and added a msa_vec_extract_<msafmt_f> pattern to extract the element at 0 index and turn it into floating point move instruction, if needed. "extract" is probably not the best description here as it is more of a conversion between vector and scalar modes. The built-ins and info from the documentation are also removed. Regards, Robert