Hi, in general LGTM, just minor nits and comments.
> - void set_len_and_policy (rtx len, bool force_vlmax = false)
> - {
> - bool vlmax_p = force_vlmax;
> - gcc_assert (has_dest);
> + void set_len_and_policy (rtx len, bool force_vlmax = false, bool ta_p = true,
> + bool ma_p = true)
> + {
> + bool vlmax_p = force_vlmax;
> + gcc_assert (has_dest);

Indentation?

> m_inner_mode = GET_MODE_INNER (mode);
> - m_inner_size = GET_MODE_BITSIZE (m_inner_mode).to_constant ();
> + m_inner_size = GET_MODE_BITSIZE (m_inner_mode);
> + m_inner_units = GET_MODE_SIZE (m_inner_mode);

I find it a bit misleading to call this units here. Granted, it's an inner
mode (i.e. referring to "bytes"), but in the context of vector modes I'm
likely to think of a vector "unit" or lane. What about m_inner_size_bytes
or m_inner_size_units?

> +bool
> +rvv_builder::repeating_sequence_use_merge_profitable_p ()
> +{
> +  return repeating_sequence_p (0, full_nelts ().to_constant (), npatterns ())
> +         && inner_units () <= UNITS_PER_WORD
> +         && 3 * npatterns () < full_nelts ().to_constant ();
> +}

I appreciate the explanatory comment, and comparing the number of
instructions is good enough for now. In the future, given the different
uarchs, we will want a proper cost comparison.

> +/* Get the mask for merge approach.
> +
> +     Consider such following case:
> +       {a, b, a, b, a, b, a, b, a, b, a, b, a, b, a, b}
> +     To merge "a", the mask should be 1010....
> +     To merge "a", the mask should be 0101....
> +*/

Second line should be "b".

> +/* Emit merge instruction.  */
> +
> +static void
> +emit_merge_op (rtx dest, rtx src1, rtx src2, rtx mask)
> +{
> +  insn_expander<8> e;
> +  machine_mode mode = GET_MODE (dest);
> +  e.set_dest_and_mask (NULL_RTX, dest, GET_MODE (mask), true, true);
> +  e.add_input_operand (src1, mode);
> +  if (VECTOR_MODE_P (GET_MODE (src2)))
> +    e.add_input_operand (src2, mode);
> +  else
> +    e.add_input_operand (src2, GET_MODE_INNER (mode));
> +
> +  e.add_input_operand (mask, GET_MODE (mask));
> +  e.set_len_and_policy (NULL_RTX, true, true, false);
> +  if (VECTOR_MODE_P (GET_MODE (src2)))
> +    e.expand (code_for_pred_merge (mode), false);
> +  else
> +    e.expand (code_for_pred_merge_scalar (mode), false);
> +}

Looks a lot like binop. Might need another round of wrappers soon :)

Regards
 Robin
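
P.S. To make the mask comment concrete, here is a minimal standalone sketch
in plain C++ (not GCC code, nothing here uses rtx or poly_int). The names
merge_mask and merge_profitable are hypothetical. merge_profitable only
mirrors the two arithmetic conditions from
repeating_sequence_use_merge_profitable_p and assumes the caller has
already established that the sequence repeats.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of the patch: build the merge mask for
// pattern `which` of a sequence that repeats with period `npatterns`.
// Character i of the result is '1' when element i belongs to that pattern.
std::string merge_mask(unsigned nelts, unsigned npatterns, unsigned which)
{
  assert(npatterns > 0 && which < npatterns);
  std::string mask(nelts, '0');
  for (unsigned i = which; i < nelts; i += npatterns)
    mask[i] = '1';
  return mask;
}

// Hypothetical stand-in for the instruction-count heuristic quoted above:
// the element must fit in a scalar register and the pattern count must be
// small relative to the number of elements.  The repeating-sequence check
// itself is assumed to have been done by the caller.
bool merge_profitable(unsigned nelts, unsigned npatterns,
                      unsigned inner_bytes, unsigned units_per_word)
{
  return inner_bytes <= units_per_word && 3 * npatterns < nelts;
}
```

For {a, b, a, b, a, b, a, b}, merge_mask (8, 2, 0) yields "10101010" for "a"
and merge_mask (8, 2, 1) yields "01010101" for "b", which is what the
corrected comment should describe.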