on 2019/7/17 4:42 PM, Jakub Jelinek wrote:
> On Wed, Jul 17, 2019 at 04:32:15PM +0800, Kewen.Lin wrote:
>> --- a/gcc/config/rs6000/vector.md
>> +++ b/gcc/config/rs6000/vector.md
>> @@ -1260,6 +1260,32 @@
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>>    "")
>>
>> +;; Expanders for rotatert to make use of vrotl
>> +(define_expand "vrotr<mode>3"
>> +  [(set (match_operand:VEC_I 0 "vint_operand")
>> +        (rotatert:VEC_I (match_operand:VEC_I 1 "vint_operand")
>> +              (match_operand:VEC_I 2 "vint_reg_or_const_vector")))]
>> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>> +{
>> +  machine_mode inner_mode = GET_MODE_INNER (<MODE>mode);
>> +  unsigned int bits = GET_MODE_PRECISION (inner_mode);
>> +  rtx imm_vec = gen_const_vec_duplicate (<MODE>mode, GEN_INT (bits));
>> +  rtx rot_count = gen_reg_rtx (<MODE>mode);
>> +  if (GET_CODE (operands[2]) == CONST_VECTOR)
>> +    {
>> +      imm_vec = simplify_const_binary_operation (MINUS, <MODE>mode, imm_vec,
>> +                                                 operands[2]);
>> +      rot_count = force_reg (<MODE>mode, imm_vec);
>> +    }
>> +  else
>> +    {
>> +      rtx imm_reg = force_reg (<MODE>mode, imm_vec);
>> +      emit_insn (gen_sub<mode>3 (rot_count, imm_reg, operands[2]));
>> +    }
>
> Is this actually correct if one or more elements in operands[2] are 0?
> If vrotl<mode>3 acts with truncated shift count, that is not an issue
> (but then perhaps you wouldn't have to compute imm_reg - operands[2] but
> just - operands[2]), but if it does something else, then prec - 0 will be
> prec and thus outside of the allowed rotate count.  Or does rs6000 allow
> rotate counts to be 0 to prec inclusive?
>
>	Jakub
>

Hi Jakub,

Good question. The vector rotate for bytes looks like this (the other
element sizes are similar):

  vrlb VRT,VRA,VRB

    do i=0 to 127 by 8
      sh = (VRB)[i+5:i+7]
      VRT[i:i+7] = (VRA)[i:i+7] <<< sh
    end

It only uses the low log2(prec) bits of each count, i.e. counts from 0 to
prec-1 (inclusive).  So it's fine even if elements of operands[2] are zero
or negative.

Take byte as an example, where prec is 8:
  - rot count is 0: the subtraction gives 8, which is outside the 3-bit
    range and is read as 0, the same as the original count.
  - rot count is 9: the subtraction gives -1, whose low 3 bits are read
    as 7; the original count 9 is read as 1 (in the 3-bit range).
  - rot count is -1: the subtraction gives 9, whose low 3 bits are read
    as 1; the original count is read as 7 (in the 3-bit range).

It's a good idea to just use negate!

Thanks!!
Kewen
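
P.S. The byte case above can be checked with a small standalone C sketch.
This is not GCC code: rotl8, rotr8_ref and neg_matches_sub are
hypothetical helper names, and the model assumes that a byte lane of vrlb
takes its count from the low 3 bits only, as in the pseudocode above. The
reference right rotate is likewise assumed to reduce its count modulo 8.

```c
#include <stdint.h>

/* Hypothetical model of one byte lane of vrlb: rotate left, reading only
   the low 3 bits of the count (an assumption taken from the pseudocode).  */
static uint8_t rotl8(uint8_t x, int count)
{
    unsigned sh = (unsigned)count & 7u;
    return (uint8_t)((x << sh) | (x >> ((8u - sh) & 7u)));
}

/* Reference rotate right, with the count reduced modulo 8 (assumption).  */
static uint8_t rotr8_ref(uint8_t x, int count)
{
    unsigned sh = (unsigned)count & 7u;
    return (uint8_t)((x >> sh) | (x << ((8u - sh) & 7u)));
}

/* Returns 1 if, for every byte value and every count in [-16, 16],
   rotating left by (8 - count) and by (-count) both match a right
   rotate by count; returns 0 on the first mismatch.  */
static int neg_matches_sub(void)
{
    for (int x = 0; x < 256; x++)
        for (int count = -16; count <= 16; count++) {
            uint8_t want = rotr8_ref((uint8_t)x, count);
            if (rotl8((uint8_t)x, 8 - count) != want)
                return 0;
            if (rotl8((uint8_t)x, -count) != want)
                return 0;
        }
    return 1;
}
```

Calling neg_matches_sub() from a main function returns 1 under these
assumptions, since 8 - count and -count agree in their low 3 bits. That is
why a plain negate is enough when the rotate truncates its count.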