Richard Sandiford <richard.sandif...@arm.com> writes:
>> ;; Predicated float-to-integer conversion, either to the same width or
>> wider.
>> (define_insn
>> "@aarch64_sve_<optab>_nontrunc<SVE_FULL_F:mode><SVE_FULL_HSDI:mode>"
>> [(set (match_operand:SVE_FULL_HSDI 0 "register_operand")
>> @@ -9517,18 +9537,34 @@
>> }
>> )
>>
>> +;; As above, for pairs used by the auto-vectorizer only.
>> +(define_insn
>> "*aarch64_sve_<optab>_nontrunc<SVE_PARTIAL_F:mode><SVE_HSDI:mode>"
>> + [(set (match_operand:SVE_HSDI 0 "register_operand")
>> + (unspec:SVE_HSDI
>> + [(match_operand:<SVE_HSDI:VPRED> 1 "aarch64_predicate_operand")
>> + (match_operand:SI 3 "aarch64_sve_gp_strictness")
>> + (match_operand:SVE_PARTIAL_F 2 "register_operand")]
>> + SVE_COND_FCVTI))]
>> + "TARGET_SVE
>> + && (~(<SVE_HSDI:self_mask> | <SVE_HSDI:narrower_mask>) &
>> <SVE_PARTIAL_F:self_mask>) == 0"
>> + {@ [ cons: =0 , 1 , 2 ; attrs: movprfx ]
>> + [ w , Upl , 0 ; * ]
>> fcvtz<su>\t%0.<SVE_HSDI:Vetype>, %1/m, %2.<SVE_PARTIAL_F:Vetype>
>> + [ ?&w , Upl , w ; yes ] movprfx\t%0,
>> %2\;fcvtz<su>\t%0.<SVE_HSDI:Vetype>, %1/m, %2.<SVE_PARTIAL_F:Vetype>
>> + }
>> +)
>> +
>> ;; Predicated narrowing float-to-integer conversion.
>
> I think it would be worth extending the comment here, in case it isn't
> obvious what's going on:
>
> ;; Predicated narrowing float-to-integer conversion. The VNx2DF->VNx4SI
> ;; variant is provided for the ACLE, where the zeroed odd-indexed lanes are
> ;; significant. The VNx2DF->VNx2SI variant is provided for autovectorization,
> ;; where the odd-indexed lanes are ignored.

I suppose I should have said "where the upper 32 bits of each container are ignored".
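
To make the "container" wording concrete, here is a rough C model of the two views of the same result. It is only a sketch, not GCC code: the function names and the two-container vector length are made up for illustration, and it assumes in-range inputs, since a plain C cast does not saturate the way FCVTZS does.

```c
#include <stdint.h>

/* Hypothetical vector length for the model: two 64-bit containers.  */
#define NUM_CONTAINERS 2

/* Model of the narrowing DF->SI conversion: each double is truncated
   towards zero to a 32-bit integer, which is written to the low 32 bits
   of its 64-bit container.  The upper 32 bits are zeroed.
   Assumption: inputs are in range for int32_t.  */
static void
model_fcvtzs_d_to_s (const double *src, uint64_t *dst)
{
  for (int i = 0; i < NUM_CONTAINERS; i++)
    dst[i] = (uint64_t) (uint32_t) (int32_t) src[i];
}

/* VNx4SI view: the register is a packed vector of 32-bit lanes, so
   lane 2*I is the low half of container I and lane 2*I+1 is the upper
   half.  The zeroed odd-indexed lanes are part of the value.  */
static uint32_t
lane_vnx4si (const uint64_t *v, int n)
{
  return (uint32_t) (v[n / 2] >> (32 * (n % 2)));
}

/* VNx2SI view: element N is the low 32 bits of container N.  The upper
   32 bits of the container are not part of the value and are ignored.  */
static int32_t
elt_vnx2si (const uint64_t *v, int n)
{
  return (int32_t) (uint32_t) v[n];
}
```

With inputs { 3.75, -2.5 }, the VNx4SI view reads lanes 3, 0, 0xfffffffe, 0, whereas the VNx2SI view reads just the two elements 3 and -2. So "odd-indexed lanes" only really makes sense for the first view; for the second, it is the upper half of each container that goes unread.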