Richard Sandiford <richard.sandif...@arm.com> writes:
>>  ;; Predicated float-to-integer conversion, either to the same width or wider.
>>  (define_insn "@aarch64_sve_<optab>_nontrunc<SVE_FULL_F:mode><SVE_FULL_HSDI:mode>"
>>    [(set (match_operand:SVE_FULL_HSDI 0 "register_operand")
>> @@ -9517,18 +9537,34 @@
>>    }
>>  )
>>  
>> +;; As above, for pairs used by the auto-vectorizer only.
>> +(define_insn "*aarch64_sve_<optab>_nontrunc<SVE_PARTIAL_F:mode><SVE_HSDI:mode>"
>> +  [(set (match_operand:SVE_HSDI 0 "register_operand")
>> +    (unspec:SVE_HSDI
>> +      [(match_operand:<SVE_HSDI:VPRED> 1 "aarch64_predicate_operand")
>> +       (match_operand:SI 3 "aarch64_sve_gp_strictness")
>> +       (match_operand:SVE_PARTIAL_F 2 "register_operand")]
>> +      SVE_COND_FCVTI))]
>> +   "TARGET_SVE
>> +   && (~(<SVE_HSDI:self_mask> | <SVE_HSDI:narrower_mask>) & <SVE_PARTIAL_F:self_mask>) == 0"
>> +  {@ [ cons: =0 , 1   , 2 ; attrs: movprfx ]
>> +     [ w        , Upl , 0 ; *              ] fcvtz<su>\t%0.<SVE_HSDI:Vetype>, %1/m, %2.<SVE_PARTIAL_F:Vetype>
>> +     [ ?&w      , Upl , w ; yes            ] movprfx\t%0, %2\;fcvtz<su>\t%0.<SVE_HSDI:Vetype>, %1/m, %2.<SVE_PARTIAL_F:Vetype>
>> +  }
>> +)
>> +
>>  ;; Predicated narrowing float-to-integer conversion.
>
> I think it would be worth extending the comment here, in case it isn't
> obvious what's going on:
>
> ;; Predicated narrowing float-to-integer conversion.  The VNx2DF->VNx4SI
> ;; variant is provided for the ACLE, where the zeroed odd-indexed lanes are
> ;; significant.  The VNx2DF->VNx2SI variant is provided for autovectorization,
> ;; where the odd-indexed lanes are ignored.

I suppose I should have said "where the upper 32 bits of each container
are ignored".
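
To make the container wording concrete, here is a rough Python model of the
two views. It is only a sketch, not GCC or ACLE code: the helper names are
made up for illustration, each 64-bit container is modelled as a plain
integer, and the NaN-to-zero and saturation behaviour is my assumption about
how FCVTZS treats out-of-range inputs.

```python
import math

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def fcvtzs_d_to_s(values):
    """Hypothetical model of a double -> int32 conversion.

    Each result is one 64-bit container: the converted value sits in the
    low 32 bits and the upper 32 bits are zero.
    """
    containers = []
    for v in values:
        if math.isnan(v):
            r = 0                      # assumption: NaN converts to 0
        elif v >= 2**31:
            r = INT32_MAX              # assumption: saturate on overflow
        elif v <= -2**31:
            r = INT32_MIN
        else:
            r = math.trunc(v)          # round toward zero
        containers.append(r & 0xFFFFFFFF)
    return containers


def to_signed32(x):
    """Reinterpret a 32-bit pattern as a signed integer."""
    return x - 2**32 if x & 0x80000000 else x


def view_as_vnx4si(containers):
    """ACLE-style view: every 32-bit lane is visible, including the
    odd-indexed lanes that hold the upper half of each container."""
    lanes = []
    for c in containers:
        lanes.append(to_signed32(c & 0xFFFFFFFF))   # even-indexed lane
        lanes.append(to_signed32(c >> 32))          # odd-indexed lane
    return lanes


def view_as_vnx2si(containers):
    """Autovectorizer-style view: one 32-bit element per 64-bit
    container; the upper 32 bits are never looked at."""
    return [to_signed32(c & 0xFFFFFFFF) for c in containers]
```

In this model the VNx4SI view of converting [1.9, -2.5] exposes the zeroed
odd-indexed lanes, while the VNx2SI view gives the same answer whatever the
upper 32 bits of each container happen to hold.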
