Hi Segher,

on 2022/11/18 23:10, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Nov 17, 2022 at 02:59:00PM +0800, Kewen.Lin wrote:
>> on 2022/11/17 02:44, Segher Boessenkool wrote:
>>> On Wed, Nov 16, 2022 at 02:48:25PM +0800, Kewen.Lin wrote:
>>>>    * config/rs6000/rs6000.cc (rs6000_emit_vector_compare_inner): Remove
>>>>    float only comparison operators.
>>>
>>> Why?  Is that correct?  Your mail says nothing about this :-(
>>>
>>> Is there any testcase that covers this, and that shows things still
>>> generate the same code?
>>>
>>
>> Sorry for the unclear description, I thought mistakenly that it's
>> probably straightforward.
>>
>> With the change in this patch, all 14 vector float comparison operators
>> (unordered/ordered/eq/ne/gt/lt/ge/le/ungt/unge/unlt/unle/uneq/ltgt)
>> would be handled early in rs6000_emit_vector_compare.
>>
>> For unordered/ordered/ltgt/uneq, the new way is exactly the same
>> as what we do in rs6000_emit_vector_compare_inner, it means there is
>> no chance to get into rs6000_emit_vector_compare_inner with any of them.
> 
> Ah!  In that case, please add an assert there.  It helps catch problems,
> but much more importantly even, if helps the reader understand what is
> going on :-)

Good idea, will do.

> 
>> For eq/ge/gt, it's the same too, but they are shared with vector integer
>> comparison, I just left them alone here.  Just noticed we can remove ge
>> safely too as it's guarded with !MODE_VECTOR_INT.
> 
> ge is nasty for float, it means something different with and without
> -ffast-math (with fast-math ge means not lt, le means not gt; both can
> be done with a simple single condition, no cror needed.  (Compare to ne
> which is the same with and without -ffast-math, that is because it has a
> "not" in its definition!)
> 

It's true for scalar float comparison, but the context here is for vector
comparison, the result of comparison is still vector (of boolean), and we
have the corresponding vector comparison instruction for ge, so I think it
should be fine here.

>> For ne/ungt/unlt/unge/unle, rs6000_emit_vector_compare changes the code
>> with reverse_condition_maybe_unordered and invert the result, it's the
>> same as what we have in vector.md.
>>
>> ; unge(a,b) = ~lt(a,b)
>> ; unle(a,b) = ~gt(a,b)
>> ; ne(a,b)   = ~eq(a,b)
>> ; ungt(a,b) = ~le(a,b)
>> ; unlt(a,b) = ~ge(a,b)
> 
> But for these last two do we generate identical code still?  Since
> forever we have only use cror here (with CCEQ), not crnor etc. (and will
> CCEQ still do the correct thing always then?)

For ge (~ge), yes; while for le (~le), it's not, as explained previously below.

> 
>> Then eq/ge/gt on the right side would match the cases that were mentioned
>> above.  So we just need to focus on lt and le then.
>>
>> For lt, rs6000_emit_vector_compare swaps operands and the operator to gt,
>> it's the same as what we have in vector.md:
>>
>> ; lt(a,b)   = gt(b,a)
>>
>> , and further matches the case mentioned above.
>>
>> As to le, rs6000_emit_vector_compare tries to split it into lt IOR eq,
>> and further handle lt recursively, that is:
>>    le = lt(a,b) || eq(a,b)
>>       = gt(b,a) || eq(a,b)
>>
>> actually this is worse than what vector.md supports:
>>
>> ; le(a,b)   = ge(b,a)
>>
>> In short, the function rs6000_emit_vector_compare_inner is only called by
>> twice in rs6000_emit_vector_compare, there is no chance to enter
>> rs6000_emit_vector_compare_inner with codes unordered/ordered/ltgt/uneq
>> any more, I think it's safe to make the change in function
>> rs6000_emit_vector_compare_inner.  Besides, the proposed way to handle
>> vector float comparison can improve slightly for UNGT and LE handlings.
> 
> Thanks for the explanation!
> 
> Can you do this in multiple steps, which will make it much easier to
> review, and to spot the problem if some unexpected problem shows up?

Sure, I'll try my best to separate it into some steps and show how it
evolves gradually.

> 
>> I constructed a test case, compiled with option -O2 -ftree-vectorize
>> -fno-vect-cost-model on ppc64le, which goes into this function
>> rs6000_emit_vector_compare with all 14 vector float comparison codes,
>> the assembly of most functions doesn't change after this patch,
>> excepting for test_UNGT_{float,double} and test_LE_{float,double}.
> 
> For, this is a separate change, a separate and the other patches will
> show no changes in generated code at all.

Good point, will separate it.

> 
>> Maybe it's good to add one test case with function 
>> test_{UNGT,LE}_{float,double}
>> and scan not xvcmp{gt,eq}[sd]p.
> 
> In the patch that changes code gen for those, sure :-)
> 

Thanks for all the comments again.

BR,
Kewen

Reply via email to