Re: [PATCH rs6000]Fix PR92132

Kewen.Lin Thu, 07 Nov 2019 18:39:21 -0800

Hi Segher,

on 2019/11/8 上午8:07, Segher Boessenkool wrote:
> Hi!
> 
>>> Half are pretty simple:
>>>
>>> lt(a,b) = gt(b,a)
>>> gt(a,b) = gt(a,b)
>>> eq(a,b) = eq(a,b)
>>> le(a,b) = ge(b,a)
>>> ge(a,b) = ge(a,b)
>>>
>>> ltgt(a,b) = ge(a,b) ^ ge(b,a)
>>> ord(a,b)  = ge(a,b) | ge(b,a)
>>>
>>> The other half are the negations of those:
>>>
>>> unge(a,b) = ~gt(b,a)
>>> unle(a,b) = ~gt(a,b)
>>> ne(a,b)   = ~eq(a,b)
>>> ungt(a,b) = ~ge(b,a)
>>> unlt(a,b) = ~ge(a,b)
>>>
>>> uneq(a,b) = ~(ge(a,b) ^ ge(b,a))
>>> un(a,b) = ~(ge(a,b) | ge(b,a))
>>
>> Awesome!  Do you suggest refactoring on them?  :)
> 
> I'd do the first five in one pattern (which then swaps two ops and the
> condition in the lt and le case), and the other five in another pattern.
> And the rest in two or four patterns?  Just try it out, see what works
> well.  It helps to do a bunch together in one pattern, but if that then
> turns into special cases for everything, more might be lost than gained.>


Got it, I'll make a refactoring patch for this part later.

> 
>>> 8 codes, ordered:    never     lt   gt   ltgt eq   le   ge   ordered
>>> 8 codes, unordered:  unordered unlt ungt ne   uneq unle unge always
>>> 8 codes, fast-math:  never     lt   gt   ne   eq   le   ge   always
>>> 8 codes, non-fp:     never     lt   gt   ne   eq   le   ge   always
>>
>> Sorry, I don't quite follow this table.  What's the column heads?
> 
> The first row is the eight possible fp conditions that are not always
> true if unordered is set; the second row is those that *are* always true
> if it is set.  The other two rows (which are the same) is just the eight
> conditions that do not test unordered at all.
> 
> The tricky one is "ne": for FP *with* NaNs, "ne" means "less than, or
> greater than, or unordered", while without NaNs (i.e. -ffast-math) it
> means "less than, or greater than".
> 
> You could write the column heads as
> --/--/--  lt/--/--  --/gt/--  lt/gt/--  --/--/eq  lt/--/eq  --/gt/eq  lt/gt/eq
> if that helps?  Just the eight combinations of the first free flags.
> 

Thanks a lot for the explanation.  It's helpful!
 

>> +;; For signed integer vectors comparison.
>> +(define_expand "vec_cmp<mode><mode>"
>> +  [(set (match_operand:VEC_I 0 "vint_operand")
>> +    (match_operator 1 "signed_or_equality_comparison_operator"
>> +      [(match_operand:VEC_I 2 "vint_operand")
>> +       (match_operand:VEC_I 3 "vint_operand")]))]
>> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>> +{
>> +  enum rtx_code code = GET_CODE (operands[1]);
>> +  rtx tmp = gen_reg_rtx (<MODE>mode);
>> +  switch (code)
>> +    {
>> +    case NE:
>> +      emit_insn (gen_vector_eq<mode> (operands[0], operands[2], 
>> operands[3]));
>> +      emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
>> +      break;
>> +    case EQ:
>> +      emit_insn (gen_vector_eq<mode> (operands[0], operands[2], 
>> operands[3]));
>> +      break;
>> +    case GE:
>> +      emit_insn (gen_vector_nlt<mode> (operands[0],operands[2], operands[3],
>> +                                   tmp));
>> +      break;
>> +    case GT:
>> +      emit_insn (gen_vector_gt<mode> (operands[0], operands[2], 
>> operands[3]));
>> +      break;
>> +    case LE:
>> +      emit_insn (gen_vector_ngt<mode> (operands[0], operands[2], 
>> operands[3],
>> +                                   tmp));
>> +      break;
>> +    case LT:
>> +      emit_insn (gen_vector_gt<mode> (operands[0], operands[3], 
>> operands[2]));
>> +      break;
>> +    default:
>> +      gcc_unreachable ();
>> +      break;
>> +    }
>> +  DONE;
>> +})
> 
> I would think this can be done easier, but it is alright for now, it can
> be touched up later if we want.
> 
>> +;; For float point vectors comparison.
>> +(define_expand "vec_cmp<mode><VEC_int>"
> 
> This, too.
> 
>> +  [(set (match_operand:<VEC_INT> 0 "vint_operand")
>> +     (match_operator 1 "comparison_operator"
> 
> If you make an iterator for this instead, it is simpler code (you can then
> use <code> to do all these cases in one statement).

If my understanding is correct and based on some tries before, I think we
have to leave these **CASEs** there (at least at the 1st level define_expand
for vec_cmp*), since vec_cmp* doesn't have <code> field in the pattern name.
The code can be only extracted from operator 1.  I tried to add one dummy
operand to hold <code> but it's impractical.

Sorry, I may miss something here, I'm happy to make a subsequent patch to
uniform these cases if there is a good way to run a code iterator on them.

> 
> But that can be done later.  Okay for trunk.  Thanks!
> 

Many thanks for your time!


BR,
Kewen

Re: [PATCH rs6000]Fix PR92132

Reply via email to