Re: Truncate optimisation question

Richard Sandiford Tue, 03 Dec 2013 11:11:10 -0800

Eric Botcazou <ebotca...@adacore.com> writes:
>> For code:
>> 
>> unsigned char foo (unsigned char c)
>> {
>>    return (c >= '0') && (c <= '9');
>> }
>> 
>> we end up generating:
>> 
>>          sub     r0, r0, #48
>>          uxtb    r0, r0
>>          cmp     r0, #9
>>          movhi   r0, #0
>>          movls   r0, #1
>>          bx      lr
>> 
>> The extra uxtb (extend) is causing the test to fail. We started generating
>> the extra extend when a particular optimisation went in with (revision
>> r191928).
>
> That's PR rtl-optimization/58295.  We have pessimizations on the SPARC too.
>
>> The comment in simplify-rtx.c says it transforms
>> (truncate:SI (op:DI (x:DI) (y:DI)))
>> 
>> into
>> 
>> (op:SI (truncate:SI (x:DI)) (truncate:SI (x:DI)))
>> 
>> but from what I can see it also transforms
>> 
>> (truncate:QI (op:SI (x:SI) (y:SI)))
>> 
>> into
>> 
>> (op:QI (truncate:QI (x:SI)) (truncate:QI (x:SI)))
>> 
>>  From the combine dump I see that the sub and extend operations come from
>> the RTL:
>> 
>> (insn 6 3 7 2 (set (reg:SI 116)
>>          (plus:SI (reg:SI 0 r0 [ c ])
>>              (const_int -48 [0xffffffffffffffd0])))
>> 
>> (insn 7 6 8 2 (set (reg:SI 117)
>>          (zero_extend:SI (subreg:QI (reg:SI 116) 0)))
>> 
>> 
>> If I add a QImode compare pattern to the arm backend it gets matched and the
>> extra extend goes away, but it seems to me that that's not the correct
>> solution. Ideally, if a QImode operation is performed as an SImode
>> operation on a target (like the sub and compare operations on arm) then we
>> should not be doing this optimisation?
>
> Yes, that's my opinion as well, but Richard (CCed) seemed to disagree.  In 
> any 
> case, that's certainly not a simplification.
>
>> My question is, how does one express that information in the simplify-rtx.c
>> code? It seems that the PR that optimisation fixed (54457) only cared about
>> DI -> SI truncations, so perhaps we should disable it for conversions
>> between other modes where it's not beneficial altogether?
>
> I can think of 3 possible solutions:
>  - WORD_REGISTER_OPERATIONS,
>  - promote_mode,
>  - optabs.
>
> The 3rd solution seems to be the most straightforward, but this would be the 
> first time that we test optabs from simplify-rtx.c.


I don't think this is the way to go.  AIUI the problem here isn't that
RISC architectures don't have QImode adds as such.  If we were going
to combine insn 6 and insn 7 _in isolation_ then we would have either:

   (zero_extend:SI (subreg:QI (plus:SI (subreg:QI (reg:SI R))
                                       (const_int X))))

before the patch or:

   (zero_extend:SI (plus:QI (reg:QI R)
                            (const_int X)))

after the patch.  And IMO the second form is a simplification over the first.
(Despite being a RISC port, MIPS Octeon does have a pattern for
zero-extending byte addition, so this isn't entirely academic.)

It sounds like the problem is instead that we're relying on combine
to remove redundant zero extensions and that combine's no longer able
to do that when folding insns 6 and 7 into the comparison.  Is that right?
I think it would better to use a dedicated global pass to remove redundant
extensions instead.  IIRC there were various attempts to do that.

IMO combine should just be about instruction selection.  The patch
still seems good to me from that POV.

Thanks,
Richard

Re: Truncate optimisation question

Reply via email to