Jakub Jelinek wrote:
> On Fri, Dec 09, 2011 at 01:50:37PM +0100, Georg-Johann Lay wrote:
>> No, not OK.
>>
>> This leads to unacceptable code for devices that cannot shift easily like,
>> e.g. AVR. This target can only shift by 1 and shifts with big offsets have
>> to be performed by means of a loop at runtime.
> 
> Andrew's patch only restored what GCC has been doing before.
> If this is too expensive on AVR, you should just arrange in the backend to
> combine that
> ((x>>C)&1) != 0
> into more efficient code sequence, otherwise people who write
> if ((x >> 18) & 1)
>   bar ();
> in the source will get suboptimal code.
> 
>       Jakub

Still don't agree.

If the backend is the right place to do such optimizations, why don't other
targets do it there? Instead, their maintainers make assumptions about costs or
available instructions that don't hold for all targets supported by GCC.

There are hundreds of ways to write such a bit test in C:

* if (a & (1 << bit)) return 1; else return 0;
* if (a & (1 << bit)) return 1; return 0;
* return (a & (1 << bit)) ? 1 : 0;
* return ((a >> bit) & 1) ? 1 : 0;
* a >>= bit; return a & 1;
* return bitfield.bit;
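
To make the point concrete, here is a minimal sketch of three of these
spellings as complete functions. The function names are made up for
illustration, and the sketch assumes a 32-bit value with 0 <= bit < 32. All
three compute the same result, yet each reaches the optimizers in a different
shape.

```c
#include <stdint.h>

/* Hypothetical helpers: each returns 1 if bit `bit` of `a` is set, else 0.
   Assumption: 0 <= bit < 32. */

/* Mask built by shifting 1 left, tested with if/return. */
int bit_test_mask_if(uint32_t a, unsigned bit)
{
    if (a & (UINT32_C(1) << bit))
        return 1;
    return 0;
}

/* Same mask, tested with a ternary expression. */
int bit_test_mask_ternary(uint32_t a, unsigned bit)
{
    return (a & (UINT32_C(1) << bit)) ? 1 : 0;
}

/* Value shifted right, lowest bit masked out. */
int bit_test_shift_and(uint32_t a, unsigned bit)
{
    return (int)((a >> bit) & 1u);
}
```

With a constant bit number, the mask forms need no shift at runtime, while the
last form asks for a shift of the value itself.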

I don't think cluttering up the backend with a myriad of combine patterns is
the right way to approach this. Note that the above are just simple examples;
with more context or other operators like !, |, & things get even worse.

Just imagine the various ways in C to extract bit_x from val1 and insert it as
bit_y in val2: you can use a ternary expression, shift-and-mask-and-not-or, an
if-else construct, bitfields, or a combination of them.
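
As a rough sketch of what that looks like, here are three spellings of the
same bit move on 8-bit values. The function names and the uint8_t width are
assumptions chosen for illustration, with x and y assumed to be in 0..7. All
three return val2 with bit y replaced by bit x of val1.

```c
#include <stdint.h>

/* Hypothetical helpers. Assumption: 0 <= x, y < 8. */

/* if-else construct: set or clear bit y depending on bit x. */
uint8_t move_bit_if(uint8_t val1, unsigned x, uint8_t val2, unsigned y)
{
    if (val1 & (1u << x))
        val2 |= (uint8_t)(1u << y);
    else
        val2 &= (uint8_t)~(1u << y);
    return val2;
}

/* shift-and-mask-and-not-or: isolate the bit, clear the slot, merge. */
uint8_t move_bit_mask(uint8_t val1, unsigned x, uint8_t val2, unsigned y)
{
    unsigned bit = (val1 >> x) & 1u;
    return (uint8_t)((val2 & ~(1u << y)) | (bit << y));
}

/* ternary expression selecting between the set and cleared results. */
uint8_t move_bit_ternary(uint8_t val1, unsigned x, uint8_t val2, unsigned y)
{
    return (val1 & (1u << x)) ? (uint8_t)(val2 | (1u << y))
                              : (uint8_t)(val2 & ~(1u << y));
}
```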

The right way to handle this is:

1) Canonicalize operations like
 - testing a bit
 - moving a bit
 - ...
 and do so early, so that the tree optimizers know what to look for.

2) Upon tree -> RTL lowering, look at the target's preferred way to
   accomplish this. The preferred way can be deduced from RTX costs.

This is pretty straightforward, and I don't understand the problem with
- canonicalizing stuff
- optimizing on the canonicalized representation
- lowering the canonicalized representation to the best RTL
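
The lowering step could be pictured roughly like the toy model below. This is
not GCC code and uses none of GCC's interfaces: the struct, the cost numbers
and all names are invented for illustration. It only shows the idea of picking
the form of a bit test from per-target costs.

```c
/* Toy model, not GCC internals. All names and numbers are hypothetical. */

enum bit_test_form {
    BIT_TEST_SHIFT_AND,     /* ((x >> bit) & 1) != 0 */
    BIT_TEST_MASK_COMPARE   /* (x & (1 << bit)) != 0 */
};

struct target_costs {
    int shift_first;    /* cost of shifting by one position */
    int shift_per_bit;  /* extra cost per further position; 0 if any
                           shift count costs the same */
    int and_cost;       /* cost of a bitwise AND */
    int compare_cost;   /* cost of comparing against zero */
};

/* Example targets with made-up costs. */
const struct target_costs barrel_shifter_target = { 1, 0, 1, 1 };
const struct target_costs single_bit_shift_target = { 1, 1, 1, 1 };

/* Cost of shifting by n positions on target t. */
int shift_cost(const struct target_costs *t, unsigned n)
{
    if (n == 0)
        return 0;
    return t->shift_first + (int)(n - 1) * t->shift_per_bit;
}

/* Pick the cheaper form for testing bit `bit`; ties go to shift-and. */
enum bit_test_form choose_bit_test(const struct target_costs *t, unsigned bit)
{
    int via_shift = shift_cost(t, bit) + t->and_cost;
    int via_mask = t->and_cost + t->compare_cost;
    return via_shift <= via_mask ? BIT_TEST_SHIFT_AND : BIT_TEST_MASK_COMPARE;
}
```

With these made-up numbers, testing bit 18 selects the shift form on the first
target and the mask form on the second, which is the AVR-like case where a
shift by 18 costs a loop.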

Johann
