Re: Imprecise data flow analysis leads to code bloat

Georg-Johann Lay Fri, 18 Jan 2013 03:48:54 -0800

Richard Biener wrote:
> On Thu, Jan 17, 2013 at 6:04 PM, Georg-Johann Lay wrote:
>> Richard Biener wrote:
>>> On Thu, Jan 17, 2013 at 12:20 PM, Georg-Johann Lay wrote:
>>>> Hi, suppose the following C code:
>>>>
>>>>
>>>> static __inline__ __attribute__((__always_inline__))
>>>> _Fract rbits (const int i)
>>>> {
>>>>     _Fract f;
>>>>     __builtin_memcpy (&f, &i, sizeof (_Fract));
>>>>     return f;
>>>> }
>>>>
>>>> _Fract func (void)
>>>> {
>>>> #if B == 1
>>>>     return rbits (0x1234);
>>>> #elif B == 2
>>>>     return 0.14222r;
>>>> #endif
>>>> }
>>>>
>>>>
>>>> Type-punning idioms like in rbits above are very common in libgcc, for
>>>> example in fixed-bit.c.
>>>>
>>>> In this example, both compilation variants are equivalent (provided int and
>>>> _Fract are 16 bits wide).  The problem with the B=1 variant is that it is
>>>> inefficient:
>>>>
>>>> Variant B=1 needs 2 instructions.
>>> B == 1 shows that fold-const.c native_interpret/encode_expr lack
>>> support for FIXED_POINT_TYPE.
>>>
>>>> Variant B=2 needs 11 instructions, 9 of them are not needed at all.
>> I confused B=1 and B=2.  The inefficient case with 11 instructions is B=1, of
>> course.
>>
>> Would a patch like below be acceptable in the current stage?
> 
> I'd be fine with it (not sure about test coverage).  But please also add
> native_encode_fixed.


Yes, of course.  Just wanted to know if the change is in order in principle.

As far as test cases are concerned: Is, for instance, __xFRACT_EPSILON__ always
represented as 1 if the bits are regarded as an integer?  And what about padded
fixed-point types, as mentioned below?

>> It's only the native_interpret and pretty much like the int case.
>>
>> Difference is that it rejects if the sizes don't match exactly.
> 
> Hmm, yeah.  I'm not sure why the _interpret routines chose to ignore
> tail padding ... was there any special correctness reason you did it
> differently than the int variant?

Well, I am interested in this optimization.  But I am also interested in
learning (by doing) about GCC, which means I am unsure about most corners
of GCC.

In this case I don't know why padding should occur in the first place because
view_convert_expr only allows same-size transformations.

Moreover, I am unsure about padding in a fixed-point type itself.  The mode
definitions mention possible padding in the type, but the compiler only
allows setting IBIT, FBIT and the mode size.

Now suppose a 32-bit, little-endian target and an 8-bit mode like QQ.  The
target wants to pad the QQ in such a way that it is at the high end of the
register, i.e. the QQ is stored as an 8-bit value, but when loaded into a
32-bit register it shall be placed at the high end.

How would one express this in GCC? Obviously, implementing the QQ insns
appropriately is not enough because libgcc needs to know the layout.  Moreover,
it should even work without insns if everything is lowered to int operations by
optabs.
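
As an illustration of the layout meant here, a minimal sketch in plain C of an
8-bit value that occupies the high end of a 32-bit register image instead of
the usual low end.  The function names are hypothetical, and nothing here
reflects how GCC would actually describe such a layout; it only shows what
code operating on the value would have to know.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: register image with the 8-bit value at the high end
   (bits 31..24); the low 24 bits are padding.  */
static uint32_t
qq_to_reg_high (uint8_t raw)
{
  return (uint32_t) raw << 24;
}

/* Hypothetical: extract the 8-bit value from the high end again.  */
static uint8_t
qq_from_reg_high (uint32_t reg)
{
  return (uint8_t) (reg >> 24);
}

/* For comparison: the value at the low end (bits 7..0).  */
static uint32_t
qq_to_reg_low (uint8_t raw)
{
  return (uint32_t) raw;
}

/* Self-check: placing a value at the high end and extracting it again
   must give back the original bits.  */
static void
qq_check_round_trip (uint8_t raw)
{
  assert (qq_from_reg_high (qq_to_reg_high (raw)) == raw);
}
```

For the raw bits 0x12 the two images are 0x12000000 and 0x00000012, so any
code that lowers QQ operations to int operations has to know which of the two
layouts applies.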

>>  A new function
>> is used for low-level construction of a const_fixed_from_double_int.  Isn't
>> there a better way?  I wonder that such functionality is not already there...
> 
> Good question - there are probably more places that could make use of
> this.

Is there a specific reason why native_interpret_int looks like it does?
Historical reasons? Performance?

I'd like to move most of the buffer encode / decode stuff to double_int so that
these internals are inside the double_int.  The int and fixed cases would be
tidied up and be clearer, but it would introduce overhead.

The code would effectively be serialization / deserialization with some special
treatment for endianness.  Isn't such code already present for LTO or PCH or
similar?
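
A minimal sketch in plain C of the kind of serialization meant here: encoding
an integer into a byte buffer and decoding it again, with a flag for the byte
order.  decode_bytes and encode_bytes are hypothetical helpers, not GCC
functions, and uint64_t merely stands in for double_int.  The strict length
check mirrors the "rejects if the sizes don't match exactly" behaviour
described above for the fixed-point case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode SIZE bytes from BUF (of length LEN) into
   *OUT.  Returns 1 on success, 0 if the sizes do not match exactly.  */
static int
decode_bytes (const unsigned char *buf, size_t len, size_t size,
              int big_endian, uint64_t *out)
{
  uint64_t val = 0;
  size_t i;

  assert (buf != NULL && out != NULL);
  if (size == 0 || size > sizeof (uint64_t) || len != size)
    return 0;

  for (i = 0; i < size; i++)
    {
      /* Position in BUF of the byte with significance I (0 = lowest).  */
      size_t pos = big_endian ? size - 1 - i : i;
      val |= (uint64_t) buf[pos] << (8 * i);
    }
  *out = val;
  return 1;
}

/* Hypothetical helper: the inverse of decode_bytes.  */
static int
encode_bytes (uint64_t val, unsigned char *buf, size_t len, size_t size,
              int big_endian)
{
  size_t i;

  assert (buf != NULL);
  if (size == 0 || size > sizeof (uint64_t) || len != size)
    return 0;

  for (i = 0; i < size; i++)
    {
      size_t pos = big_endian ? size - 1 - i : i;
      buf[pos] = (unsigned char) (val >> (8 * i));
    }
  return 1;
}
```

With the bytes { 0x34, 0x12 }, a little-endian decode gives 0x1234 and a
big-endian decode gives 0x3412; a 2-byte buffer decoded as a 4-byte value is
rejected.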


Johann
