Re: gimple type system

Richard Guenther Sat, 19 Jul 2008 03:07:44 -0700

On Sat, Jul 19, 2008 at 12:45 AM, Kenneth Zadeck
<[EMAIL PROTECTED]> wrote:
> Richard Guenther wrote:
>>
>> On Fri, Jul 18, 2008 at 11:25 PM, Kenneth Zadeck
>> <[EMAIL PROTECTED]> wrote:
>>
>>>
>>> Diego has asked me to look into what would be needed in a gimple type
>>> system.   This is an issue that has been brought to a head because now it
>>> is time to merge types for lto.
>>>
>>> There are a lot of questions that need to be answered before designing
>>> such a system and i would like to handle them one by one, rather than
>>> deal with a thousand threads that go off in a lot of directions.  So for
>>> now, I would like to limit the discussion to a single question:   "what
>>> do we want to do in the middle end of a compiler with a middle end type
>>> system?"
>>>
>>> I have a couple of positive answers and one negative answer.  The point
>>> of this mail is to get a more refined list.  The two positive answers
>>> are:
>>>
>>> 1) Type narrowing.   In an object oriented system, it is generally a big
>>> win to be able to narrow a type as much as possible.   This can be used
>>> to then be able to inline method calls, as well as remove runtime casts
>>> and type checks (this is useless for c).
>>> 2) Inter file type checking.  While this is not an optimization, there
>>> are reasons why it would be useful to discover types that are mismatched
>>> across compilation units.
>>>
>>> The thing that MAY not be useful anymore is the use of a type system for
>>> alias analysis.   I would have hoped that danny and richi and all of the
>>> other people hacking on the alias analysis would have subsumed anything
>>> that one could have gathered from a type based alias analysis.  If I am
>>> wrong, please correct me.
>>>
>>
>> Hah.  You are definitely wrong.
>>
>>
>
> I stand corrected.  I could hope.


Low-hanging improvements are still possible.  That said, I'm still trying to
work out whether we can have any precision at all in the virtual FUD chains.

>>> Anyway, there must be other uses of types in either the existing middle
>>> end or that people have dreams of adding to the middle end in the future.
>>> Now is the time to raise your hand before the design has been started.
>>>
>>
>> We already have a middle-end type-system.  It is unfortunately mostly
>> implicitly defined by the assumptions we make and the information we
>> extract from the types.
>>
>> I don't think you can just go and define a type-system - after all, what
>> would be the result of such a definition?
>>
>>
>
> Here, I think that you are wrong.  You can first start with a series of
> requirements and only keep the information that is necessary to satisfy the
> requirements.    We currently have a type system that is defined by: here is
> what the c front end used to provide for use, and then we grafted on what
> the front ends would provide for use and now we have a mess.
>
> It is my hope to define what we really need and then to define an api
> to get that information and then engineer the underlying information to make
> this all work and I would like to know why we want to merge and what it
> means to merge (except for the obvious space reason) before i go proposing
> some algorithms and datastructures to support this.
>>
>> Instead there are some goals we want to reach:
>> 1) reduce the amount of data related to types (and declarations)
>> during optimization
>> 2) canonicalize "types" if their difference does not matter for the
>> current and further stages of compilation (like we do now say all
>> integral types with the same precision and signedness are "the same")
>>
>> For both of the above a prerequisite to actually really "unify" types
>> (not treating them the same, but actually getting rid of either in favor
>> of another) is to handle debug information properly.  The plan is to emit
>> debug information for the source-level state of types early and refer to
>> that from declarations via a unique identifier.
>>
>> One question would be if we want to gradually lower types (for example
>> flatten structures, unify bit-precision integer types to modes, etc.) or
>> if we just want to do one step after/during gimplification (we have a
>> second step at RTL expansion of course).
>>
>>
>
> I would like to settle this lowering issue after i get an inventory of the
> uses we want to make of the type information.

Ok, I see two^Wthree classes of uses:

 1) debug information - this eventually needs frontend specific details not
yet available in the current middle-end "interface"

 2) type-based alias disambiguation - likewise this either has frontend-specific
parts or touches the type-merging problem

 3) information necessary for optimization purposes.  Trivial stuff like
signedness, precision and kind, but also possibly value-range information.
For aggregates I don't see that we need more than a flat representation
containing offset, size pairs.
It gets interesting if we want to play games with things like virtual functions
or virtual inheritance.  But that's probably interesting for high-level IPA
transforms only, so required information can be dropped at some point.  This
is also information that is much harder to "merge" for inter-language LTO.
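
To make the "flat representation containing offset, size pairs" concrete,
here is a minimal sketch in plain C.  The names struct flat_field and
flatten_outer are made up for illustration and are not existing GCC data
structures or functions; the example aggregate is flattened by hand rather
than by walking trees.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one leaf field; not an existing GCC structure. */
struct flat_field {
    size_t offset;  /* byte offset from the start of the outermost aggregate */
    size_t size;    /* size of the leaf field in bytes */
};

/* Example source-level types with one level of nesting. */
struct inner { short a; int b; };
struct outer { char tag; struct inner in; double d; };

/* Write the flattened layout of struct outer into out[] and return the
   number of leaf fields.  Offsets of nested members are added to the
   offset of the enclosing member, so the nesting itself disappears. */
static size_t flatten_outer(struct flat_field *out, size_t capacity)
{
    size_t n = 0;
    size_t in_base = offsetof(struct outer, in);

    assert(out != NULL && capacity >= 4);
    out[n++] = (struct flat_field){ offsetof(struct outer, tag), sizeof(char) };
    out[n++] = (struct flat_field){ in_base + offsetof(struct inner, a), sizeof(short) };
    out[n++] = (struct flat_field){ in_base + offsetof(struct inner, b), sizeof(int) };
    out[n++] = (struct flat_field){ offsetof(struct outer, d), sizeof(double) };
    return n;
}
```

Note that field names, the nesting and the source-level type names are all
gone after flattening, which is the point: only what an optimizer needs to
reason about memory layout is kept.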

What we need to make use of 3) is a way to tell if two "different" types can
possibly refer to the same data, that is - a type compatibility check.
Currently we provide that for all but aggregates from the middle-end
useless_type_conversion_p function.  For aggregates we defer to the frontend
or use the TYPE_CANONICAL,
TYPE_MAIN_VARIANT and type equality by means of sharing types.  As
TYPE_CANONICAL falls back to structural type comparisons a compatibility
check based on the flat representation of an aggregate should work for all
languages (but of course misses distinctions the language semantics provide).
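
A rough sketch of what such a structural check could look like, assuming
each aggregate has already been reduced to an array of (offset, size)
pairs sorted by offset.  struct flat_field and flat_layout_compatible_p
are hypothetical names for this illustration only; this is not how
useless_type_conversion_p or TYPE_CANONICAL are implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical leaf-field record: byte offset and byte size only. */
struct flat_field {
    size_t offset;
    size_t size;
};

/* Return true if two flattened aggregates have the same number of leaf
   fields and every pair of corresponding fields has the same offset and
   size.  Both arrays are assumed to be sorted by offset. */
static bool flat_layout_compatible_p(const struct flat_field *a, size_t na,
                                     const struct flat_field *b, size_t nb)
{
    assert(a != NULL && b != NULL);
    if (na != nb)
        return false;
    for (size_t i = 0; i < na; i++)
        if (a[i].offset != b[i].offset || a[i].size != b[i].size)
            return false;
    return true;
}
```

With only offset and size recorded, struct { int x; float y; } and
struct { float a; int b; } compare as compatible on a target where int and
float are both four bytes.  That is exactly the kind of distinction the
language semantics would provide and that this check misses.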

So, we should be able to drop the frontend-specific type information (basically
what is encoded in the trees) and build "type tuples".  Of course the thing
blocking this is debug information.  Thus I suggest somebody works on
implementing the "early debug information for types and decls" idea.

Richard.
