To take full advantage of the conversion to C++, we will need to use
single inheritance in some of our garbage collected structures.  To
that end, we need to make gengtype understand single inheritance.
Here are my thoughts on how to make that happen.

There are two major sections, one for non-polymorphic classes and one
for polymorphic classes.  Each section has a series of subsection
pairs describing a new grammar and its implementation.


NON-POLYMORPHIC CLASSES

The classes we care about are polymorphic in the general sense,
but not in the language sense, because they don't have virtual
members or bases.  That is, they use ad-hoc polymorphism via an enum
discriminator to control casting.

enum E { EA, EB, EC, ED, EE };


GRAMMAR

The root class must have a discriminator.

class ATYPE GTY ((desc ("%h.type")))
{ public: enum E type; other *pa; ... };

No derived class may have a discriminator.

Every allocatable class must have a tag.

class BTYPE : GTY ((tag ("EB"))) public ATYPE
{ public: other *pb; ... };

The root class may be allocatable, and so may have a tag.

class ATYPE GTY ((desc ("%h.type"), tag ("EA")))
{ public: enum E type; other *pa; ... };

Two derived classes may share a base, even indirectly, but be
otherwise unrelated.

class CTYPE : GTY ((tag ("EC"))) public BTYPE { ... };
class DTYPE : GTY ((tag ("ED"))) public BTYPE { ... };
class ETYPE : GTY ((tag ("EE"))) public ATYPE { ... };

Note the difference: C and D are siblings, but D and E are siblings
once removed (i.e. nephew and uncle).

Private and protected fields are not supported, at least for those
fields that gengtype cares about.


IMPLEMENTATION

We can probably hack into the existing generation for unions.
However, there is a naming problem.  The current gcc union approach
creates a "top type" for every class hierarchy.  A single-inheritance
class hierarchy has no "top type" and so we cannot use it to name the
marker.  We can use the base type (aka "bottom type") instead.

That is, generate gt_ggc_mx_ATYPE full of a switch statement covering
all known derivations.  Any gt_ggc_mx names for derived types would
be simply aliased to the base.
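
As a rough illustration of the dispatcher described above, here is a
hand-written sketch of what the generated marker might look like.  It
is not gengtype output.  The empty GTY macro only hides the
annotations from the compiler, and mark_other, the marked vector, and
the field pc are stand-ins invented for this example.

```cpp
#include <cassert>
#include <vector>

// Assumption: GTY expands to nothing for the compiler; only gengtype reads it.
#define GTY(x)

enum E { EA, EB, EC };
struct other { int id; };

// Hypothetical stand-ins for the collector's marking hook (not real GGC API).
static std::vector<const other *> marked;
static void mark_other (const other *p) { if (p) marked.push_back (p); }

class ATYPE GTY ((desc ("%h.type"), tag ("EA")))
{ public: enum E type; other *pa; };

class BTYPE : GTY ((tag ("EB"))) public ATYPE
{ public: other *pb; };

class CTYPE : GTY ((tag ("EC"))) public BTYPE
{ public: other *pc; };

// One marker, named after the root ("bottom") type, switching on the
// discriminator to reach the fields of every known derivation.
void gt_ggc_mx_ATYPE (void *x_p)
{
  ATYPE *x = static_cast<ATYPE *> (x_p);
  if (x == nullptr)
    return;
  mark_other (x->pa);
  switch (x->type)
    {
    case EA:
      break;
    case EB:
      mark_other (static_cast<BTYPE *> (x)->pb);
      break;
    case EC:
      mark_other (static_cast<BTYPE *> (x)->pb);
      mark_other (static_cast<CTYPE *> (x)->pc);
      break;
    default:
      assert (false && "unknown tag");
    }
}

// The marker name for a derived type is just an alias for the base marker.
inline void gt_ggc_mx_BTYPE (void *x_p)
{
  gt_ggc_mx_ATYPE (static_cast<ATYPE *> (static_cast<BTYPE *> (x_p)));
}
```

Calling gt_ggc_mx_ATYPE on a CTYPE object whose type field is EC marks
pa, pb and pc in that order; calling gt_ggc_mx_BTYPE just forwards to
the same dispatcher.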


GRAMMAR

Add support for protected and private data fields.

class ATYPE GTY ((desc ("%h.type")))
{ protected: enum E type; other *pa; ... };

However, gt_ggc_mx_ATYPE no longer has access to the fields.

We can make the user declare gt_ggc_mx_ATYPE a friend of ATYPE and
all derived classes.  That extra declaration in every class is a mild
burden, as it requires following the chain of bases to determine the
root class name and then keying in the declaration.

class ATYPE GTY ((desc ("%h.type")))
{ friend void gt_ggc_mx_ATYPE(void*);
  protected: enum E type; other *pa; ... };


IMPLEMENTATION

No new implementation is required.


GRAMMAR

Making gt_ggc_mx_ATYPE a friend of all derived classes is possible
but far from ideal because the name is nonobvious and refers to
implementation details that should be hidden.

Instead, have the class itself declare, but not define, a function to
mark its members.  Gengtype will create the function.

The gt_ggc_mx_ATYPE function need not be declared a friend as long as
the tag itself is public.  Otherwise, for the moment, we still need
the friend declaration.

class BTYPE : GTY ((tag ("EB"))) public ATYPE
{ friend void gt_ggc_mx_ATYPE(void*);
  public: void ggc_marker() const;
  private: other *pb; ... };


IMPLEMENTATION

The mark function needs to be split into parts.  One part contains the
dispatcher; it retains the current name, void gt_ggc_mx_ATYPE(void*).
The remaining parts are member functions that mark the members of an
allocatable class.  Gengtype fills out both functions.  The marker
function for a class will call the marker function for its base.
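
A minimal sketch of that split, written by hand to show the shape of
the two parts.  The GTY annotations are left out for brevity, the
discriminator is kept public so no friend declaration is needed, and
the constructors, mark_other and the marked vector are hypothetical
helpers that exist only to make the example self-contained.

```cpp
#include <cassert>
#include <vector>

enum E { EA, EB };
struct other { int id; };

// Hypothetical stand-ins for the collector's marking hook (not real GGC API).
static std::vector<const other *> marked;
static void mark_other (const other *p) { if (p) marked.push_back (p); }

class ATYPE
{ public: ATYPE (enum E t, other *a) : type (t), pa (a) {}
          enum E type;                 // public tag: dispatcher needs no friend
          void ggc_marker () const;    // declared by the user
  protected: other *pa; };

class BTYPE : public ATYPE
{ public: BTYPE (other *a, other *b) : ATYPE (EB, a), pb (b) {}
          void ggc_marker () const;    // declared by the user
  private: other *pb; };

// Part one: member functions that gengtype would fill in.  Each one
// marks its own members after calling the marker of its base.
void ATYPE::ggc_marker () const { mark_other (pa); }
void BTYPE::ggc_marker () const { ATYPE::ggc_marker (); mark_other (pb); }

// Part two: the dispatcher, which keeps the existing name and only
// selects the right member marker from the discriminator.
void gt_ggc_mx_ATYPE (void *x_p)
{
  const ATYPE *x = static_cast<const ATYPE *> (x_p);
  if (x == nullptr)
    return;
  switch (x->type)
    {
    case EA:
      x->ggc_marker ();
      break;
    case EB:
      static_cast<const BTYPE *> (x)->ggc_marker ();
      break;
    default:
      assert (false && "unknown tag");
    }
}
```

The dispatcher never touches pa or pb directly, so only the tag has to
be visible to it.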


GRAMMAR

Support adding a second discriminator.  This support is not for
multiple inheritance, but for single inheritance when a second
discriminator is used to further refine it.  Look at struct
tree_omp_clause.  It contains a sub union.  We can represent the
hierarchy like:

struct tree_omp_clause : tree_common {
  location_t locus;
  enum omp_clause_code code;
};

struct tree_omp_default_clause : tree_omp_clause {
  enum omp_clause_default_kind default_kind;
};

struct tree_omp_schedule_clause : tree_omp_clause {
  enum omp_clause_schedule_kind schedule_kind;
};

struct tree_omp_reduction_clause : tree_omp_clause {
  enum tree_code reduction_code;
};

We use TREE_CODE to understand that we have at least a tree_omp_clause
and then we use tree_omp_clause.code to distinguish these last three.

Another possible case is tree_type_symtab inside tree_type_common.

The syntax would be something like the following.

enum F { F1, F2, F3, F4, F5 };

class CTYPE GTY ((desc ("%h.kind"), tag ("F1")))
: GTY ((tag ("EC"))) public BTYPE
{ public: enum F kind; something *pq; ... };

class FTYPE : GTY ((tag ("F2"))) public CTYPE { ... };


IMPLEMENTATION

The dispatcher will have nested switch statements.
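
Roughly, the nesting might look like the following hand-written
sketch.  The GTY annotations are omitted, 'other' stands in for the
field types, and mark_other, the marked vector, and the field pf of
FTYPE are invented for the example.

```cpp
#include <cassert>
#include <vector>

enum E { EA, EB, EC };
enum F { F1, F2 };
struct other { int id; };

// Hypothetical stand-ins for the collector's marking hook (not real GGC API).
static std::vector<const other *> marked;
static void mark_other (const other *p) { if (p) marked.push_back (p); }

class ATYPE { public: enum E type; other *pa; };
class BTYPE : public ATYPE { public: other *pb; };
// CTYPE is tagged EC in the first discriminator and introduces a second one.
class CTYPE : public BTYPE { public: enum F kind; other *pq; };
// FTYPE is tagged F2 in the second discriminator.
class FTYPE : public CTYPE { public: other *pf; };

void gt_ggc_mx_ATYPE (void *x_p)
{
  ATYPE *x = static_cast<ATYPE *> (x_p);
  if (x == nullptr)
    return;
  mark_other (x->pa);
  switch (x->type)                      // outer switch: first discriminator
    {
    case EA:
      break;
    case EB:
      mark_other (static_cast<BTYPE *> (x)->pb);
      break;
    case EC:
      {
        CTYPE *c = static_cast<CTYPE *> (x);
        mark_other (c->pb);
        mark_other (c->pq);
        switch (c->kind)                // inner switch: second discriminator
          {
          case F1:
            break;
          case F2:
            mark_other (static_cast<FTYPE *> (c)->pf);
            break;
          default:
            assert (false && "unknown inner tag");
          }
        break;
      }
    default:
      assert (false && "unknown outer tag");
    }
}
```

The inner switch is only reached once the outer discriminator has
established that the object is at least a CTYPE.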


GRAMMAR

Use an accessor for the tag information.  The base class provides a
public 'getter' for the tag.  Gengtype uses that getter.  No friend
declaration is required.  No implementation details are exposed.

class ATYPE GTY ((desc ("%h.get_type()")))
{ public: enum E get_type() { return type; }
  protected: enum E type; other *pa; ... };


IMPLEMENTATION

Perhaps nothing needs to be done.
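
The reason little or nothing may be needed is that the desc string is
an expression that gets pasted into the switch.  The sketch below
assumes that %h is replaced by the object expression (*x); that
substitution, the constructors, mark_other and the marked vector are
assumptions of this example, and the getter is made const here so that
it can be called through a const pointer.

```cpp
#include <cassert>
#include <vector>

enum E { EA, EB };
struct other { int id; };

// Hypothetical stand-ins for the collector's marking hook (not real GGC API).
static std::vector<const other *> marked;
static void mark_other (const other *p) { if (p) marked.push_back (p); }

class ATYPE
{ public: ATYPE (enum E t, other *a) : type (t), pa (a) {}
          enum E get_type () const { return type; }   // public getter
          void ggc_marker () const { mark_other (pa); }
  protected: enum E type; other *pa; };

class BTYPE : public ATYPE
{ public: BTYPE (other *a, other *b) : ATYPE (EB, a), pb (b) {}
          void ggc_marker () const { ATYPE::ggc_marker (); mark_other (pb); }
  private: other *pb; };

// The dispatcher only uses public members, so no friend is required.
void gt_ggc_mx_ATYPE (void *x_p)
{
  const ATYPE *x = static_cast<const ATYPE *> (x_p);
  if (x == nullptr)
    return;
  switch ((*x).get_type ())   // from desc ("%h.get_type()"), %h -> (*x)
    {
    case EA:
      x->ggc_marker ();
      break;
    case EB:
      static_cast<const BTYPE *> (x)->ggc_marker ();
      break;
    default:
      assert (false && "unknown tag");
    }
}
```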



POLYMORPHIC CLASSES

For polymorphic classes, we can and should use the virtual table to
dispatch the marker.  No enum discriminator is required.


GRAMMAR

The user declares a virtual mark method.

class ATYPE GTY ((virtual))
{ public: virtual void ggc_marker() const; ... };


IMPLEMENTATION

Gengtype fills in the body of the method with the marking of members.
It also makes a non-virtual call to the markers for the bases.  (The
non-virtual call is needed to prevent the marking of base classes from
immediately bouncing back to the derived class.)
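
A small sketch of the bodies gengtype might generate in the
polymorphic case.  The GTY annotation is omitted, and the
constructors, mark_other, the marked vector and the entry point
collector_mark are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <vector>

struct other { int id; };

// Hypothetical stand-ins for the collector's marking hook (not real GGC API).
static std::vector<const other *> marked;
static void mark_other (const other *p) { if (p) marked.push_back (p); }

class ATYPE
{ public: explicit ATYPE (other *a) : pa (a) {}
          virtual void ggc_marker () const;   // declared by the user
  private: other *pa; };

class BTYPE : public ATYPE
{ public: BTYPE (other *a, other *b) : ATYPE (a), pb (b) {}
          void ggc_marker () const override;  // declared by the user
  private: other *pb; };

// Bodies that gengtype would fill in.
void ATYPE::ggc_marker () const { mark_other (pa); }

void BTYPE::ggc_marker () const
{
  // The qualified name makes this a non-virtual call.  A plain
  // ggc_marker () here would dispatch back to BTYPE and recurse.
  ATYPE::ggc_marker ();
  mark_other (pb);
}

// Hypothetical collector entry point: the virtual table picks the
// most-derived marker, so no discriminator or switch is needed.
void collector_mark (const ATYPE *x)
{
  if (x != nullptr)
    x->ggc_marker ();
}
```

Marking a BTYPE through an ATYPE pointer reaches BTYPE::ggc_marker,
which marks the base's pa first and then its own pb.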


ISSUE

This approach will not work with the standard library because it does
not want to expose declarations for GCC's collector.  I begin to
suspect that any solution using the standard library will require a
first-class implementation within the compiler itself, as the compiler
has full access to fields and can bypass any privacy.

-- 
Lawrence Crowl
