On Tue, 2012-04-03 at 15:23 +0200, Richard Guenther wrote: > On Tue, Apr 3, 2012 at 12:03 PM, Richard Guenther > <richard.guent...@gmail.com> wrote: > > On Mon, Apr 2, 2012 at 7:21 PM, David Malcolm <dmalc...@redhat.com> wrote: > >> I wrote a script and ported my proposed API for GCC plugins from my > >> CamelCase naming convention to an underscore_based_convention (and > >> manually fixed up things in a few places also). > >> > >> The result compiles and passes the full test suite for the Python > >> plugin; that said, I'm still breaking the encapsulation in quite a few > >> places (hey, this is an experimental prototype). > >> > >> You can see the latest version of it within the "proposed-plugin-api" > >> branch of the Python plugin here: > >> http://git.fedorahosted.org/git/?p=gcc-python-plugin.git;a=shortlog;h=refs/heads/proposed-plugin-api > >> > >> within the "proposed-plugin-api" subdirectory. > > > > Hmm, how do I get it? I did > > > > git clone http://git.fedorahosted.org/git/proposed-plugin-api > > > > but there is nothing in gcc-python-plugin/. And > > > > git checkout proposed-plugin-api > > > > says I'm already there ...? > > Meanwhile the directory magically appeared (heh ...).
[The ways of git are something of a mystery to me: 95% of the time it's the best revision control system I've ever used, but 5% of the time the most obtuse] > Overall it looks good Thanks for taking a look. > - but it seems to leak GCC headers into the > plugin API (via gcc-plugin.h and input.h inclusion). Thus, it > lacks separating the plugin API headers from the plugin API implementation > headers? That's true. The big information "leak" happens inside gcc-semiprivate-types.h, which defines the various structs that act like pointers, each with a decl like this: struct gcc_something { some_private_gcc_pointer_type inner; }; It would be possible to make this more opaque like this: struct gcc_something { struct some_private_gcc_struct *inner; }; given that you then don't need a full definition of that inner struct visible to users. Though location_t is leaked, and in this approach, there isn't a way around that, I think. > Or maybe I'm not looking at the right place. > I also dislike the use of GCC_PUBLIC_API, etc. - everything in > the plugin API headers should be obviously public - and the implementation > detail should be an implementation detail that should not need to care. I added that macro based on the example of other plugin/embedding systems I've seen (e.g. Python), where it's handy to make that macro be non-trivial on some platforms. See e.g. CPython's pyport.h: http://hg.python.org/cpython/file/9599f091faa6/Include/pyport.h where the macros PyAPI_FUNC and PyAPI_DATA have a 44-line definition, handling various different compatibility cases. For example, GCC_PRIVATE_API could have some magic linker flag that lets us automatically strip out those symbols so that they're not visible externally after the compiler has been linked. > >> If this landed e.g. in GCC 4.8, would this be usable by other plugins? > > For sure. I'd say get the copyright stuff sorted out and start pushing things > into the main GCC repository - where the obvious starting point is to > get the makefile, configure and install parts correct. > > I'd concentrate on providing enough API for introspection, like simple > things, counting basic-blocks, counting calls, etc. I'm not sure we > want to expose gcc_gimple_walk_tree or gcc_gimple_print (or > the gcc_printers for a start) - the latter would something that the > python plugin > would provide, resorting to introspecting the stmt itself. FWIW the Python plugin already heavily uses gcc's pretty-printer code, so that *is* something I'd want to wrap (but it's fairly easy to do so). > I also wonder about gcc_gimple_phi_upcast - why would you specialize > PHI nodes but not any others? I'd have exposed gcc_gimple_code. > In general the plugin API should provide exactly one (and the most flexible) > way to do a thing (thus, not have gcc_gimple_assign_single_p) and > the high-level consumers like the python plugin should provide > nice-to-use interfaces. This touches on the biggest thing that's missing in the API: the multitude of tree types, gimple statement types, and rtl types. All I added were the "base classes", and gimple-phi is the only instance so far in the API of a subclass. I would like the API to expose these (my plugin uses them to good effect), but doing it in C is likely to be messy: lots of upcasting and downcasting functions. An alternative approach might be to bite the bullet and go to C++; maybe something like this (caveat: I last did any C++ about a decade ago, no idea if the following compiles): // Everything exposed in the public headers are pure virtual functions // leading to abstract classes: no public data namespace gcc { // Abstract base class for objects managed by GCC's garbage-collector class gcmanaged { public: virtual void mark_in_use() const = 0; }; class location : public gcmanaged { // it isn't gcmanaged yet, but might become so public: virtual std::string get_filename() const = 0; virtual int get_line() const = 0; virtual int get_column() const = 0; }; namespace gimple { // enum to assist in RTTI by non-C++ code: enum code { INVALID_GIMPLE_CODE = -1, GIMPLE_NOP = 0, GIMPLE_ASSIGNMENT, GIMPLE_SWITCH, //etc; don't have to expose all of the regular gimple codes. NUM_CODES, }; class statement : public gcmanaged { public: virtual location *get_location() const = 0; virtual enum code get_code() const = 0; }; class nop : public statement { }; class phi : public statement { public: virtual tree::ssaname* get_lhs() const = 0; virtual std::list<tree::expr*, cfg::edge*> get_rhs() const = 0; // or whatever }; }; namespace tree { class declaration : public gcmanaged { public: virtual std::string get_name() const = 0; virtual location* get_location() const = 0; }; class funcdecl :: public declaration { public: virtual function* get_function() const = 0; virtual std::list<parmdecl*> get_arguments() const = 0; virtual resultdecl* get_result() const = 0; // etc: accessor functions }; class type : public gcmanaged { // etc... }; class integertype :: public type { // etc... }; // does there actually need to be a "tree" base class? I think // the tcc_codes express the base classes, at a first approximation // could even lose the "tree" concept altogether }; }; [though I'd greatly prefer to capitalize the classnames (sigh) :) ] The implementation would have entirely separate (private) headers, with something like: class location_impl : public gcc::location { private: location_t loc; public: // ctor: location_impl(location_t loc) : loc(loc) {} // implement the public API: void mark_in_use() const { // empty: location_t currently isn't gc managed } std::string get_filename() const; int get_line() const; int get_column() const; }; class phi_impl : public gcc::gimple::phi { private: gimple stmt; // of correct subtype public: // gcmanaged: void mark_in_use() const; // ctor: phi(gimple stmt); // accessors: location *get_location() const; enum code get_code() const; tree::ssaname *get_lhs() const; std::list<tree::expr*, cfg::edge*> get_rhs() const; }; That way the implementation is *entirely* hidden, with entirely separate headers (though during prototyping I'd need an escape hatch). But it does mean that the API is handing off pointers-to-pointers, rather than just pointers (allocating lots of tiny wrapper objects, like the ones above, so that the caller doesn't know the size, and so there can be a vtable). Does this sound sane? Long-term, perhaps these impl classes could perhaps *become* the implementations - everything's hidden. Thanks again for looking at this Dave