On Fri, 16 Mar 2012 19:41:56 +0000 (UTC) "Joseph S. Myers" <jos...@codesourcery.com> wrote:
> On Fri, 16 Mar 2012, David Malcolm wrote:
>
> > Proposed outcome
> [...]
> > Current architectural issues
> [...]
>
> Not many people commented on the architectural goals document Diego and I
> posted at <http://gcc.gnu.org/ml/gcc/2011-12/msg00103.html>. Many of your
> ideas seem potentially useful additions to it. I hope Diego will have
> time soon to push more of the goals and development conventions through
> community approval and move them to the main website; as we do this,
> detailed feedback from people interested in these issues is certainly
> welcome.

I wanted many times to comment on these issues, but I am very scared of hurting people. Last time I spoke of GCC modularity (actually the lack of it), some people felt insulted, which is certainly not my aim. So for many months I was too scared to comment on these issues.

So if you feel insulted by any of my comments on these issues, please accept my apologies and recall that I am not a native English speaker, nor was I educated in North American universities or schools [except that I spent 1 year in California, as an 8 year old kid, in 1967; but this is not enough of a cultural influence, and it was really a long time ago]. (I'm sure most people here would have difficulties in participating in mailing lists in French or in Russian, so please be kind to me.)

First, my understanding of modules is that:

* you can name and count the modules of a piece of software
* given a source line, or function, you can decide at a glance to which one module it belongs
* the interface between modules is well documented

I'm sorry to say that, but current GCC (ie 4.7 or today's trunk) is *not* modular. Don't feel injured by that fact. Indeed, GCC is a little less messy than it was a few years ago, but being less messy is not being modular IMHO. And something cannot be "half-modular".

A good example of modular software is the Gnome/GTK "graphical interface" system (ok, it is not a compiler), and probably the KDE/Qt one.
When you start learning it, you get a nice figure http://developer.gnome.org/ and a sort of list of modules http://developer.gnome.org/platform-overview/stable/ ; we have absolutely nothing of that sort for GCC today. We cannot even think of a similar figure for GCC today.

So I would be delighted if GCC were made of modules. But I have no idea of how that can be done. I would like global GCC experts to propose an organization of modules. This means that "global reviewers" type of experts would suggest at first:

* A set of (a dozen or two) modules, each having a *name* and a short *description* (perhaps a single title phrase at first).
* An imprecise mapping of functionalities or features, or preferably of current source files, to modules

Don't expect non-global GCC maintainers to provide such a list; you need to have an overall view of GCC to think of such a list; and very few people have such a global view; I (Basile) certainly don't have it: there are lots of parts of GCC I don't know about! Of course, I may have an opinion on a particular proposed set of modules (but I don't feel knowledgeable enough about all of GCC to propose one).

And making the current GCC code base modular is not easy (it might be impossible), because we cannot decide at a glance to what module a given piece of current code should belong.

An important thing if we want to go modular is to define a road-map and probably to accept the fact that, if GCC 5 is a modular GCC, it will have fewer functionalities, fewer optimizations, less power than the current GCC 4.7: we probably would have to temporarily drop some features or power from GCC to make it modular (otherwise, that would be too gigantic an effort). I have no idea if we can do that (this is why I am sometimes pessimistic); probably most GCC contributors are paid by companies which might not be able to afford that.

I do believe that identifiers in GCC should be organized in such a way that the module they belong to is visible at once.
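To make the naming idea concrete, here is a minimal sketch of identifiers whose owning module is visible at once. Everything in it is an assumption for illustration: the module names `foobar` and `cfg`, the struct fields, and the `describe` function are hypothetical and are not actual GCC declarations.

```cpp
#include <cassert>
#include <string>

// Hypothetical module "foobar": it owns the tree representation
// and is the only place where trees are defined.
namespace foobar {

struct tree {
  std::string code;  // illustrative field only, not GCC's real layout
};

// Illustrative API function exported by the module.
inline std::string describe(const tree& t) {
  return "tree<" + t.code + ">";
}

}  // namespace foobar

// Hypothetical module "cfg": it owns the control-flow types.
namespace cfg {

struct basic_block {
  int index;  // illustrative field only
};

struct edge {
  const basic_block* src;
  const basic_block* dest;
};

}  // namespace cfg
```

With such a convention, a reader who meets `foobar::tree` or `cfg::edge` in any source file knows immediately which module defines it; in C the same effect would come from prefixes such as `foobar_tree`.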
I think that a prefix (à la GTK) or a C++ namespace would be great. In particular, this means that most GCC identifiers should change (which means that any such evolution is not syntactically gradual; it has to be made by huge, but "easy", patches). I would much prefer that tree-s be named foobar::tree in C++ or foobar_tree in C, where foobar is the name of the module defining trees and providing an API to manipulate them. I strongly wish that all GCC names would change in a more organized way. It is a real pain, even with today's tools, to understand where tree-s or edge-s or basic_block-s are defined.

GCC is not only a big bunch of source code, but also a set of "meta-programming" tools, ie a set of generators. And it is good that we have them (I certainly don't think we should remove them, and I am not naive enough to believe or want them to be replaced by tricky templates in C++). Again, all GCC internal code generators should be listed (it is still difficult to find an up-to-date list) and well documented.

My belief is that GCC should aim to be hosted on today's machines. This means that we should notably use: shared libraries (we have almost none of them inside GCC), which can map nicely to modules (like GTK does), and probably a richer system interface than what the language standards provide. Libiberty is not enough: it cannot even be used by plugins. Perhaps we could choose some foundational module (like perhaps Glib or something else) providing small abstractions of system facilities.

We also need GCC to really have plugins (ie give up the idea of a plugin-less GCC, which is useless). And we should have enough modularity so that each module could be extended or perhaps even superseded by an external plugin. If modules are plugins or shared libraries, then extending or adding them is easier, and working on GCC is faster.

We should also define a set of non-modular, ie global, features or traits for GCC.
I am in particular thinking about:

* Naming conventions

* a garbage collector. Even a modular GCC needs some memory management policy (and ref-counting à la GTK, or à la std::shared_ptr, is not enough IMHO inside a compiler because a compiler has much more complex and circular data structures, much less hierarchically organized, than a graphical toolkit has). We should define a memory policy in garbage collector's terms. (And ggc is a very bad GC; it should be improved, not removed.)

* a set of conventions and a module for "dumping" (like our -fdump... today) and for program arguments (ie optimization or other flags). I myself do not have a precise understanding of what -fdump means today... (just a feeling of it).

* a module and set of conventions for diagnostic reporting to the user

* the requirement that each major internal representation (like Gimple, Tree, ...) should be dumpable, serializable in a textual way (perhaps JSON or YAML like), and loadable (so we would have a parser able to construct Gimple in memory from the serialized textual representation in a faithful manner).

* a set of conventions regarding our meta-programming code generators (gengtype, genattr, ...)

* Meta-data about core types and modules is IMHO very important. The GObject introspection machinery of GTK3 made interfacing GTK to external software (in particular, gluing it to interpreters) much much easier. We should probably have an equivalent thing: a machine-level formalization of the main GCC APIs and the major GCC internal data. (This would also help make plugins more robust: a plugin could query a particular GCC installation about the features it has.)

* Some documentation should be generated from source code (with a comment convention), like doxygen or something else. (I hope that the license issue is solved on this.)

My feeling is that making GCC modular is a huge task (and we'll have to accept that some current features would first be dropped to make it feasible).
I have no idea if it is realistic. (But I feel that in the long run, if GCC remains non-modular, it will attract fewer and fewer new developers and will gradually become less and less relevant.)

My first wish is that someone (a "global reviewer" probably) would propose a tentative list of modules to discuss.

I hope I did not hurt anyone. If I did, please accept my apologies. Feel free to ignore this email.

Regards.
--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***