My ongoing work to implement the multilib selection changes described at <http://gcc.gnu.org/ml/gcc/2010-01/msg00063.html> will in due course require option-related hooks to be shared between the driver and the compilers proper (cc1 etc.). As we do not currently have a hooks system in the driver, it seems appropriate to consider what design we want for hooks extended into this part of the compiler - and, more generally, what we would like the system for target configuration to look like.
(In this message I only consider designs for C. It is certainly possible that in future a native C++ approach with less heavy use of macros may be used, but I think the same general issues arise.)

The basic design for target hooks was implemented by Neil Booth in June 2001 following a discussion <http://gcc.gnu.org/ml/gcc/2001-06/subjects.html#00932> I started (this was not the first time the principle of moving away from macros had come up), with langhooks following later that year, the targhooks.c system for incremental transition of individual macros to hooks being added in 2003, and automatic generation of much of the boilerplate code from a single .def file being added much more recently by Joern Rennecke; we now have about 300 target hooks. The point that such functions should be linkable into the driver came up in the second message of that thread.

The motivations for moving from macros to hooks remain as discussed then: cleaner design, better-specified interfaces with prototypes (so eliminating one cause of warnings building target-independent files only when configured for some targets, or errors from code conditioned with #if) and potentially the ability to swap out target structures for multi-target compiler binaries.

In general, the existing target hooks are defined through #define within the target .c file, before the targetm variable is defined with its initializer. This works well for hooks that largely depend on the target architecture alone, not on the target OS. There are some exceptions where OS-dependent hooks are defined in the .h files listed in $tm_file, and some cases where targets modify hooks at runtime (this should generally be for hooks that are integer or string constants rather than functions, though I have not checked whether any function hooks are also being modified at runtime).
Such cases would make a multi-target compiler (with each source file compiled only once, but multiple target OSes for a single architecture) a bit harder; targetm would need to move to its own .c file, with only that .c file being compiled separately for each target.

Some target macros - which should become hooks in some form - are much more dependent on the target OS. This applies, in particular, to all the various specs used by the driver. Given this, my inclination is that the driver's targetm structure should be defined in its own .c file, with the macros providing its initializer generally coming from .h files. This isn't very far distant from the present system for specs (where the macros initialize variables and otherwise the variables are generally what's used, potentially being modified at runtime when a specs file is read).

The advantage as I see it would come if the target .h files could be split up, with the driver-only defines coming from a separate set of headers from those used in the core compiler. The ones used in the core compiler would likely be simpler and have fewer OS dependencies. It might make it easier to move towards using these headers consistently in the order given in config.gcc as the preferred order - with the aim being to avoid architecture-specific cases in config.gcc mentioning architecture-independent headers at all. (A case statement over target architectures should deal with architecture-specific configuration; one over OSes should deal with OS-specific configuration; one over pairs should deal with the combination; and target-independent code should put together the lists from each of those case statements in the standard order.)

This separation of configuration for different purposes is closely related to the issues with hooks for option handling. Target .h files contain configuration used in various ways, with some macros used in more than one way:

* Definitions for use in the target's own .c files.
* Definitions for use in code in the target's .md files.
* Definitions for use in code in the core compiler (middle-end and front ends). tm_p.h has prototypes for use in such code, and an intermediate goal in converting macros to hooks might be to convert every macro that uses a function on any target, so that tm_p.h is only included in the target's .c files and in the generated files containing code from the .md files.
* Definitions for use in the driver.
* Definitions for use in collect2. These overlap with those used in the driver (e.g. MD_EXEC_PREFIX), and probably with those used elsewhere (e.g. OBJECT_FORMAT_*). My inclination is to say that collect2 and lto-wrapper are really part of the driver although they happen to be in separate executables. As such, and in order to keep down the number of different sets of configuration information needed, at the point where collect2 starts using hooks I'd be inclined to have it use the driver's set of hooks, and if any of those hooks end up being functions needing to link with other driver code, the relevant code should go in libdriver.a or similar, and collect2 should be made to use the shared diagnostics code as needed. (lto-wrapper doesn't use any target macros.)
* Definitions for use in gcov. I'm not sure what if any such definitions there are, but gcov.c includes tm.h. (java/jvgenmain.c also includes tm.h, but at least has a FIXME indicating it's for a particular declaration that should move somewhere else.)
* Definitions for target libraries. This means libgcc, crt*.o, libgcov, libobjc, and some of the ada/*.c files that go in Ada's runtime. Nathan Froyd has been working on these. The aim is to completely separate the host-side tm.h from target-side configuration. In most cases compiler predefined macros can be used; in some cases, .h files in the toplevel libgcc/config (and selected through libgcc/config.host) will be used.
Once target libraries no longer use host-side headers, all the appearances of the runtime library license exception in such headers should go away. Various macros used to indicate whether code is being built for the host or the target, such as I listed in <http://gcc.gnu.org/ml/gcc-patches/2010-10/msg00947.html>, could also go away.
* Definitions that are confused about whether they describe host or target properties. See what I said in <http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01007.html> about what would happen with GCC_DRIVER_HOST_INITIALIZATION if you tried to build a cross compiler from DJGPP to xtensa-elf. As another example, consider GET_ENVIRONMENT. The default value is in defaults.h, as if it were a target macro, but surely it is something that depends on the host not the target. If it belongs anywhere, it's probably system.h, but since there are no non-default definitions and various direct calls of getenv, this macro should probably just go away altogether.

My proposal is that we aim to arrange the main classes of definitions thus:

* If something is used only in the target's own host-side files (including .md files), it's no problem for it to be a macro. There should be headers for inclusion in these files that are not included elsewhere. (Multi-target compilers would eventually require things depending on the OS to be separated out of such headers into an architecture-specific target hooks structure.)
* Definitions for code built for the target should be completely separate from those for code built for the host, and located entirely in separate toplevel directories such as libgcc/ when they aren't built-in macros.
* Host-side configuration may be split into that for the core compilers, that for the driver, and that which is shared between them (with a limited amount of front-end-specific hooks as well).
By avoiding separate headers and structures for every combination of executables - by treating collect2 as part of the driver, in particular - things should be kept comparatively maintainable, and we can reasonably have a targetm_common without worrying about multiple possible definitions of "common".
* Hooks for the driver would work based on macros being defined in .h files. Those .h files would preferably be separate from the existing tm.h, but at first they might be the same, and even if separated out at an early stage would likely include the existing tm.h to facilitate those specs that depend on other macros for such things as determining whether 32-bit or 64-bit is the default. (Later parts of my multilibs changes, by allowing specs to test features rather than option text, would reduce the need for such compile-time conditionals in specs and so maybe reduce the need for tm.h to be included when defining them.)
* Hooks shared between the driver and the core compilers would probably work as the present target hooks work - an architecture-specific .c file containing options-related functions and a targetm_common initializer. Initially this file might include the existing tm.h, with the aim eventually being just to include any headers for target facilities shared between the driver and the core compiler.
* It would be reasonable, although not required for any of the above, to put driver code in a driver/ subdirectory, config/ code only used in the driver in driver/config/, and code shared between the driver and the core compiler in some subdirectory such as common/ (with its own config/ subdirectory).

Any comments on the above? The only part I would be proposing to implement myself at present would be the hooks structure for hooks shared between the driver and the core compilers, at the point where I need it in the multilibs changes if not already implemented by then.
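
For illustration only, here is a minimal sketch of how a shared hooks structure of this kind might look, following the existing pattern in which hook macros are #defined before the structure's initializer is used. Apart from the name targetm_common, everything here is hypothetical: the structure layout, the two hooks, the macro names and the example option handling are assumptions made for the sketch, not actual or proposed GCC code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical structure of hooks shared by the driver and the
   compilers proper.  The fields are illustrative only.  */
struct target_common_hooks
{
  /* Return true if the target recognizes option text OPT.  */
  bool (*handle_option) (const char *opt);
  /* A constant-valued hook: default pointer width in bits.  */
  int default_pointer_bits;
};

/* Generic default, standing in for what a targhooks.c-style file
   might provide.  */
bool
default_handle_option (const char *opt)
{
  (void) opt;
  return false;
}

/* Default macro values, as a shared header of defaults might
   define them (hypothetical names).  */
#define TARGET_COMMON_HANDLE_OPTION default_handle_option
#define TARGET_COMMON_DEFAULT_POINTER_BITS 32

/* From here on: what an architecture-specific .c file might hold.  */

static bool
example_handle_option (const char *opt)
{
  return strcmp (opt, "-m32") == 0 || strcmp (opt, "-m64") == 0;
}

/* Override the defaults before the initializer macro is expanded.  */
#undef TARGET_COMMON_HANDLE_OPTION
#define TARGET_COMMON_HANDLE_OPTION example_handle_option
#undef TARGET_COMMON_DEFAULT_POINTER_BITS
#define TARGET_COMMON_DEFAULT_POINTER_BITS 64

#define TARGET_COMMON_INITIALIZER \
  { TARGET_COMMON_HANDLE_OPTION, TARGET_COMMON_DEFAULT_POINTER_BITS }

struct target_common_hooks targetm_common = TARGET_COMMON_INITIALIZER;

/* Target-independent caller, usable from either the driver or a
   compiler proper since it only goes through the structure.  */
bool
option_is_target_specific (const char *opt)
{
  assert (targetm_common.handle_option != NULL);
  return targetm_common.handle_option (opt);
}
```

Because callers only go through the structure, the file defining targetm_common is the only one that would need to be compiled per target in this sketch; the driver and the compilers proper could both link against it.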
What are people's views on the extent to which cleanup / hook conversion patches - not specifically the above, but the existing ongoing hook conversion and cleanups in general - should go in during Stage 3? Such patches do inevitably have some risk of breaking things. (If we fixed all the problems building the compiler with --enable-werror-always for any target, using a recent native trunk compiler to build the cross compiler, that would at least provide a more reliable way of detecting some problems with changes affecting many targets. See bug 44756's dependencies for a not necessarily exhaustive list of problems building this way.)

-- 
Joseph S. Myers
jos...@codesourcery.com