Hooks, macros and target configuration

Joseph S. Myers Tue, 19 Oct 2010 14:55:55 -0700

My ongoing work to implement the multilib selection changes described
at <http://gcc.gnu.org/ml/gcc/2010-01/msg00063.html> will in due
course require option-related hooks to be shared between the driver
and the compilers proper (cc1 etc.).  As we do not currently have a
hooks system in the driver, it seems appropriate to consider what
design we want for hooks extended into this part of the compiler -
and, more generally, what we would like the system for target
configuration to look like.


(In this message I only consider designs for C.  It is certainly
possible that in future a native C++ approach with less heavy use of
macros may be used, but I think the same general issues arise.)

The basic design for target hooks was implemented by Neil Booth in
June 2001 following a discussion
<http://gcc.gnu.org/ml/gcc/2001-06/subjects.html#00932> I started
(this was not the first time the principle of moving away from macros
had come up).  Langhooks followed later that year, and the targhooks.c
system for incremental transition of individual macros to hooks was
added in 2003.  Automatic generation of much of the boilerplate code
from a single .def file was added much more recently by Joern
Rennecke; we now have about 300 target hooks.  The point that such
functions should be linkable into the driver came up in the second
message of that thread.

The motivations for moving from macros to hooks remain as discussed
then: cleaner design, better-specified interfaces with prototypes (so
eliminating one cause of warnings building target-independent files
only when configured for some targets, or errors from code conditioned
with #if) and potentially the ability to swap out target structures
for multi-target compiler binaries.

In general, the existing target hooks are defined through #define
within the target .c file, before the targetm variable is defined with
its initializer.  This works well for hooks that largely depend on the
target architecture alone, not on the target OS.  There are some
exceptions where OS-dependent hooks are defined in the .h files listed
in $tm_file, and some cases where targets modify hooks at runtime
(this should generally be for hooks that are integer or string
constants rather than functions, though I have not checked whether any
function hooks are also being modified at runtime).  Such cases would
make a multi-target compiler (with each source file compiled only
once, but multiple target OSes for a single architecture) a bit
harder; targetm would need to move to its own .c file, with only that
.c file being compiled separately for each target.
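
As a minimal sketch of the pattern just described (every name below is
hypothetical and cut down for illustration; this is not the real
struct gcc_target or the real initializer macro): defaults are
provided as macros, the target .c file redefines the ones it cares
about, and only then is the hooks variable defined from the
initializer.  The last function shows the kind of runtime modification
of a constant hook mentioned above.

```c
#include <stdbool.h>

/* Hypothetical, cut-down hooks structure; not the real struct gcc_target.  */
struct target_sketch
{
  int (*issue_rate) (void);    /* a function hook */
  bool have_named_sections;    /* a constant hook */
};

/* Target-independent defaults, as a shared header might provide them.  */
int
default_issue_rate (void)
{
  return 1;
}

#define SKETCH_ISSUE_RATE default_issue_rate
#define SKETCH_HAVE_NAMED_SECTIONS false
#define SKETCH_INITIALIZER \
  { SKETCH_ISSUE_RATE, SKETCH_HAVE_NAMED_SECTIONS }

/* What a target .c file would do: define its own function, then
   redefine the macro before the initializer is expanded.  */
int
example_issue_rate (void)
{
  return 2;
}

#undef SKETCH_ISSUE_RATE
#define SKETCH_ISSUE_RATE example_issue_rate

/* The initializer macro is expanded here, so it picks up the override.  */
struct target_sketch targetm_sketch = SKETCH_INITIALIZER;

/* Runtime modification of a constant hook.  This is what makes a
   shared, read-only hooks structure harder to arrange.  */
void
example_override_options (bool assembler_has_sections)
{
  targetm_sketch.have_named_sections = assembler_has_sections;
}
```

Because the initializer is a macro, the override only has to appear
before the definition of the variable, not before the defaults.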

Some target macros - which should become hooks in some form - are much
more dependent on the target OS.  This applies, in particular, to all
the various specs used by the driver.  Given this, my inclination is
that the driver's targetm structure should be defined in its own .c
file, with the macros providing its initializer generally coming from
.h files.  This is not far from the present system for specs (where
the macros initialize variables and thereafter the variables are
generally what is used, potentially being modified at runtime when a
specs file is read).
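
A rough sketch of what such a driver-side structure might look like
follows.  The structure, the macro names and the spec strings are all
made up for illustration (the real specs macros and their defaults are
more involved); the point is only that an OS-specific header supplies
macros, a single .c file turns them into an initializer, and the
resulting variables can still be replaced at runtime.

```c
#include <stddef.h>

/* Would come from an OS-specific .h file (hypothetical name and value).  */
#define DRIVER_SKETCH_LINK_SPEC "%{shared:-shared} %{static:-static}"

/* Defaults for anything the headers did not define.  */
#ifndef DRIVER_SKETCH_LINK_SPEC
#define DRIVER_SKETCH_LINK_SPEC ""
#endif
#ifndef DRIVER_SKETCH_LIB_SPEC
#define DRIVER_SKETCH_LIB_SPEC "-lc"
#endif

/* Hypothetical driver hooks structure holding specs as string hooks.  */
struct driver_target_sketch
{
  const char *link_spec;
  const char *lib_spec;
};

/* Defined once, in the driver's own .c file.  */
struct driver_target_sketch driver_targetm_sketch =
{
  DRIVER_SKETCH_LINK_SPEC,
  DRIVER_SKETCH_LIB_SPEC
};

/* Stand-in for the effect of reading a specs file: the variable, not
   the macro, is what later code consults, so it can be replaced.  */
void
driver_sketch_set_lib_spec (const char *new_spec)
{
  if (new_spec != NULL)
    driver_targetm_sketch.lib_spec = new_spec;
}
```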

The advantage as I see it would come if the target .h files could be
split up, with the driver-only defines coming from a separate set of
headers from those used in the core compiler.  The ones used in the
core compiler would likely be simpler and have fewer OS dependencies.
It might make it easier to move towards using these headers
consistently in the order given in config.gcc as the preferred order -
with the aim being to avoid architecture-specific cases in config.gcc
mentioning architecture-independent headers at all.  (A case statement
over target architectures should deal with architecture-specific
configuration; one over OSes should deal with OS-specific
configuration; one over pairs should deal with the combination; and
target-independent code should put together the lists from each of
those case statements in the standard order.)

This separation of configuration for different purposes is closely
related to the issues with hooks for option handling.  Target .h files
provide configuration used in various ways, with some macros used in
more than one way:

* Definitions for use in the target's own .c files.

* Definitions for use in code in the target's .md files.

* Definitions for use in code in the core compiler (middle-end and
  front ends).  tm_p.h has prototypes for use in such code, and an
  intermediate goal in converting macros to hooks might be to convert
  every macro that uses a function on any target, so that tm_p.h is
  only included in the target's .c files and in the generated files
  containing code from the .md files.

* Definitions for use in the driver.

* Definitions for use in collect2.  These overlap with those used in
  the driver (e.g. MD_EXEC_PREFIX), and probably with those used
  elsewhere (e.g. OBJECT_FORMAT_*).

  My inclination is to say that collect2 and lto-wrapper are really
  part of the driver although they happen to be in separate
  executables.  As such, and in order to keep down the number of
  different sets of configuration information needed, at the point
  where collect2 starts using hooks I'd be inclined to have it use the
  driver's set of hooks, and if any of those hooks end up being
  functions needing to link with other driver code, the relevant code
  should go in libdriver.a or similar, and collect2 should be made to
  use the shared diagnostics code as needed.  (lto-wrapper doesn't use
  any target macros.)

* Definitions for use in gcov.  I'm not sure what such definitions
  there are, if any, but gcov.c includes tm.h.  (java/jvgenmain.c
  also includes tm.h, but at least has a FIXME indicating it's for a
  particular declaration that should move somewhere else.)

* Definitions for target libraries.  This means libgcc, crt*.o,
  libgcov, libobjc, and some of the ada/*.c files that go in Ada's
  runtime.

  Nathan Froyd has been working on these.  The aim is to completely
  separate the host-side tm.h from target-side configuration.  In most
  cases compiler predefined macros can be used; in some cases, .h
  files in the toplevel libgcc/config (and selected through
  libgcc/config.host) will be used.

  Once target libraries no longer use host-side headers, all the
  appearances of the runtime library license exception in such headers
  should go away.  Various macros used to indicate whether code is
  being built for the host or the target, such as I listed in
  <http://gcc.gnu.org/ml/gcc-patches/2010-10/msg00947.html>, could
  also go away.

* Definitions that are confused about whether they describe host or
  target properties.  See what I said in
  <http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01007.html> about what
  would happen with GCC_DRIVER_HOST_INITIALIZATION if you tried to
  build a cross compiler from DJGPP to xtensa-elf.  As another
  example, consider GET_ENVIRONMENT.  The default value is in
  defaults.h, as if it were a target macro, but surely it is something
  that depends on the host not the target.  If it belongs anywhere,
  it's probably system.h, but since there are no non-default
  definitions and various direct calls of getenv, this macro should
  probably just go away altogether.

My proposal is that we aim to arrange the main classes of definitions
thus:

* If something is used only in the target's own host-side files
  (including .md files), it's no problem for it to be a macro.  There
  should be headers for inclusion in these files that are not included
  elsewhere.  (Multi-target compilers would eventually require things
  depending on the OS to be separated out of such headers into an
  architecture-specific target hooks structure.)

* Definitions for code built for the target should be completely
  separate from those for code built for the host, and located
  entirely in separate toplevel directories such as libgcc/ when they
  aren't built-in macros.

* Host-side configuration may be split into that for the core
  compilers, that for the driver, and that which is shared between
  them (with a limited amount of front-end-specific hooks as well).
  By avoiding separate headers and structures for every combination of
  executables - by treating collect2 as part of the driver, in
  particular - things should be kept comparatively maintainable, and
  we can reasonably have a targetm_common without worrying about
  multiple possible definitions of "common".

* Hooks for the driver would work based on macros being defined in .h
  files.  Those .h files would preferably be separate from the
  existing tm.h, but at first they might be the same, and even if
  separated out at an early stage would likely include the existing
  tm.h to facilitate those specs that depend on other macros for such
  things as determining whether 32-bit or 64-bit is the default.
  (Later parts of my multilibs changes, by allowing specs to test
  features rather than option text, would reduce the need for such
  compile-time conditionals in specs and so maybe reduce the need for
  tm.h to be included when defining them.)

* Hooks shared between the driver and the core compilers would
  probably work as the present target hooks work - an
  architecture-specific .c file containing options-related functions
  and a targetm_common initializer.  Initially this file might include
  the existing tm.h, with the aim eventually being just to include any
  headers for target facilities shared between the driver and the core
  compiler.

* It would be reasonable, although not required for any of the above,
  to put driver code in a driver/ subdirectory, config/ code only used
  in the driver in driver/config/, and code shared between the driver
  and the core compiler in some subdirectory such as common/ (with its
  own config/ subdirectory).
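
To make the shared-hooks item above a little more concrete, here is a
small sketch under stated assumptions: the structure, its members, the
flag and the two caller functions are all hypothetical, and nothing
here is the proposed interface.  It only illustrates one
architecture-specific options hook, defined once, being consulted both
by driver-like code (choosing a multilib directory) and by
compiler-like code (choosing a pointer size).

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_FLAG_64BIT 1

/* Hypothetical structure for hooks shared by driver and compilers.  */
struct common_target_sketch
{
  /* Update *FLAGS for option OPT; return false if OPT is not handled.  */
  bool (*handle_option) (const char *opt, int *flags);
  int default_flags;
};

/* Architecture-specific option handling, written once.  */
bool
example_handle_option (const char *opt, int *flags)
{
  if (strcmp (opt, "m64") == 0)
    {
      *flags |= SKETCH_FLAG_64BIT;
      return true;
    }
  if (strcmp (opt, "m32") == 0)
    {
      *flags &= ~SKETCH_FLAG_64BIT;
      return true;
    }
  return false;
}

struct common_target_sketch targetm_common_sketch =
{
  example_handle_option,
  0   /* 32-bit by default in this sketch */
};

/* Driver-like user: selects a multilib directory from the feature,
   not from the option text.  */
const char *
driver_sketch_multilib_dir (const char *opt)
{
  int flags = targetm_common_sketch.default_flags;
  targetm_common_sketch.handle_option (opt, &flags);
  return (flags & SKETCH_FLAG_64BIT) ? "64" : "32";
}

/* Compiler-like user: derives the pointer size from the same hook.  */
int
compiler_sketch_pointer_bits (const char *opt)
{
  int flags = targetm_common_sketch.default_flags;
  targetm_common_sketch.handle_option (opt, &flags);
  return (flags & SKETCH_FLAG_64BIT) ? 64 : 32;
}
```

Both callers go through the same structure, so the driver and the
compiler cannot disagree about what an option means.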

Any comments on the above?  The only part I would be proposing to
implement myself at present would be the hooks structure for hooks
shared between the driver and the core compilers, at the point where I
need it in the multilibs changes if not already implemented by then.

What are people's views on the extent to which cleanup / hook
conversion patches - not specifically the above, but the existing
ongoing hook conversion and cleanups in general - should go in during
Stage 3?  Such patches do inevitably have some risk of breaking
things.  (If we fixed all the problems building the compiler with
--enable-werror-always for any target, using a recent native trunk
compiler to build the cross compiler, that would at least provide a
more reliable way of detecting some problems with changes affecting
many targets.  See bug 44756's dependencies for a not necessarily
exhaustive list of problems building this way.)

-- 
Joseph S. Myers
jos...@codesourcery.com
