On Thu, Oct 4, 2012 at 1:50 PM, Janus Weil <ja...@gcc.gnu.org> wrote:
>>> Thanks for the suggestions. The attached patch changes all "."-something
>>> symbol names, which I found.
>>>
>>> Build and regtested on x86-64-gnu-linux.
>>> OK for the trunk and 4.7? (".saved_dovar" also occurs in 4.6; we could also
>>> backport that part to 4.6, but I am not sure whether it is needed.)
>>
>> I think at least for trunk it should be ok.
>
> One more comment: Since its appearance is a bit scattered in the code,
> how about using a small macro which prepends the "_F" prefix to a
> given variable name?

For "normal" identifiers in a module, the current scheme of
"__modulename_MOD_symbolname" is probably too widely entrenched e.g.
in debuggers and various interoperability toolkits to be worth
changing at this point. The OOP stuff OTOH is IMHO sufficiently new
that there is little harm in changing it.

I was thinking about this in the beginning of the year and produced
the attached document (I never sent it before as I realized I wouldn't
have time to do anything about it myself in the near future). Funnily,
I also came up with the idea of "_F" at that point, though maybe not
so surprising as I also studied the g++ name mangling for inspiration.
Also note that the document itself has a perhaps naive approach which
does not consider backwards compatibility enough (see e.g. the above
paragraph).

Some related mangling PRs:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51802

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52606 (not the PR itself,
but the discussion in the comments)

-- 
Janne Blomqvist

GFortran name mangling ABI
==========================

It would be nice if GFortran had a documented, consistent name
mangling ABI. This would reduce the risk of an inadvertent name clash
with some other software, and make it easier for third-party tools such
as debuggers and profilers to demangle symbol names.

If, and when, the ABI is broken due to the array descriptor update, we
could also think about fixing this issue.

An explicit non-goal of this is to come up with some common
cross-compiler name mangling ABI, as it seems very unlikely that other
compiler vendors will want to change their mangling, and mangling
alone is a very small part of ABI compatibility.

Another non-goal is to change the mangling of the "F77" interface
(lowercase, append underscore(s) depending on the compiler
options). Thus the following discussion refers only to mangling "F90+"
names, e.g. module procedures and so forth.

Rules of the road
-----------------

Some names are "reserved identifiers", reserved for the
implementation. Mangled names should be such reserved names, in order
to not clash with user-defined names.

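Fortran specifies that names are of the form "letter + alphanum",
where underscore counts as an alphanumeric character. A user-defined
Fortran name can therefore never begin with an underscore.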

C and C++ reserved identifiers (that is, identifiers which are
reserved for use by the implementation) are

- A name beginning with an underscore followed by a capital letter.

- A name containing double underscore (plain C reserves only names
  beginning with double underscore, but C++ reserves anything with
  double underscores).

- POSIX adds additional restrictions wrt various POSIX functionality.

Thus, choosing names beginning with an underscore followed by either a
capital letter or another underscore should be good.
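
As a rough illustration of the rule above, the following Python sketch
checks whether a symbol falls into the reserved space described here,
and whether it could be written as a Fortran name at all. The function
names are made up for this example, and the POSIX-specific
reservations are not modelled.

```python
import re

# A Fortran name: a letter followed by letters, digits or underscores.
_FORTRAN_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_reserved_prefix(symbol: str) -> bool:
    """True if symbol starts with "__" or with "_" plus a capital letter."""
    if len(symbol) < 2 or symbol[0] != "_":
        return False
    return symbol[1] == "_" or "A" <= symbol[1] <= "Z"


def is_valid_fortran_name(symbol: str) -> bool:
    """True if a user could write this symbol as a Fortran identifier."""
    return _FORTRAN_NAME.fullmatch(symbol) is not None


for sym in ("_F3bar3foo", "__bar_MOD_foo", "foo", "c_int"):
    print(sym, is_reserved_prefix(sym), is_valid_fortran_name(sym))
```

A symbol for which the first check is true and the second is false
can clash neither with a user-defined Fortran name nor with a
conforming C or C++ user identifier.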

Current GFortran name mangling
------------------------------

Currently the name mangling is a bit ad-hoc, with several different
prefixes depending on which part of the compiler is used:

- Procedure "foo" in module "bar": __bar_MOD_foo

- Runtime library: _gfortran_XXX

- Runtime library internal functions (not visible if symbol visibility
  is supported): _gfortrani_XXX

- ISO_C_BINDING functions in the library: __iso_c_binding_XXX

- OOP stuff: __class_XXX

- Others?
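
To make the first item concrete, here is a small Python sketch of the
module-procedure pattern. It covers only the "__bar_MOD_foo" form
listed above; the helper names are hypothetical, and lower-casing of
both names is an assumption taken from the example rather than a
statement about the compiler internals.

```python
def mangle_current(module: str, symbol: str) -> str:
    """Build "__<module>_MOD_<symbol>" with both names lower-cased."""
    return f"__{module.lower()}_MOD_{symbol.lower()}"


def demangle_current(mangled: str):
    """Return (module, symbol), or None if the pattern does not match."""
    if not mangled.startswith("__"):
        return None
    module, sep, symbol = mangled[2:].partition("_MOD_")
    if not sep or not module or not symbol:
        return None
    return module, symbol


print(mangle_current("bar", "foo"))
print(demangle_current("__bar_MOD_foo"))
print(demangle_current("_gfortran_XXX"))
```

Under the lower-casing assumption, the uppercase "_MOD_" separator
cannot occur inside either name, which is what lets a tool split the
symbol again.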


The C++ name mangling
---------------------

For inspiration, see the C++ name mangling ABI that GCC follows at 

http://sourcery.mentor.com/public/cxx-abi/abi.html#mangling

http://sourcery.mentor.com/public/cxx-abi/abi-examples.html#mangling

http://sourcery.mentor.com/public/cxx-abi/abi-mangling.html


The C++ name mangling ABI, in a very simplified form, is

- Everything has the prefix "_Z".

- Names are encoded as <length, name> pairs.

- At the end of a function symbol there is an "E", followed by the types
  of the function arguments (in order to handle overloading).

- Outside of names, characters have meaning as various flags,
  e.g. "TI" means the identifier is a typeinfo structure, or "TV" for
  a virtual table, and so on.

E.g. a member function "foo(void)" in a class "Test" (Test::foo())
would thus be encoded as "_ZN4Test3fooEv". (The "N" means it's a
nested name, "v" at the end means the void argument).
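
The simplified rules above can be sketched in a few lines of Python.
This covers only plain nested names with pre-encoded parameter types;
substitutions, templates, qualifiers and everything else in the real
ABI are left out, and the function name is invented for this example.

```python
def mangle_nested_function(parts, param_codes="v"):
    """Encode e.g. ["Test", "foo"] as "_ZN" + <length, name> pairs + "E".

    param_codes is the already-encoded parameter list; "v" stands for
    a function taking no arguments.
    """
    encoded = "".join(f"{len(part)}{part}" for part in parts)
    return f"_ZN{encoded}E{param_codes}"


print(mangle_nested_function(["Test", "foo"]))
```

The result for Test::foo() can be cross-checked by feeding it to a
demangler such as c++filt.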


Proposed GFortran mangling
--------------------------

Fortran name mangling requirements are considerably simpler than
those of C++, due to Fortran having neither function overloading (yes,
Fortran has generic interfaces, which are a bit different and don't
require mangling) nor templates.

- Every symbol has the prefix "_F" (F as in Fortran). This makes it
  easy to distinguish that a symbol comes from GFortran. "__gfortran"
  would also work, but is longer.

- Nested names are encoded as <length, lower-cased name> pairs. The
  length fields delimit the different names in a mangled symbol, and
  allow allocating space for a name before reading it.

- All runtime library symbols should have the prefix "_FR", and "_FRI"
  for library internal symbols. Strictly speaking, this might not
  actually be necessary, as we probably could just use plain "_F" with
  some care, but better safe than sorry. And as an exception to the
  above, library symbols don't use the <length, name> coding
  (alternative reason: they are not nested).

- Special symbols can be prefixed by flag characters. Flags are:
  - I: Intrinsic. TODO: R (library) vs. I (intrinsic)?
  - TV: vtable (like the C++ mangling)
  - TI: typeinfo structure (currently GFortran combines the vtable
    with the typeinfo, but in case we want to split this at some
    point?)

- Examples:
  - Procedure "foo" in module "bar": "_F3bar3foo"
  - vtable for the derived type "mytype" in module "foo":
    "_F3fooTV6mytype". Or "_FTV3foo6mytype"?
  - Parameter "C_INT" in intrinsic module iso_c_binding:
    "_FI13iso_c_binding5c_int"
  - "transfer_integer" function in the runtime IO library:
    "_FRtransfer_integer".

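A minimal Python sketch of the proposed scheme, reproducing the
examples above. All function names are hypothetical. Flags are placed
directly after "_F", which corresponds to the "_FTV3foo6mytype"
variant of the vtable example; the placement of flags is still an open
question in the proposal, so treat that as an assumption.

```python
import re

_LENGTH = re.compile(r"[0-9]+")


def encode_names(names):
    """Encode names as <length, lower-cased name> pairs."""
    return "".join(f"{len(name)}{name.lower()}" for name in names)


def mangle_proposed(names, flags=""):
    """"_F" + optional flag characters + the encoded nested names."""
    return "_F" + flags + encode_names(names)


def mangle_library(name, internal=False):
    """Library symbols: "_FR" or "_FRI" prefix, no length coding."""
    return ("_FRI" if internal else "_FR") + name


def decode_names(encoded):
    """Split a run of <length, name> pairs back into a list of names."""
    names, pos = [], 0
    while pos < len(encoded):
        match = _LENGTH.match(encoded, pos)
        if match is None:
            raise ValueError(f"expected a length at offset {pos}")
        start = match.end()
        end = start + int(match.group())
        if end > len(encoded):
            raise ValueError("length field runs past end of symbol")
        names.append(encoded[start:end])
        pos = end
    return names


print(mangle_proposed(["bar", "foo"]))
print(mangle_proposed(["iso_c_binding", "C_INT"], flags="I"))
print(mangle_library("transfer_integer"))
print(decode_names("3bar3foo"))
```

Decoding is unambiguous because a Fortran name never starts with a
digit, so the digits read at each position always form a length field.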

Alternative scheme
------------------

In this alternative scheme every name is prefixed with uppercase flag
characters which tell which kind of identifier it is. As the flags
delimit the different names, the length fields have been
omitted. Additional flags are:

- M: Module
- P: Procedure
- V: Value (constant (parameter) or variable)

- Examples:
  - Procedure "foo" in module "bar": "_FMbarPfoo"
  - vtable for the derived type "mytype" in module "foo":
    "_FMfooTVmytype"
  - Parameter "C_INT" in intrinsic module iso_c_binding:
    "_FMIiso_c_bindingVc_int"

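The alternative scheme can be sketched the same way. The Python below
reproduces the three examples above; the function names are invented
for illustration, and the demangler assumes that names are lower-cased
(as in the examples) so that any run of uppercase letters can only be
a group of flags.

```python
import re

# One component: uppercase flag characters, then a lower-cased name.
_PART = re.compile(r"([A-Z]+)([a-z][a-z0-9_]*)")


def mangle_alternative(parts):
    """parts is a list of (flags, name) pairs, e.g. [("M", "bar")]."""
    return "_F" + "".join(flags + name.lower() for flags, name in parts)


def demangle_alternative(symbol):
    """Split a symbol back into its (flags, name) pairs."""
    if not symbol.startswith("_F"):
        raise ValueError("not an _F symbol")
    body, parts, pos = symbol[2:], [], 0
    while pos < len(body):
        match = _PART.match(body, pos)
        if match is None:
            raise ValueError(f"malformed component at offset {pos}")
        parts.append((match.group(1), match.group(2)))
        pos = match.end()
    return parts


print(mangle_alternative([("M", "bar"), ("P", "foo")]))
print(mangle_alternative([("M", "foo"), ("TV", "mytype")]))
print(demangle_alternative("_FMIiso_c_bindingVc_int"))
```

Compared with the length-prefixed scheme, this one depends on case to
separate components, so it only works as long as every name in a
mangled symbol is lower-cased.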