On Thu, Oct 4, 2012 at 1:50 PM, Janus Weil <ja...@gcc.gnu.org> wrote:
>>> Thanks for the suggestions. The attached patch changes all "."-something
>>> symbol names, which I found.
>>>
>>> Build and regtested on x86-64-gnu-linux.
>>> OK for the trunk and 4.7? (".saved_dovar" also occurs in 4.6; we could also
>>> backport that part to 4.6, but I am not sure whether it is needed.)
>>
>> I think at least for trunk it should be ok.
>
> One more comment: Since its appearance is a bit scattered in the code,
> how about using a small macro which prepends the "_F" prefix to a
> given variable name?

For "normal" identifiers in a module, the current scheme of
"__modulename_MOD_symbolname" is probably too widely entrenched, e.g.
in debuggers and various interoperability toolkits, to be worth
changing at this point. The OOP stuff OTOH is IMHO sufficiently new
that there is little harm in changing it.

I was thinking about this in the beginning of the year and produced
the attached document (I never sent it before as I realized I wouldn't
have time to do anything about it myself in the near future). Funnily,
I also came up with the idea of "_F" at that point, though that is
maybe not so surprising, as I also studied the g++ name mangling for
inspiration. Also note that the document itself has a perhaps naive
approach which does not consider backwards compatibility enough (see
e.g. the above paragraph).

Some related mangling PRs:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51802
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52606 (not the PR itself,
but the discussion in the comments)

--
Janne Blomqvist

GFortran name mangling ABI
==========================

It would be nice if GFortran had a documented, consistent name
mangling ABI. This would reduce the risk of an inadvertent name clash
with some other software, and make it easier for 3rd party tools such
as debuggers, profilers etc. to demangle symbol names. If, and when,
the ABI is broken due to the array descriptor update, we could also
think about fixing this issue.

An explicit non-goal of this is to come up with some common
cross-compiler name mangling ABI, as it seems very unlikely that other
compiler vendors will want to change their mangling, and mangling
alone is a very small part of ABI compatibility.

Another non-goal is to change the mangling of the "F77" interface
(lowercase, append underscore(s) depending on the compiler options).
Thus the following discussion refers only to mangling "F90+" names,
e.g. module procedures and so forth.

Rules of the road
-----------------

Some names are "reserved identifiers", reserved for the
implementation. Mangled names should be such reserved names, in order
to not clash with user-defined names.

Fortran specifies that names are of the form "letter + alphanum", so a
standard-conforming Fortran name never begins with an underscore.

C and C++ reserved identifiers (that is, identifiers which are
reserved for use by the implementation) are

- A name beginning with an underscore followed by a capital letter.

- A name containing double underscore (plain C reserves only names
  beginning with double underscore, but C++ reserves anything with
  double underscores).

- POSIX adds additional restrictions wrt various POSIX functionality.

Thus, choosing names beginning with an underscore followed by either a
capital letter or another underscore should be good.

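As a rough illustration of the rules above, here is a small Python
sketch that checks a symbol against the two underscore rules. The
helper names (is_c_reserved, is_cxx_reserved, is_fortran_name) are made
up for this example, and the POSIX restrictions are not modelled at
all.

```python
import re

# A Fortran name: a letter followed by letters, digits or underscores.
_FORTRAN_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_c_reserved(symbol: str) -> bool:
    """True if the symbol starts with '_' + capital letter, or with '__'."""
    return re.match(r"_[A-Z_]", symbol) is not None


def is_cxx_reserved(symbol: str) -> bool:
    """C++ additionally reserves any name containing a double underscore."""
    return is_c_reserved(symbol) or "__" in symbol


def is_fortran_name(symbol: str) -> bool:
    """True if a user could write this symbol as a Fortran name."""
    return _FORTRAN_NAME.fullmatch(symbol) is not None


def is_safe_mangled_name(symbol: str) -> bool:
    """Reserved in both C and C++, and not expressible as a Fortran name."""
    return is_c_reserved(symbol) and not is_fortran_name(symbol)
```

For example, is_safe_mangled_name("_F3bar3foo") and
is_safe_mangled_name("__bar_MOD_foo") both hold, while "a__b" is
reserved only in C++ and is also a valid Fortran name, so it is not a
safe choice.
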
Current GFortran name mangling
------------------------------

Currently the name mangling is a bit ad-hoc, with several different
prefixes depending on which part of the compiler is used:

- Procedure "foo" in module "bar": __bar_MOD_foo

- Runtime library: _gfortran_XXX

- Runtime library internal functions (not visible if symbol visibility
  is supported): _gfortrani_XXX

- ISO_C_BINDING functions in the library: __iso_c_binding_XXX

- OOP stuff: __class_XXX

- Others?

The C++ name mangling
---------------------

For inspiration, see the C++ name mangling ABI that GCC follows at

http://sourcery.mentor.com/public/cxx-abi/abi.html#mangling
http://sourcery.mentor.com/public/cxx-abi/abi-examples.html#mangling
http://sourcery.mentor.com/public/cxx-abi/abi-mangling.html

The C++ name mangling ABI, in a very simplified form, is

- Everything has the prefix "_Z".

- Names are encoded as <length, name> pairs.

- At the end of a nested function name there is an "E", followed by
  the types of the function arguments (in order to handle
  overloading).

- Outside of names, characters have meaning as various flags, e.g.
  "TI" means the identifier is a typeinfo structure, "TV" a virtual
  table, and so on.

E.g. a member function "foo(void)" in a class "Test" (Test::foo())
would thus be encoded as "_ZN4Test3fooEv". (The "N" means it's a
nested name, the "E" ends it, and the "v" at the end means the void
argument.)

Proposed GFortran mangling
--------------------------

Fortran name mangling requirements are considerably simpler than those
of C++, as Fortran has neither function overloading (yes, Fortran has
generic interfaces, which are a bit different and don't require
mangling) nor templates.

- Every symbol has the prefix "_F" (F as in Fortran). This makes it
  easy to see that a symbol comes from GFortran. "__gfortran" would
  also work, but is longer.

- Nested names are encoded as <length, lower-cased name> pairs.
  The length fields delimit the different names in a mangled symbol,
  and allow allocating space for a name before reading it.

- All runtime library symbols should have the prefix "_FR", and "_FRI"
  for library internal symbols. Strictly speaking, this might not
  actually be necessary, as we probably could just use plain "_F" with
  some care, but better safe than sorry. And as an exception to the
  above, library symbols don't use the <length, name> coding
  (alternative reason: they are not nested).

- Special symbols can be prefixed by flag characters. Flags are:

  - I: Intrinsic. TODO: R (library) vs. I (intrinsic)?

  - TV: vtable (like the C++ mangling)

  - TI: typeinfo structure (currently GFortran combines the vtable
    with the typeinfo, but in case we want to split this at some
    point?)

- Examples:

  - Procedure "foo" in module "bar": "_F3bar3foo"

  - vtable for the derived type "mytype" in module "foo":
    "_F3fooTV6mytype". Or "_FTV3foo6mytype"?

  - Parameter "C_INT" in intrinsic module iso_c_binding:
    "_FI13iso_c_binding5c_int"

  - "transfer_integer" function in the runtime IO library:
    "_FRtransfer_integer".

Alternative scheme
------------------

In an alternative scheme, every name is prefixed with uppercase flag
characters which tell which kind of identifier it is. As the uppercase
flags delimit the different (lower-cased) names, the length fields
have been omitted. Additional flags are:

- M: Module

- P: Procedure

- V: Value (constant (parameter) or variable)

- Examples:

  - Procedure "foo" in module "bar": "_FMbarPfoo"

  - vtable for the derived type "mytype" in module "foo":
    "_FMfooTVmytype"

  - Parameter "C_INT" in intrinsic module iso_c_binding:
    "_FMIiso_c_bindingVc_int"
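
To make the two schemes a bit more concrete, here is a minimal Python
sketch of both. Nothing here is GFortran code: the function names
(mangle_proposed, mangle_library, mangle_alternative, split_names) are
invented for this example. It also assumes that flags go directly
after "_F", i.e. the "_FTV3foo6mytype" variant, which the text above
leaves open.

```python
def mangle_proposed(*names, flags=""):
    """Proposed scheme: "_F" + flags + <length, lower-cased name> pairs."""
    return "_F" + flags + "".join(f"{len(n)}{n.lower()}" for n in names)


def mangle_library(name, internal=False):
    """Runtime library symbols: "_FR" (or "_FRI"), no length coding."""
    return ("_FRI" if internal else "_FR") + name


def mangle_alternative(*parts):
    """Alternative scheme: each part is a (flag, name) pair, no lengths."""
    return "_F" + "".join(flag + name.lower() for flag, name in parts)


def split_names(mangled):
    """Recover the names from a flag-less proposed-scheme symbol.

    Fortran names start with a letter, so the digits of a length field
    always end at the first character of the name that follows.
    """
    if not mangled.startswith("_F"):
        raise ValueError("not a _F symbol")
    pos, names = 2, []
    while pos < len(mangled):
        end = pos
        while end < len(mangled) and mangled[end] in "0123456789":
            end += 1
        if end == pos:
            raise ValueError("expected a length field")
        length = int(mangled[pos:end])
        name = mangled[end:end + length]
        if len(name) != length:
            raise ValueError("truncated name")
        names.append(name)
        pos = end + length
    return names
```

With these definitions, mangle_proposed("bar", "foo") gives
"_F3bar3foo", mangle_proposed("iso_c_binding", "C_INT", flags="I")
gives "_FI13iso_c_binding5c_int", and
mangle_alternative(("M", "bar"), ("P", "foo")) gives "_FMbarPfoo".
split_names("_F3bar3foo") returns ["bar", "foo"], which shows how the
length fields let a tool walk the symbol without any lookahead.
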