On 25.01.2010 20:12, Ian Lance Taylor wrote:
Timothy Madden<terminato...@gmail.com> writes:
[...]
g++ is free software. A clean implementation of export would
certainly be accepted. All it takes is for somebody to write one.
Hello,
Is the statement above still true?
I know export is about to be deprecated by the next version of the
C++ standard, but I am still interested in what it would take to
implement it in gcc and what a possible design would look like.
I remember that a long time ago someone on this list
actually tried to negotiate implementing export with some potential
"client" or interested party. They must have quickly given up back then,
since to my knowledge export has not been attempted in gcc yet.
But if you have, or can remember, any design approaches or decisions
for export made then, it would help me very much to know about them.
Then I would like to get your opinion, if possible, on the sort of
"design" I could come up with at a first attempt. First, I am interested
in a design for export that does not require sources to be available for
the instantiation phase. For this, each .cc file would compile into
several files:
- the regular object file .obj
- some shared, project-global .sym file with a repository
of immutable (read-only) symbols from the symbol table, for
symbols encountered throughout the entire program.
- some .tpl file with the parse tree for the template
definitions, together with references (into the .sym file)
for all symbols in the symbol table at the point of definition
- some .tpi file listing the template instantiations (usages),
or the <template-id>s with references to the entire symbol
table at every point of instantiation. These references are
indexes into the symbols repository.
The above are generated in the translation phase, which needs to prepare
everything for instantiation.
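
To make the roles of these files a bit more concrete, here is a rough
C++ sketch of what each one might hold. All type and field names
(SymbolRef, TemplateDefinition, Instantiation, TranslationUnitOutput,
example_output) are made up for illustration, the parse tree is left as
an opaque string, and none of this reflects existing gcc code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical reference to one entry of the shared .sym repository.
struct SymbolRef {
    std::uint32_t index;    // position in the repository
    std::uint32_t version;  // distinguishes e.g. incomplete vs. complete definitions
};

// Hypothetical content of a .tpl file: one exported template definition.
struct TemplateDefinition {
    std::string name;                // e.g. "max_of"
    std::string parse_tree;          // serialized parse tree, opaque in this sketch
    std::vector<SymbolRef> visible;  // symbol table at the point of definition
};

// Hypothetical entry of a .tpi file: one point of instantiation.
struct Instantiation {
    std::string template_id;         // e.g. "max_of<int>"
    std::vector<SymbolRef> visible;  // symbol table at the point of instantiation
};

// Everything one translation unit would leave behind.
struct TranslationUnitOutput {
    std::string object_file;              // the regular .obj
    std::vector<TemplateDefinition> tpl;  // goes into the .tpl file
    std::vector<Instantiation> tpi;       // goes into the .tpi file
};

// Builds a small made-up example of the output for one .cc file.
inline TranslationUnitOutput example_output() {
    TranslationUnitOutput out;
    out.object_file = "util.obj";

    TemplateDefinition def{"max_of", "<serialized parse tree>", {{0, 1}, {1, 1}}};
    out.tpl.push_back(def);

    Instantiation inst{"max_of<int>", {{0, 1}, {1, 1}, {2, 1}}};
    out.tpi.push_back(inst);
    return out;
}
```

The point of the sketch is only that the .tpl and .tpi files store
indexes into the repository rather than copies of the symbols.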
After a complete compilation, the symbols repository will most likely
collect symbols from the entire project, including static symbols. Each
symbol in the repository has to be properly indexed and versioned, to
allow both complete and incomplete definitions for the same symbol, and
to allow multiple static symbols with the same name. Each symbol should
also include a list of the .tpl and .tpi files that reference it. In this
way, when a source file is re-compiled and no longer uses a given
symbol, that symbol can be removed from the repository once no other
file references it. Its index/version number
is not re-used. The repository is extended/updated as each source file
is being compiled/recompiled when the .tpl/.tpi files are generated. Any
symbol referenced from the .tp? files is added to the repository if not
already there.
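
A minimal sketch of that bookkeeping, in C++, could look like the
following. The class SymbolRepository and its member names are
hypothetical, a symbol is identified by a plain string, and versioning
is left out to keep the example short. It only shows the two rules
above: a symbol disappears when its last referencing file drops it, and
indexes are never handed out twice.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical in-memory model of the shared .sym repository.
class SymbolRepository {
public:
    // Records that `file` (a .tpl or .tpi file) references the symbol
    // `identity`, adding the symbol if it is not already there.
    // Returns the symbol's index.
    std::uint32_t add_reference(const std::string& identity,
                                const std::string& file) {
        for (auto it = entries_.begin(); it != entries_.end(); ++it) {
            if (it->second.identity == identity) {
                it->second.referrers.insert(file);
                return it->first;
            }
        }
        std::uint32_t index = next_index_++;  // only grows: no re-use
        Entry entry;
        entry.identity = identity;
        entry.referrers.insert(file);
        entries_[index] = entry;
        return index;
    }

    // Called when `file` is about to be regenerated: forget everything it
    // referenced, and remove symbols that no file references any more.
    void drop_file(const std::string& file) {
        for (auto it = entries_.begin(); it != entries_.end();) {
            it->second.referrers.erase(file);
            if (it->second.referrers.empty()) {
                it = entries_.erase(it);
            } else {
                ++it;
            }
        }
    }

    bool contains(std::uint32_t index) const {
        return entries_.count(index) != 0;
    }

private:
    struct Entry {
        std::string identity;             // stand-in for the real symbol data
        std::set<std::string> referrers;  // .tpl/.tpi files using the symbol
    };
    std::map<std::uint32_t, Entry> entries_;
    std::uint32_t next_index_ = 0;
};
```

Re-compiling a source file would then be a drop_file() for its old
.tpl/.tpi files followed by add_reference() calls for every symbol the
new ones mention.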
The instantiation phase then needs only the above three files (.sym,
.tpl, .tpi), and from them it generates some additional object code for
the linker. When instantiation triggers other instantiations, multiple
.tpl files join the context. This is where the look-up in multiple
symbol tables has to be done.
This is where the ODR has to be checked. Every template that gets
instantiated keeps a record of all symbols that were looked up for it,
together with the look-up result for each. I will need some symbol
look-up log file .slk for that. Think of it like a replication log file
for database replication.
Then, when an instantiation is encountered for the same template-id
(same template, same arguments), the template need not be re-compiled,
but the sequence of symbol look-ups has to be replicated, to ensure the
same results are produced at the new point of instantiation. If look-up
results differ, a smart error message (suggestions welcome) will be output.
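
The replay step might be sketched in C++ roughly as follows. The names
LookupRecord, Lookup and first_mismatch are invented for this example,
look-up results are reduced to strings, and the "look-up" itself is
just a callback standing in for the real name look-up at the new point
of instantiation.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// One hypothetical line of the .slk look-up log: a name that was looked
// up during the first instantiation, and what it resolved to.
struct LookupRecord {
    std::string name;
    std::string result;
};

// Stand-in for name look-up at a given point of instantiation.
using Lookup = std::function<std::string(const std::string&)>;

// Replays the recorded look-ups in order at a new point of instantiation.
// Returns a pointer to the first record whose result differs (a likely
// ODR violation), or nullptr if every look-up gives the same result.
inline const LookupRecord* first_mismatch(const std::vector<LookupRecord>& log,
                                          const Lookup& lookup_at_new_point) {
    for (const LookupRecord& record : log) {
        if (lookup_at_new_point(record.name) != record.result) {
            return &record;
        }
    }
    return nullptr;
}
```

If first_mismatch() returns a record, that record names the symbol and
the earlier result, which is probably the minimum an error message
would need to mention.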
This is as far as I have got. My problems now are:
- the old parallel-compilation problem. A shared repository cannot
really be accessed by multiple gcc instances without
database-like locking, which I think is not the domain of
gcc. So I think I could allow, say, 4 such repositories to be
created and used in parallel on a quad-core CPU, and have
the fifth gcc instance that fires up enter a locking wait
state based on a lockfile, until one of the other four is
done.
- the repository file size. I hear pre-compiled header files are
50 to 200 MB in size. Since the repository includes symbols
from the entire program, not only standard headers, I expect
it will be up to twice as large. I guess each client will have
to decide for themselves if this size is a good price to pay
for the export feature.
- it is still not clear to me how to keep the repository
up-to-date when only some of the source files change.
- I would like to know if it would be useful for all these
file types to have a public format, so they can be used by
other tools, gcc ports or even by other compiler vendors,
as the COFF format is now.
- I guess the .tpl file should also include look-up results for
the non-dependent names, for later use in the actual
compilation.
- the old vague-linkage problem, which in my case is the format
for the additional object code generated from the
instantiation phase. Somehow I need to store each
template-id independently of the others (like a different file,
but then there will be too many files on the file system), and
after that I need to make each one depend (in the project
build system) on the single .tpl file with the template
definition, and on the replication log for the symbol
look-up. This is where things start to get a little fuzzy for
me.
What do you think?
What can you tell me about it?
I know many prefer to regard 'export' as a dead subject, and I am sorry
for the long post if that is the case, but otherwise I just feel I could
use any help or opinion that I can get.
Thank you,
Timothy Madden