On Thu, Dec 14, 2017 at 2:21 AM Anton via Phabricator
<revi...@reviews.llvm.org> wrote:
xgsa added a comment.
In https://reviews.llvm.org/D39622#954585, @probinson wrote:
> Philosophically, mangled names and DWARF information serve
different purposes, and I don't think you will find one true
solution where both of them can yield the same name that
everyone will be happy with. Mangled names exist to provide
unique and reproducible identifiers for the "same" entity
across compilation units. They are carefully specified (for
example) to allow a linker to associate a reference in one
object file to a definition in a different object file, and
be guaranteed that the association is correct. A demangled
name is a necessarily context-free translation of the mangled
name into something that has a closer relationship to how a
human would think of or write the name of the thing, but
isn't necessarily the only way to write the name of the thing.
>
> DWARF names are (deliberately not carefully specified)
strings that ought to bear some relationship to how source
code would name the thing, but you probably don't want to
attach semantic significance to those names. This is rather
emphatically true for names containing template parameters.
Typedefs (and their recent offspring, 'using' aliases) are
your sworn enemy here. Enums, as you have found, are also a
problem.
>
> Basically, the type of an entity does not have a unique
name, and trying to coerce different representations of the
type into having the same unique name is a losing battle.
I'm actually going back and forth on this ^. It seems to me,
regardless of mangled names, etc, it'd be good if LLVM used the
same name for a type in DWARF across different translation units.
And, to a large extent, we do: typedefs in template parameters
don't seem to present a problem for the current implementation,
since the underlying type is used. Enums are one place where we
don't, and there the name isn't actually that much closer to the
source or to what the user wrote.
Even if the user had: "enum X { Y = 0, Z = 0 }; ... template<enum
X> struct foo; ... foo<Z>" LLVM still describes that type as
"foo<X::Y>". Also if you have "enum X: int; ... foo<(X)0>" you
get "foo<0>" whereas in another translation unit with a
definition of X you'd get "foo<X::Y>".
So for consistency there, I kind of think maybe a change like
this isn't bad.
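To make the enum case concrete, here is a minimal single-file sketch
(my own illustration, not taken from the patch). It assumes a fixed
underlying type on X so that the declared-but-not-defined case and the
defined case fit into one translation unit, and
same_type_for_all_spellings is a hypothetical helper name. It only
shows that, as far as the language is concerned, every spelling names
the same type, which is why differing DWARF names for it look
inconsistent.

```cpp
#include <cassert>      // assert, used in the usage check described below
#include <type_traits>  // std::is_same

// Assumption: a fixed underlying type, so X may be declared before it is defined.
enum X : int;

// A template with a non-type parameter of the enum type.
template <X V> struct foo {};

// At this point X has no enumerators yet, so the argument can only be a cast.
using named_before_definition = foo<(X)0>;

// Two enumerators share the value 0, as in the example above.
enum X : int { Y = 0, Z = 0 };

// The template argument is identified by its value, not by its spelling.
static_assert(std::is_same<foo<Y>, foo<Z>>::value,
              "foo<Y> and foo<Z> are the same type");
static_assert(std::is_same<named_before_definition, foo<Y>>::value,
              "foo<(X)0> and foo<Y> are the same type");

// Hypothetical helper so the same fact can also be checked at run time.
inline bool same_type_for_all_spellings() {
    return std::is_same<foo<Y>, foo<Z>>::value &&
           std::is_same<named_before_definition, foo<Y>>::value;
}
```

If this compiles, the static_asserts have already passed; calling
assert(same_type_for_all_spellings()) from a test is just the run-time
form of the same check.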
But of course the specific way a template name is written may
easily still vary between compilers, so relying on it being
exactly the same might not be a great idea anyway...
Thank you for the clarification, Paul! Nevertheless, I suppose
showing the actual type of a dynamic variable is very important
for projects where RTTI is used. Moreover, it works
properly with the gcc+gdb pair, so I am extremely interested in
fixing it in clang+lldb.
I understand that the suggested solution possibly does not
cover all the cases, but it improves the situation and
actually covers all the cases found by me (I have just
rechecked -- typedefs/usings seem to work fine when
displaying the real type of a variable). If more cases are
found in the future, they could be fixed similarly too. Moreover,
the debuggers already rely on the fact that the type name
looks the same in RTTI and DWARF, and I suppose they have no
choice, because there is no other source of information for
them (or am I missing something?).
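For reference, the RTTI side of that comparison can be reproduced
outside a debugger. This is a rough sketch under assumptions: it needs
the Itanium C++ ABI (GCC or Clang with <cxxabi.h>), and Base, foo and
dynamic_type_name are names I made up for illustration. It asks RTTI
for the dynamic type of an object and demangles it, which gives the
kind of string a debugger then has to find in DWARF.

```cpp
#include <cassert>   // assert, used in the usage check described below
#include <cstdlib>   // std::free
#include <cxxabi.h>  // abi::__cxa_demangle (Itanium C++ ABI only)
#include <string>
#include <typeinfo>

enum X { Y = 0, Z = 0 };

// A polymorphic base, so that typeid() reports the dynamic type.
struct Base {
    virtual ~Base() {}
};

template <X V> struct foo : Base {};

// Returns the demangled name of the dynamic type of `obj`, taken from RTTI.
// Falls back to the raw (mangled) name if demangling fails.
inline std::string dynamic_type_name(const Base &obj) {
    int status = 0;
    char *raw = abi::__cxa_demangle(typeid(obj).name(), nullptr, nullptr, &status);
    if (status != 0 || raw == nullptr)
        return typeid(obj).name();
    std::string result(raw);
    std::free(raw);  // __cxa_demangle allocates the buffer with malloc
    return result;
}
```

Calling dynamic_type_name on a foo<Y> held through a Base reference
yields a name starting with "foo<"; how the enum argument is spelled
inside the angle brackets is up to the demangler, which is exactly the
part that has to line up with the DWARF name.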
I think they would have a choice, actually - let's walk through
it...
It sounds like you're thinking of two other possibilities:
1) "I suppose, we cannot extend RTTI with the debug type name (is
it correct?)" - yeah, that's probably correct, extending the RTTI
format probably isn't desirable and we'd still need a
singular/canonical DWARF name which we don't seem to have (& the
RTTI might go in another object file that may not have debug
info, or debug info generated by a different compiler with a
different type printing format, etc... )
2) Extending DWARF to include the mangled name
Sort of possible: DW_AT_linkage_name on a DW_TAG_class_type could be
used for this just fine - no DWARF extension required.
But an alternative would be to have debuggers use a more
semantically aware matching here. The debugger does have enough
information in the DWARF to semantically match "foo<(X)0>" with
"foo<X::Y>". enum X is in the DWARF, and the enumerator Y is
present with its value 0.
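A rough sketch of what such semantic matching could look like, just to
show the idea. Everything here is hypothetical: EnumTable stands in for
whatever the debugger has read from the enum's DWARF, and
canonicalize_arg, canonicalize_name and names_match are made-up names,
not LLDB or GDB APIs. It only handles a single template argument
written as a cast.

```cpp
#include <cassert>  // assert, used in the usage check described below
#include <cstdlib>  // std::strtol
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for enum information read from DWARF:
// enum name -> list of (enumerator name, value), in declaration order.
using EnumTable =
    std::map<std::string, std::vector<std::pair<std::string, long>>>;

// Rewrites an argument spelled as a cast, e.g. "(X)0", into the first
// enumerator of that enum with the same value, e.g. "X::Y".
// Anything else, or an unknown enum or value, is returned unchanged.
inline std::string canonicalize_arg(const std::string &arg,
                                    const EnumTable &enums) {
    if (arg.size() < 4 || arg[0] != '(') return arg;
    std::size_t close = arg.find(')');
    if (close == std::string::npos || close + 1 >= arg.size()) return arg;

    const std::string enum_name = arg.substr(1, close - 1);
    auto it = enums.find(enum_name);
    if (it == enums.end()) return arg;

    const std::string digits = arg.substr(close + 1);
    char *end = nullptr;
    long value = std::strtol(digits.c_str(), &end, 10);
    if (end == digits.c_str() || *end != '\0') return arg;

    for (const auto &enumerator : it->second)
        if (enumerator.second == value)
            return enum_name + "::" + enumerator.first;
    return arg;
}

// Canonicalizes a name of the form "tmpl<arg>" with exactly one argument.
inline std::string canonicalize_name(const std::string &name,
                                     const EnumTable &enums) {
    std::size_t open = name.find('<');
    if (open == std::string::npos || name.back() != '>') return name;
    std::string arg = name.substr(open + 1, name.size() - open - 2);
    return name.substr(0, open + 1) + canonicalize_arg(arg, enums) + ">";
}

// Two names match if they are equal after canonicalization.
inline bool names_match(const std::string &a, const std::string &b,
                        const EnumTable &enums) {
    return canonicalize_name(a, enums) == canonicalize_name(b, enums);
}
```

With X described as { Y = 0, Z = 0 }, names_match("foo<(X)0>",
"foo<X::Y>", enums) holds. A real implementation would also need to
handle multiple and nested template arguments, negative values, and
names already spelled with a later enumerator such as "X::Z".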
Another case where Clang's DWARF type printing differs from a
common demangling is an unsigned parameter: template<unsigned>
struct foo; foo<0>. The common demangling for this is "foo<0u>", but
Clang will happily render the type as "foo<0>". This one seems
harder to justify changing than the enum case, because it is at
least self-consistent. The enum case, given the
declared-but-not-defined enum example, seems more compelling:
having Clang give a consistent name to the type seems somewhat
desirable, even though it would not be a complete fix (differing
compilers could still use different printings).
Again, in this case, a debugger could handle this.
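One way a debugger could handle it, as a minimal sketch:
normalize both names by dropping integer-literal suffixes before
comparing them. strip_integer_suffixes is a hypothetical helper, not an
existing LLDB or GDB function, and it assumes decimal literals only.

```cpp
#include <cassert>  // assert, used in the usage check described below
#include <cctype>   // std::isdigit, std::isalnum
#include <string>

// Removes u/U/l/L suffixes from decimal integer literals in a type name,
// so that "foo<0u>" and "foo<0>" compare equal. Digits that are part of an
// identifier (e.g. the 3 in "vec3") are left alone.
inline std::string strip_integer_suffixes(const std::string &name) {
    std::string out;
    std::size_t i = 0;
    while (i < name.size()) {
        unsigned char c = static_cast<unsigned char>(name[i]);
        bool prev_is_ident =
            i > 0 && (std::isalnum(static_cast<unsigned char>(name[i - 1])) ||
                      name[i - 1] == '_');
        if (!std::isdigit(c) || prev_is_ident) {
            out += name[i++];
            continue;
        }
        // Copy the digits of the literal.
        while (i < name.size() &&
               std::isdigit(static_cast<unsigned char>(name[i])))
            out += name[i++];
        // Skip any suffix characters that follow the digits.
        while (i < name.size() && (name[i] == 'u' || name[i] == 'U' ||
                                   name[i] == 'l' || name[i] == 'L'))
            ++i;
    }
    return out;
}
```

With this, strip_integer_suffixes("foo<0u>") == "foo<0>", so the
demangled name and Clang's DWARF name compare equal after
normalization. The cost is that two instantiations differing only in
the type of a literal argument would no longer be told apart by name
alone.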
All that said, GDB is the elephant in the room and I imagine
might have no interest in adopting a more complex name
lookup/comparison strategy & we might just have to bow to their
demangling printing and naming scheme... but might be worth
asking GDB folks first? Not sure.
Another advantage of this solution is that it doesn't require
any format extension and will probably work out of the box in
gdb and other debuggers. Moreover, I have just rechecked: gcc
generates exactly the same type names in DWARF for the examples
in the description.
On the other hand, I understand the idea you have described,
but I am not sure how to implement this lookup in another
way. I suppose, we cannot extend RTTI with the debug type
name (is it correct?). Thus, the only way I see is to add
additional information about the mangled type name into
DWARF. It could be either a separate section (like
apple_types) or a special node for
TAG_structure_type/TAG_class_type, which should be indexed
into a map for fast lookup. Anyway, this will be an extension
to DWARF and will require special support in a debugger.
Furthermore, such a solution will be much more complicated
(still, I don't mind working on it).
So what do you think? Is the suggested solution incomplete or
unacceptable? Do you have other ideas for how this feature
should be implemented?
P.S. Should this question be raised on a mailing list? And if
so, on which one (clang or lldb), since it seems related to
both?
https://reviews.llvm.org/D39622