On Sun, Jul 27, 2008 at 3:10 PM, Mark Mitchell <[EMAIL PROTECTED]> wrote:
> Daniel Berlin wrote:
>
>>> I agree that, at least in principle, it should be possible to emit the
>>> debug info (whether the format is DWARF, Stabs, etc.) once.
>>
>> No, you can't.
>> You would at least have to emit the variables separate from the types
>> (IE emit debug info twice).
>
> Yes, of course; that's what everyone is talking about, I think.  "Emit" here
> may also mean "cache in memory some place", rather than "write to a file".
> It could mean, for example, fill in the data structures we already use for
> types in dwarf2out.c early, and then throw away the front-end type
> information

Okay, then let us go through the options, and you tell me which you
are suggesting:

If you assume LTO does not have access to the front ends, your options
look something like this:

When you first compile each file:
  Emit type debug info
  Emit LTO
When you LTO them all together:
  Do LTO
  Emit variable debug info

Under this option, "Emit variable info" requires being able to
reference the types.  If you've lowered the types, this is quite
problematic.  So you get to store label names for the already output
type debug info with the variables (so you can still reference the
type you output properly when you read it back in).  This is fairly
fragile, to be honest.

Another downside of this is that you can't eliminate duplicate types
between units, because you don't know which types are really the same
in the debug info.  You have to let the linker do it for you.

Another option is:

When you first compile each file:
  Emit type debug info
  Emit partial variable debug info (IE add pointers to outputted types
  but not to locations)
  Emit LTO
When you LTO them all together:
  Do LTO
  Parse and update variable debug info to have locations
  Emit variable debug info

This requires parsing the debug info (in some format, be it DWARF or
some generic format we've made up) so that you can update the variable
info's location.  As a plus, you can easily update the types where you
need to.  Unlike the first option, because you understand the debug
info, you can now remove all the duplicate types between units without
having to have the linker do it for you.
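
To make the second option concrete, here is a rough sketch of the link-time step.  It is written in Python purely for brevity and is not GCC code: TypeDie, VarDie, UnitDebugInfo and merge_units are names invented for this illustration, and it assumes that structural equality is enough to decide that two units describe the same type.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class TypeDie:
    # Hypothetical stand-in for a type entry emitted at compile time.
    name: str
    byte_size: int


@dataclass
class VarDie:
    # Partial variable entry: it points at a type, but has no location yet.
    name: str
    type_ref: int                   # index into the owning unit's type list
    location: Optional[int] = None  # filled in only after LTO


@dataclass
class UnitDebugInfo:
    types: List[TypeDie]
    variables: List[VarDie]


def merge_units(units: List[UnitDebugInfo],
                final_locations: Dict[str, int]
                ) -> Tuple[List[TypeDie], List[VarDie]]:
    """Read back per-unit info, drop duplicate types, add locations."""
    merged_types: List[TypeDie] = []
    type_index: Dict[TypeDie, int] = {}
    merged_vars: List[VarDie] = []
    for unit in units:
        # Map this unit's local type indices onto the merged type list.
        remap = []
        for t in unit.types:
            if t not in type_index:
                type_index[t] = len(merged_types)
                merged_types.append(t)
            remap.append(type_index[t])
        for v in unit.variables:
            # A variable with no final location (e.g. optimized away)
            # keeps location None.
            merged_vars.append(
                VarDie(v.name, remap[v.type_ref], final_locations.get(v.name)))
    return merged_types, merged_vars


unit_a = UnitDebugInfo([TypeDie("int", 4)], [VarDie("counter", 0)])
unit_b = UnitDebugInfo([TypeDie("char", 1), TypeDie("int", 4)],
                       [VarDie("flag", 0), VarDie("total", 1)])
types, variables = merge_units([unit_a, unit_b],
                               {"counter": 0x1000, "total": 0x1008})
```

Because the merge step understands the entries it read back, the duplicate "int" from the second unit collapses into the first one and the type references are rewritten to match, which is exactly what the label-name approach in the first option cannot do.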

Unless you link in every single frontend to LTO1 (or move a lot to the
middle end), there is no way to do the following:

When you first compile each file:
  Emit LTO
When you LTO them all together:
  Emit type debug info
  Do LTO
  Emit variable debug info

If you don't want to link the frontends, you could also get away with
moving a lot of junk to the middle end (everything from being able to
distinguish between class and struct to namespaces and the context of
lexical blocks), because debug info outputting uses language-specific
nodes all over the place right now.

Unless I've missed something, our least fragile and, IMHO, best option
requires parsing debug info back in.
It is certainly *possible* to get debug info without parsing the debug
info back in.
Then again, I also don't see what the big deal about adding a debug
info parser is.  It's not like they are all that large.

[EMAIL PROTECTED]:/home/dannyb/util/debuginfo]> wc -l bytereader.* bytereader-inl.h dwarf2enums.h dwarf2reader*
    40 bytereader.cc
   110 bytereader.h
   118 bytereader-inl.h
   465 dwarf2enums.h
   797 dwarf2reader.cc
   373 dwarf2reader.h
  1903 total

(This includes both a callback-style reader that simply hands you the
things you tell it to, as well as something that can read back into a
format much like we use during debug info output.)
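
For a feel of what "callback style" means here, the following is a small Python sketch, not the interface of the dwarf2reader files listed above.  The ULEB128 decoding is the variable-length integer encoding DWARF uses; the record layout (a tag followed by one value) and the names read_uleb128 and read_records are assumptions made up for this example.

```python
from typing import Callable, Dict, Tuple


def read_uleb128(data: bytes, offset: int) -> Tuple[int, int]:
    """Decode one unsigned LEB128 value; return (value, next offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if (byte & 0x80) == 0:            # high bit clear marks the last byte
            return result, offset
        shift += 7


def read_records(data: bytes,
                 handlers: Dict[int, Callable[[int], None]]) -> None:
    """Walk a toy stream of (tag, value) pairs, both ULEB128 encoded.

    Only tags the caller registered a handler for are reported; the rest
    are decoded and skipped.
    """
    offset = 0
    while offset < len(data):
        tag, offset = read_uleb128(data, offset)
        value, offset = read_uleb128(data, offset)
        handler = handlers.get(tag)
        if handler is not None:
            handler(value)


# Two records: tag 1 with value 624485, then tag 2 with value 7.
stream = bytes([0x01, 0xE5, 0x8E, 0x26, 0x02, 0x07])
seen = []
read_records(stream, {1: seen.append})  # the caller only asks for tag 1
```

The point is only that the reading side stays small: the caller registers interest in a few kinds of entries and the reader skips everything else.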