[Lldb-commits] [PATCH] D73206: Pass `CompileUnit *` along `DWARFDIE` for DWZ

Jan Kratochvil via Phabricator via lldb-commits Sun, 09 Feb 2020 14:54:54 -0800

jankratochvil added a comment.

In D73206#1859944 <https://reviews.llvm.org/D73206#1859944>, @labath wrote:


> From a high-level, this approach looks like it could be viable. However, you 
> have taken this a lot further than I would have expected. What I was 
> expecting to see would be that a bunch of the DWARF internal functions would 
> get an extra argument (be it a CompileUnit, DWARFUnit, or whatever), but I 
> did not expect that this would also appear on the "generic" interfaces. My 
> idea was that the external interface would keep getting just a single 
> user_id_t, but then during the process of decoding that integer into a 
> DWARFDie, we would also produce the additional CompileUnit object.


In such case there need to be two kinds of `user_id_t`. One internal to DWARF/ 
not containing MainCU and one for "generic" interfaces which does contain 
MainCU. Because internally if `user_id_t` contains MainCU then already also 
`DWARFDIE` and `DIERef` must contain MainCU - see for example 
`ElaboratingDIEIterator` calling `DWARFDIE::GetID` where is no chance to know 
the MainCU (without heavy refactorization to pass it down the call chain) - and 
the code does not really need to know MainCU, it is just using `user_id_t` as a 
unique DIE identifier.

> Now, that doesn't mean that adding a CompileUnit argument to these functions 
> too is a bad idea. Quite the opposite -- _if it actually works_, it would be 
> great because it would give us a lot more freedom in encoding the information 
> into the user_id_t value. However, I have some doubts about that. For 
> example, right now we don't create any CompileUnit objects for types  in 
> DWARF type units, and I'm not sure how much we want to change that because of 
> the sheer number of type units.  That means we'd still have to have a way to 
> encode a precise type location in a user_id_t without a CompileUnit, though 
> maybe we would be ablle to assume that a CompileUnit is always available in 
> the DWZ case (?).

Yes, if there is no `CompileUnit` available then DWZ is not used for that type 
and the `DWARFUnit` itself can be used. `CompileUnit *` can remain `nullptr` 
when not known.

The new plan:

- `SymbolFile` and `SB*` API should say unchanged with MainCUs encoded into 
`user_id_t` they use
- types like `Function` do contain `CompileUnit *` but still it may be easier 
for the APIs to encode MainCU into their `user_id_t`.
- indexes contain now `DIERef`. They should contain rather a new 
MainCU-extended variant of `DIERef`.
- `DWARFASTParser` (+`DWARFASTParserClang`) API - there must be an additional 
`CompileUnit *` parameter as it uses `DWARFDIE`. (`DWARFDIE` cannot fit MainCU 
into its 16 bytes.)
- `DWARFMappedHash` also must have new MainCU field I think.
- `NameToDIE` should be using the new MainCU-enhanced `DIERef`.

Given that `user_id_t` and `DIERef` will exist both with and without MainCU 
(depending on which code is using them) it would be nice to make a new 
MainCU-extended inherited type for them so that their mistaken use is 
compile-time checked. But `user_id_t` is just an integer typedef so it would 
need to be class-wrapped.  For `user_id_t` one needs to also make a new 
MainCU-extended type of `UserID`.

> This duality does make me think that we'd be better off in not adding the 
> CompileUnit arguments and encoding everything into a user_id_t. However, I am 
> not sure if we really have a choice here -- it may be that this information 
> is so complex that it will not be possible to reasonably encode it into 64 
> bits.

That my DWZ branch (`git clone -b dwz git://git.jankratochvil.net/lldb`) 
already encodes MainCU to all of `DWARFDIE` (becoming 24B instead of 16B), 
`DIERef` (staying 8B) and `user_id_t` (staying 8B, reducing DWO unit number 
from 31 bits to 29 bits). One can encode MainCU into the DWO field as both can 
never exist together.

> PS. It would be easier to evaluate this if this was based on master instead 
> of some patch which is taking the source in the direction we don't want to go 
> it.

I need to verify the new refactorization satisfies all the requirements of my 
existing functional non-regressing DWZ implementation I have. I would then have 
to create a second proof of concept patch being to post here based on `master`. 
Which I hope is not required for your review before its real implementation. I 
understand the final refactorization patch will be based on `master` and DWZ 
will be rebased on top of it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73206/new/

https://reviews.llvm.org/D73206



_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

[Lldb-commits] [PATCH] D73206: Pass `CompileUnit *` along `DWARFDIE` for DWZ

Reply via email to