labath abandoned this revision. labath marked an inline comment as done. labath added a comment.
Abandoning. I'll create separate patches with the new implementation. ================ Comment at: lib/DebugInfo/DWARF/DWARFDebugLoc.cpp:291-295 + EntryIterator Absolute = + getAbsoluteLocations( + SectionedAddress{BaseAddr, SectionedAddress::UndefSection}, + LookupPooledAddress) + .begin(); ---------------- JDevlieghere wrote: > labath wrote: > > dblaikie wrote: > > > labath wrote: > > > > dblaikie wrote: > > > > > labath wrote: > > > > > > dblaikie wrote: > > > > > > > labath wrote: > > > > > > > > dblaikie wrote: > > > > > > > > > labath wrote: > > > > > > > > > > This parallel iteration is not completely nice, but I think > > > > > > > > > > it's worth being able to reuse the absolute range > > > > > > > > > > computation code. I'm open to ideas for improvement though. > > > > > > > > > Ah, I see - this is what you meant about "In particular it > > > > > > > > > makes it possible to reuse this stuff in the dumping code, > > > > > > > > > which would have been pretty hard with callbacks.". > > > > > > > > > > > > > > > > > > I'm wondering if that might be worth revisiting somewhat. A > > > > > > > > > full iterator abstraction for one user here (well, two once > > > > > > > > > you include lldb - but I assume it's likely going to build > > > > > > > > > its own data structure from the iteration anyway, right? > > > > > > > > > (it's not going to keep the iterator around, do anything > > > > > > > > > interesting like partial iterations, re-iterate/etc - such > > > > > > > > > that a callback would suffice)) > > > > > > > > > > > > > > > > > > I could imagine two callback APIs for this - one that gets > > > > > > > > > entries and locations and one that only gets locations by > > > > > > > > > filtering on the entry version. > > > > > > > > > > > > > > > > > > eg: > > > > > > > > > > > > > > > > > > // for non-verbose output: > > > > > > > > > LL.forEachEntry([&](const Entry &E, Expected<DWARFLocation> > > > > > > > > > L) { > > > > > > > > > if (Verbose && actually dumping debug_loc) > > > > > > > > > print(E) // print any LLE_*, raw parameters, etc > > > > > > > > > if (L) > > > > > > > > > print(*L) // print the resulting address range, section > > > > > > > > > name (if verbose), > > > > > > > > > else > > > > > > > > > print(error stuff) > > > > > > > > > }); > > > > > > > > > > > > > > > > > > One question would be "when/where do we print the DWARF > > > > > > > > > expression" - if there's an error computing the address > > > > > > > > > range, we can still print the expression, so maybe that > > > > > > > > > happens unconditionally at the end of the callback, using the > > > > > > > > > expression in the Entry? (then, arguably, the expression > > > > > > > > > doesn't need to be in the DWARFLocation - and I'd say make > > > > > > > > > the DWARFLocation a sectioned range, exactly the same type as > > > > > > > > > for ranges so that part of the dumping code, etc, can be > > > > > > > > > maximally reused) > > > > > > > > Actually, what lldb currently does is that it does not build > > > > > > > > any data structures at all (except storing the pointer to the > > > > > > > > right place in the debug_loc section. Then, whenever it wants > > > > > > > > to do something to the loclist, it parses it afresh. I don't > > > > > > > > know why it does this exactly, but I assume it has something to > > > > > > > > do with most locations never being used, or being only a couple > > > > > > > > of times, and the actual parsing being fairly fast. What this > > > > > > > > means is that lldb is not really a single "user", but there are > > > > > > > > like four or five places where it iterates through the list, > > > > > > > > depending on what does it actually want to do with it. It also > > > > > > > > does partial iteration where it stops as soon as it find the > > > > > > > > entry it was interested in. > > > > > > > > Now, all of that is possible with a callback (though I am > > > > > > > > generally trying to avoid them), but it does resurface the > > > > > > > > issue of what should be the value of the second argument for > > > > > > > > DW_LLE_base_address entries (the thing which I originally used > > > > > > > > a error type for). > > > > > > > > Maybe this should be actually one callback API, taking two > > > > > > > > callback functions, with one of them being invoked for > > > > > > > > base_address entries, and one for others? However, if we stick > > > > > > > > to the current approaches in both LLE and RLE of making the > > > > > > > > address pool resolution function a parameter (which I'd like to > > > > > > > > keep, as it makes my job in lldb easier), then this would > > > > > > > > actually be three callbacks, which starts to get unwieldy. > > > > > > > > Though one of those callbacks could be removed with the > > > > > > > > "DWARFUnit implementing a AddrOffsetResolver interface" idea, > > > > > > > > which I really like. :) > > > > > > > Ah, thanks for the details on LLDB's location parsing logic. > > > > > > > That's interesting indeed! > > > > > > > > > > > > > > I can appreciate an iterator-based API if that's the sort of > > > > > > > usage we've got, though I expect it doesn't have any interest in > > > > > > > the low-level encoding & just wants the fully processed address > > > > > > > ranges/locations - it doesn't want base_address or end_of_list > > > > > > > entries? & I think the dual-iteration is a fairly awkward API > > > > > > > design, trying to iterate them in lock-step, etc. I'd rather > > > > > > > avoid that if reasonably possible. > > > > > > > > > > > > > > Either having an iterator API that gives only the fully processed > > > > > > > data/semantic view & a completely different API if you want to > > > > > > > access the low level primitives (LLE, etc) (this is how ranges > > > > > > > works - there's an API that gives a collection of ranges & > > > > > > > abstracts over v4/v5/rnglists/etc - though that's partly > > > > > > > motivated by a strong multi-client need for that functionality > > > > > > > for symbolizing, etc - but I think it's a good abstraction/model > > > > > > > anyway (& one of the reasons the inline range list printing > > > > > > > doesn't include encoding information, the API it uses is too high > > > > > > > level to even have access to it)) > > > > > > > > > > > > > > > Now, all of that is possible with a callback (though I am > > > > > > > > generally trying to avoid them), but it does resurface the > > > > > > > > issue of what should be the value of the second argument for > > > > > > > > DW_LLE_base_address entries (the thing which I originally used > > > > > > > > a error type for). > > > > > > > > > > > > > > Sorry, my intent in the above API was for the second argument to > > > > > > > be Optional's "None" state when... oh, I see, I did use Expected > > > > > > > there, rather than Optional, because there are legit error cases. > > > > > > > > > > > > > > I know it's sort of awkward, but I might be inclined to use > > > > > > > Optional<Expected<AddressRange>> there. I realize two layers of > > > > > > > wrapping is a bit weird, but I think it'd be nicer than having an > > > > > > > error state for what, I think, isn't erroneous. > > > > > > > > > > > > > > > Maybe this should be actually one callback API, taking two > > > > > > > > callback functions, with one of them being invoked for > > > > > > > > base_address entries, and one for others? However, if we stick > > > > > > > > to the current approaches in both LLE and RLE of making the > > > > > > > > address pool resolution function a parameter (which I'd like to > > > > > > > > keep, as it makes my job in lldb easier), then this would > > > > > > > > actually be three callbacks, which starts to get unwieldy. > > > > > > > > > > > > > > Don't mind three callbacks too much. > > > > > > > > > > > > > > > Though one of those callbacks could be removed with the > > > > > > > > "DWARFUnit implementing a AddrOffsetResolver interface" idea, > > > > > > > > which I really like. :) > > > > > > > > > > > > > > Sorry, I haven't really looked at where the address resolver > > > > > > > callback is registered and alternative designs being discussed - > > > > > > > but yeah, going off just the one-sentence, it seems reasonable to > > > > > > > have the DWARFUnit own an address resolver/be the thing you > > > > > > > consult when you want to resolve an address (just through a > > > > > > > normal function call in DWARFUnit, perhaps - which might, > > > > > > > internally, use a callback registered when it was constructed). > > > > > > > I know it's sort of awkward, but I might be inclined to use > > > > > > > Optional<Expected<AddressRange>> there. I realize two layers of > > > > > > > wrapping is a bit weird, but I think it'd be nicer than having an > > > > > > > error state for what, I think, isn't erroneous. > > > > > > Actually, my very first attempt at this patch used an > > > > > > `Expected<Optional<Whatever>>`, but then I scrapped it because I > > > > > > didn't think you'd like it. It's not the friendliest of APIs, but I > > > > > > think we can go with that. > > > > > > > > > > > > > Sorry, I haven't really looked at where the address resolver > > > > > > > callback is registered and alternative designs being discussed - > > > > > > > but yeah, going off just the one-sentence, it seems reasonable to > > > > > > > have the DWARFUnit own an address resolver/be the thing you > > > > > > > consult when you want to resolve an address (just through a > > > > > > > normal function call in DWARFUnit, perhaps - which might, > > > > > > > internally, use a callback registered when it was constructed). > > > > > > > > > > > > I think you got that backwards. I don't want the DWARFUnit to be > > > > > > the source of truth for address pool resolutions, as that would > > > > > > make it hard to use from lldb (it's far from ready to start using > > > > > > the llvm version right now). What I wanted was to replace the > > > > > > lambda/function_ref with a single-method interface. Then both > > > > > > DWARFUnits could implement that interface so that passing a > > > > > > DWARFUnit& would "just work" (but you wouldn't be limited to > > > > > > DWARFUnits as anyone could implement that interface, just like > > > > > > anyone can write a lambda). > > > > > As for Expected<Optional<Whatever>> (or Optional<Expected<>>) - yeah, > > > > > I think this is a non-obvious API (both the general problem and this > > > > > specific solution). I think it's probably worth discussing this > > > > > design a bit more to save you time writing/rewriting things a bit. I > > > > > guess there are a few layers of failure here. > > > > > > > > > > There's the possibility that the iteration itself could fail - even > > > > > for debug_loc style lists (if we reached the end of the section > > > > > before encountering a terminating {0,0}). That would suggest a > > > > > fallible iterator idiom: > > > > > http://llvm.org/docs/ProgrammersManual.html#building-fallible-iterators-and-iterator-ranges > > > > > > > > > > But then, yes, when looking at the "processed"/semantic view, that > > > > > could fail too in the case of an invalid address index, etc. > > > > > > > > > > The generic/processed/abstracted-over-ranges-and-rnglists API for > > > > > ranges produces a fully computer vector (& then returns > > > > > Expected<vector> of that range) - is that reasonable? (this does mean > > > > > manifesting a whole location in memory, which may not be needed so I > > > > > could understand avoiding that even without fully implementing & > > > > > demonstrating the vector solution is inadequate). > > > > > > > > > > But I /think/ maybe the we could/should have two APIs - one generic > > > > > API that abstracts over loc/loclists and only provides the fully > > > > > processed view, and another that is type specific for dumping the > > > > > underlying representation (only used in dumping debug_loclists). > > > > If we were computing the final address ranges from scratch (which would > > > > be the best match for the current lldb usage, but which I am not > > > > considering now for fear of changing too many things), then I agree > > > > that we would need the fallible_iterator iterator thingy. But in this > > > > case we are "interpreting" the already parsed ranges, so we can assume > > > > some level of correctness here, and the thing that can fail is only the > > > > computation of a single range, which does not affect our ability to > > > > process the next entry. > > > > This indicates to me that either each entry in the list should be an > > > > Expected<>, or that the invalid entries should be just dropped > > > > (possibly accompanied by some flag which would tell the caller that the > > > > result was not exhaustive). > > > > > > > > This is connected to one of the issues I have with the debug ranges API > > > > -- it tries _really_ hard to return *something* -- if resolving the > > > > indirect base address entry fails, it is perfectly happy to use the > > > > address _index_ as the base address. This makes sense for dumping, > > > > where you want to show something (though it would still be good to > > > > indicate that you're not showing a real address), but it definitely > > > > does not help consumers which then need to make decisions based on the > > > > returned data. > > > > > > > > Anyway, yes, I agree that we need to APIs, and probably callbacks are > > > > the easiest way to achieve that. We could have a "base" callback that > > > > is not particularly nice to use, but provides the full information via > > > > a combination of `UnparsedLL` and `Optional<Expected<ParsedLL>>` > > > > arguments. The dumper could use that to print out everything it needs. > > > > And then we could have a second API, built on top of the first one, > > > > which ignores base address entries and the raw data and returns just a > > > > bunch of `Expected<ParsedLL>`. This could be used by users like lldb, > > > > who just want to see the final data. The `ParsedLL` type would be > > > > independent of the location list type, so that the debug_loc parser > > > > could provide the same kind of API (but implemented on top of something > > > > else, as the `UnparsedLL` types would differ). Also, under the hood, > > > > the location list dumper for debug_loclists (but not debug_loc) could > > > > reuse some implementation details with the debug_rnglists dumper via a > > > > suitable combination of templates and callbacks. > > > > > > > > How does that sound? > > > What sort of things are you concerned about with deeper API changes here? > > > I think it's probably worth building the "right" thing now - as good a > > > time as any. LLVM's debug info APIs, as you've pointed out, aren't > > > exactly "sturdy" (treating address indexes as offsets, etc, etc), so no > > > time like the present to clean it up. > > > > > > I think if we had an abstraction over v4 and v5 location descriptions, > > > parsing from scratch, fallible iterators, etc - that'd be the ideal thing > > > to use in the inline dumping code (that dumps inside debug_info) - which > > > currently uses "parseOneLocationList" - so it is parsing from scratch and > > > dumping. > > > > > > But equally I understand not wanting to make you/me/anyone fix everything > > > when just trying to get something actually done. > > I think I am mainly concerned with the scope explosion of doing changes > > like that. I don't know how well founded is that concern, because I don't > > know all the use cases for this information right now (but that's a part of > > the problem). But anyway, let's continue the discussion. > > > > The main problem I see with "inline" dumping using some higher-level > > abstraction is what should one do in case of incomplete data (e.g. dwo > > files). If this abstraction returns "cooked" data, then the inline dump > > could not show anything for dwo files dumped standalone. This is perfectly > > fine for lldb and maybe other tools (which never look at dwo files > > separately, and mostly don't care about the reason why the "cooking" failed > > -- if they just need to get the data it can rely on), but for something > > like llvm-dwarfdump, we'd probably want to display _something_. I guess > > this is why the ranges api returns indexes as offsets, but I think we agree > > we don't want that. > > > > Then the question is what should that "something" be? If it's going to be a > > raw dump of the location list entry, then solution would not be fully > > generic, as they raw entry type will depend on the location list kind. > > Though, we could still arrange it so that the various location lists can be > > processed uniformly via a template if they e.g. have a dump() function with > > the same signature... > > > > Suppose we go down that path (this is the path I wanted to go down in my > > previous comment, modulo the parse-from-scratch part). The question then is > > what to do with the section-based dumping. Should it use the same > > mechanism? It probably should, because the first callback/iterator will > > provide all data it can possibly want. But then, should it also build the > > parsed representation like it does now? > > If yes, then what for? Should we also build some mechanism to "cook" that > > data too. > > If not, are we ok with having all users reparse from scratch always? (There > > are only two users I found right now, and they look like they'd be fine > > with it.) > > Or should the DWARFDebugLoclists object cache the cooked data instead? > > llvm-dwarfdump --statistics would actually prefer that, but that might make > > the DWARF verifier sad (though I guess it could always reparse from scratch > > if needed). > I'm very late to the discussion and I'm not as familiar as both of you with > the details and the API uses, so please ignore my suggestion if it doesn't > make any sense... > Could we have an API where we parse a *list* of Uncooked (to reuse Pavel's > nomenclature) and then have the ability to resolve each Uncooked entry into a > Cooked entry? Then both LLDB and dwarfdump could get the list of Uncooked > entries and try to get the cooked variant. If that works, great, we dump/use > that, if not we move on and have separate failure modes for errors > originating from parsing and from cooking. We've discussed this with David last week, and we have hopefully agreed on the rough direction forward (and I think it's going to roughly correspond to what you had in mind). I'm preparing a patch (first of many) to implement that and I'm hoping I'll be able to upload something today. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68270/new/ https://reviews.llvm.org/D68270 _______________________________________________ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits