Re: [lldb-dev] RFC: Moving debug info parsing out of process

J.R. Heisey via lldb-dev Fri, 01 Mar 2019 22:44:17 -0800

Hi Guys,

https://wiki.eclipse.org/TCF#Where_can_I_Read_Documentation.3F

I hope you don't mind me chiming in; I've been following this thread. I am a little familiar with the Eclipse Target Communications Framework (TCF).

------------------------------------------------------------------------

From https://www.eclipse.org/tcf/

TCF is a vendor-neutral lightweight, extensible network protocol for driving embedded systems (targets).

On top of the protocol, TCF provides a complete modern debugger for C/C++ and Ada, as well as the "Target Explorer" for system management. TCF works out of the box for Intel, PPC and ARM Linux targets including the Raspberry Pi. It supports Proxying and Tunneling for IoT devices, and is particularly strong for multi-process debugging even with slow communication links.


------------------------------------------------------------------------

Wind River was one of the original developers. The TCF specification defines a set of services, one of which is a symbols service. The protocol was designed to support asynchronous communication and has been around for a while. Eclipse contains a Java client plug-in implementation, and there is an example 'TCF Agent', which is a server implementation in C. For more details, you can read up here.


https://wiki.eclipse.org/TCF#Where_can_I_Read_Documentation.3F

I notice that the LLDB project page http://lldb.llvm.org/projects.html lists item 3, "Make a high speed asynchronous communication channel to replace the gdb-remote protocol".


The full TCF specification can also replace the MI.

Thanks,

J.R.

On 3/1/2019 13:43, Zachary Turner via lldb-dev wrote:

On Wed, Feb 27, 2019 at 4:35 PM Frédéric Riss <[email protected]<mailto:[email protected]>> wrote:

    On Feb 27, 2019, at 3:14 PM, Zachary Turner <[email protected]
    <mailto:[email protected]>> wrote:



    On Wed, Feb 27, 2019 at 2:52 PM Frédéric Riss <[email protected]
    <mailto:[email protected]>> wrote:

        On Feb 27, 2019, at 10:12 AM, Zachary Turner
        <[email protected] <mailto:[email protected]>> wrote:

        For what it's worth, in an earlier message I mentioned that
        I would probably build the server by using mostly code from
        LLVM, and making sure that it supported the union of things
        currently supported by LLDB and LLVM's DWARF parsers.  Doing
        that would naturally require merging the two (which has been
        talked about for a long time) as a pre-requisite, and I
        would expect that for testing purposes we might want
        something like llvm-dwarfdump but that dumps a higher level
        description of the information (if we change our DWARF
        emission code in LLVM for example, to output the exact same
        type in slightly different ways in the underlying DWARF, we
        wouldn't want our test to break, for example).  So for
        example imagine you could run something like `lldb-dwarfdump
        -lookup-type=foo a.out` and it would dump some description
        of the type that is resilient to insignificant changes in
        the underlying DWARF.


        At which level do you consider the “DWARF parser” to stop and
        the debugger policy to start? In my view, the DWARF parser
        stops at the DwarfDIE boundary. Replacing it wouldn’t get us
        closer to a higher-level abstraction.

    At the level where you have an alternative representation that
    you no longer have to access the debug info.  In LLDB today,
    this "representation" is a combination of LLDB's own internal
    symbol hierarchy (e.g. lldb_private::Type,
    lldb_private::Function, etc) and the Clang AST.  Once you have
    constructed those 2 things, the DWARF parser is out of the picture.

    A lot of the complexity in processing raw DWARF comes from
    handling different versions of the DWARF spec (e.g. supporting
    DWARF 4 & DWARF 5), collecting and interpreting the subset of
    attributes which happen to be present, following references to
    other parts of the DWARF, and then at the end of all this (or
    perhaps during all of this), dealing with "partial information"
    (e.g. something that would have saved me a lot of trouble was
    missing, now I have to do extra work to find it).

    I'm treating DWARF expressions as an exception though, because it
    would be somewhat tedious and not provide much value to convert
    those into some text format and then evaluate the text
    representation of the expression since it's already in a format
    suitable for processing.  So for this case, you could just encode
    the byte sequence into a hex string and send that.
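As a rough illustration of the hex-string idea above, the sketch below round-trips a DWARF location expression through a hex encoding. The function names are hypothetical and are not part of any LLDB or LLVM API; the only assumption about the wire format is that it carries the bytes unchanged.

```python
# Sketch only: encode_expr/decode_expr are hypothetical names, not an LLDB or LLVM API.

def encode_expr(expr: bytes) -> str:
    """Encode raw DWARF expression bytes as a lowercase hex string."""
    return expr.hex()


def decode_expr(text: str) -> bytes:
    """Recover the original bytes; raises ValueError on malformed hex."""
    return bytes.fromhex(text)


# DW_OP_fbreg (0x91) followed by the SLEB128 encoding of -8 (0x78).
location = bytes([0x91, 0x78])
wire = encode_expr(location)

# The receiving side gets back exactly the bytes that were sent,
# so it can hand them to an existing expression evaluator unchanged.
assert decode_expr(wire) == location
```

The point of the design is that the expression is never reinterpreted in transit, so no separate text grammar for DWARF expressions is needed.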

    I hinted at this already, but part of the problem (at least in my
    mind) is that our "DWARF parser" is intermingled with the code
    that *interprets the parsed DWARF*.  We parse a little bit, build
    something, parse a little bit more, add on to the thing we're
    building, etc.  This design is fragile and makes error handling
    difficult, so part of what I'm proposing is a separation here,
    where "parse as much as possible, and return an intermediate
    representation that is as finished as we are able to make it".

    This part is independent of whether DWARF parsing is out of
    process however.  That's still useful even if DWARF parsing is in
    process, and we've talked about something like that for a long
    time, whereby we have some kind of API that says "give me the
    thing, handle all errors internally, and either return me a thing
    which I can trust or an error".  I'm viewing "thing which I can
    trust" as some representation which is separate from the original
    DWARF, and which we could test -- for example -- by writing a
    tool which dumps this representation.
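One way to picture the "trusted thing or an error" boundary described above is a lookup that validates everything internally and never hands back a half-built result. This is only a sketch: TypeInfo, ParseError, lookup_type, and the dictionary standing in for parsed debug info are all hypothetical and do not correspond to existing LLDB classes.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class TypeInfo:
    """Hypothetical trusted result: every field has already been validated."""
    name: str
    byte_size: int


@dataclass(frozen=True)
class ParseError:
    """Hypothetical error result returned instead of a partial TypeInfo."""
    message: str


def lookup_type(raw: Dict[str, Dict[str, object]], name: str) -> Union[TypeInfo, ParseError]:
    """Return a fully validated TypeInfo or a ParseError, never a partial result.

    `raw` is a stand-in for whatever the low-level parser produced.
    """
    entry = raw.get(name)
    if entry is None:
        return ParseError(f"no type named {name!r}")
    size = entry.get("byte_size")
    if not isinstance(size, int) or size < 0:
        return ParseError(f"type {name!r} has a missing or invalid byte_size")
    return TypeInfo(name=name, byte_size=size)


# Stand-in for parsed debug info: one good entry, one with a missing attribute.
raw_info = {"foo": {"byte_size": 4}, "bad": {}}
good = lookup_type(raw_info, "foo")
broken = lookup_type(raw_info, "bad")
```

A dumping tool of the kind mentioned above could print the TypeInfo values directly, which would keep tests independent of how the underlying DWARF happens to be laid out.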


    Ok, here we are talking about something different (which you might
    have been expressing since the beginning and I misinterpreted). If
    you want to decouple dealing with DIEs from creating ASTs as a
    preliminary, then I think this would be super valuable and it
    addresses my concerns about duplicating the AST creation logic.

    I’m sure Greg would have comments about the challenges of lazily
    parsing the DWARF in such a design.

Well, I was originally talking about both lumped into one thing. Because this is a necessary precursor to having it be out of process :)

Since we definitely agree on this portion, the question then becomes: Suppose we have this firm API boundary across which we either return errors or things that can be trusted. What are the things which can be trusted? Are they DIEs? I'm not sure they should be, because we'd have to synthesize DIEs on the fly in the case where we got something that was bad but we tried to "fix" it (in order to sanitize the debug info into something the caller can make basic assumptions about). And additionally, it doesn't really make the client's job much easier as far as parsing goes.

So, I think it should build up a little bit higher representation of the debug info, perhaps by piecing together information from multiple DIEs and sources, and return that. Definitely laziness will have to be maintained, but I don't think that's inherently more difficult with a design where we return something higher level than DIEs.
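To make the laziness point concrete, here is a minimal sketch of a higher-level record whose expensive part is only computed on first access. LazyStruct and its loader callback are hypothetical; in a real design the loader would be whatever walks the relevant DIEs.

```python
from functools import cached_property
from typing import Callable, List


class LazyStruct:
    """Hypothetical higher-level record: the name is cheap, members load on demand."""

    def __init__(self, name: str, load_members: Callable[[], List[str]]):
        self.name = name
        self._load_members = load_members

    @cached_property
    def members(self) -> List[str]:
        # Runs the loader once; later accesses reuse the cached list.
        return self._load_members()


load_calls = []


def load_point_members() -> List[str]:
    # Stand-in for the work of visiting several DIEs to assemble the members.
    load_calls.append("Point")
    return ["x", "y"]


point = LazyStruct("Point", load_point_members)
calls_before_access = len(load_calls)  # nothing has been loaded yet
first = point.members
second = point.members  # served from the cache, loader is not called again
```

The caller only ever sees the record, so whether the members came from one DIE or from several sources stays an implementation detail behind the API boundary.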


Thoughts?

_______________________________________________
lldb-dev mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
