Hi John, On Fri, 2021-09-10 at 10:49 -0500, John Mellor-Crummey via Elfutils- devel wrote: > My previous patch submission seems to have been overlooked as > buildbot issues consumed several days this week. However, discussion > in the mailing list now seems to have moved on beyond my submission > and I would like my patch considered. Here, I echo my previous > submission, except I improved my submission by including the prefix > [PATCH] in the subject and I included the patch in the message body > rather than as an attachment.
Sorry for the buildbot noise, that was indeed a little painful. But a buildbot that randomly fails is not much fun, so it took highest priority to get it back to green. Your submission is really nice, having extensive documentation, all features implemented, a testcase. Well done. There are however some concerns outside your control. It is somewhat disappointing NVIDIA didn't document this themselves, or tried to standardize this. You seems to have a good grasp of what was intended though. We have to be careful not to extend the api in a way that makes a better standard/implementation impossible. And the way we implemented Dwarf_Lines isn't ideal, so this extension shows we should maybe change it to be more efficient/compact. But maybe we can do that after adding the extension, we should however have a plan. > Regarding the DWARF proposal by Cary Coutant for two-level linemaps: > I now believe that NVIDIA’s implementation is consistent with that > proposal although NVIDIA uses a DWARF extended opcode for inlined > calls whereas Cary’s proposal uses DW_LNS_inlined_call (a standard > opcode), NVIDIA’s implementation uses DW_LNE_inlined_call (an > extended opcode). The naming is one of the concerns. It would be better to use a name like DW_LNE_NVIDIA_inlined_call and DW_LNE_NVIDIA_set_function_name to show they are vendor extensions and don't clash with possible future standard opcode names. That it mimics the two-level linemaps proposal is a good thing. But lets make sure that the new accessor functions, dwarf_linecontext and dwarf_linefunctionname, are generic enough that they can hopefully be reused when two-level linemaps or a similar proposal becomes a standard. > A note about the source code for the test case reading the extended > linemap entries using libdw: this code was copied from another test > that used dwarf_next_lines and extended with code that reads the new > context and functionname fields of a line table entry. Thanks for the test case, it makes clear how the new functionality can be used. How was the test binary, testfile_nvidia_linemap, generated? That should probably be documented inside the testcase. I won't be able to review all the code right now, but here are some high-level comments, so you know what I am thinking. On Sep 5, 2021, at 7:07 PM, John Mellor-Crummey <joh...@rice.edu> > > wrote: > > > > As of CUDA 11.2, NVIDIA added extensions to the line map section > > of CUDA binaries to represent inlined functions. These extensions > > include > > > > - two new fields in a line table row to represent inline > > information: context, and functionname, We didn't design the Dwarf_Line_s very well/compact. We already have the op_index, isa and discriminator fields which are almost never used. This adds two more. I wonder if we can split the struct up so that the extra fields are only used when actually used. Maybe this can be done with a flag in Dwarf_Lines_s indicating whether the array contains only the basic line attributes or also some extended values. Of course this makes dwarf_getsrclines even more complex because it then has to expand the line state struct whenever it first sees an extended attribute. But I really like to see if we can use less memory here. If we agree on some way to make that possible we can implement it afterwards. > > - two new DWARF extended opcodes: DW_LNE_inlined_call, > > DW_LNE_set_function_name, Like I said above I think these names should contain the "vendor" name DW_LNE_NVIDIA_... > > - an additional word in the line table header that indicates > > the offset in the .debug_str function where the function > > names for this line table begin, and This is the only part I think is somewhat tricky. I believe you implement it cleverly by checking the header_length. And your implementation should work, but it is also the part that no other tool understands, which means any such binary cannot really be handled by any other tool. And I don't really understand how this works when linking objects with such a offset into .debug_str. Normally the .debug_str section contains strings that can be merged, but that would disrupt the order, so I don't fully understand how the function_name indexes are kept correct. This would have been nicer as a DWARF5 extension, where there is already a .debug_str_offsets section (but also confusing because there is also a special .debug_line_str section where these strings should probably point at). > > - two new functions in the libdw API: dwarf_linecontext and > > dwarf_linefunctionname, which return the new line table fields. I think it would be nice if we could make these functions return a Dwarf_Line * and a const char * to make them future proof in case a standard extension encodes these differently. Cheers, Mark