Re: [PATCH] read inlining info in an NVIDIA extended line map (was: Extension ...)

Mark Wielaard Fri, 10 Sep 2021 10:11:46 -0700

Hi John,

On Fri, 2021-09-10 at 10:49 -0500, John Mellor-Crummey via Elfutils-
devel wrote:
> My previous patch submission seems to have been overlooked as
> buildbot issues consumed several days this week. However, discussion
> in the mailing list now seems to have moved on beyond my submission
> and I would like my patch considered. Here, I echo my previous
> submission, except I improved my submission by including the prefix
> [PATCH] in the subject and I included the patch in the message body
> rather than as an attachment.

Sorry for the buildbot noise, that was indeed a little painful. But a
buildbot that randomly fails is not much fun, so it took highest
priority to get it back to green.

Your submission is really nice, having extensive documentation, all
features implemented, a testcase. Well done.

There are however some concerns outside your control. It is somewhat
disappointing NVIDIA didn't document this themselves, or tried to
standardize this. You seems to have a good grasp of what was intended
though. We have to be careful not to extend the api in a way that makes
a better standard/implementation impossible. And the way we implemented
Dwarf_Lines isn't ideal, so this extension shows we should maybe change
it to be more efficient/compact. But maybe we can do that after adding
the extension, we should however have a plan.

> Regarding the DWARF proposal by Cary Coutant for two-level linemaps:
> I now believe that NVIDIA’s implementation is consistent with that
> proposal although NVIDIA uses a DWARF extended opcode for inlined
> calls whereas Cary’s proposal uses DW_LNS_inlined_call (a standard
> opcode), NVIDIA’s implementation uses DW_LNE_inlined_call (an
> extended opcode).

The naming is one of the concerns. It would be better to use a name
like DW_LNE_NVIDIA_inlined_call and DW_LNE_NVIDIA_set_function_name to
show they are vendor extensions and don't clash with possible future
standard opcode names.

That it mimics the two-level linemaps proposal is a good thing. But
lets make sure that the new accessor functions, dwarf_linecontext and
dwarf_linefunctionname, are generic enough that they can hopefully be
reused when two-level linemaps or a similar proposal becomes a
standard.

> A note about the source code for the test case reading the extended
> linemap entries using libdw: this code was copied from another test
> that used dwarf_next_lines and extended with code that reads the new
> context and functionname fields of a line table entry.

Thanks for the test case, it makes clear how the new functionality can
be used. How was the test binary, testfile_nvidia_linemap, generated?
That should probably be documented inside the testcase.

I won't be able to review all the code right now, but here are some
high-level comments, so you know what I am thinking.

On Sep 5, 2021, at 7:07 PM, John Mellor-Crummey <[email protected]>
> > wrote:
> > 
> > As of CUDA 11.2, NVIDIA added extensions to the line map section
> > of CUDA binaries to represent inlined functions. These extensions
> > include
> > 
> > - two new fields in a line table row to represent inline 
> >   information: context, and functionname,

We didn't design the Dwarf_Line_s very well/compact. We already have
the op_index, isa and discriminator fields which are almost never used.
This adds two more. I wonder if we can split the struct up so that the
extra fields are only used when actually used.

Maybe this can be done with a flag in Dwarf_Lines_s indicating whether
the array contains only the basic line attributes or also some extended
values. Of course this makes dwarf_getsrclines even more complex
because it then has to expand the line state struct whenever it first
sees an extended attribute. But I really like to see if we can use less
memory here. If we agree on some way to make that possible we can
implement it afterwards.

> > - two new DWARF extended opcodes: DW_LNE_inlined_call, 
> >   DW_LNE_set_function_name,

Like I said above I think these names should contain the "vendor" name
DW_LNE_NVIDIA_...

> > - an additional word in the line table header that indicates 
> >   the offset in the .debug_str function where the function 
> >   names for this line table begin, and

This is the only part I think is somewhat tricky. I believe you
implement it cleverly by checking the header_length. And your
implementation should work, but it is also the part that no other tool
understands, which means any such binary cannot really be handled by
any other tool. And I don't really understand how this works when
linking objects with such a offset into .debug_str. Normally the
.debug_str section contains strings that can be merged, but that would
disrupt the order, so I don't fully understand how the function_name
indexes are kept correct. This would have been nicer as a DWARF5
extension, where there is already a .debug_str_offsets section (but
also confusing because there is also a special .debug_line_str section
where these strings should probably point at).

> > - two new functions in the libdw API: dwarf_linecontext and 
> >   dwarf_linefunctionname, which return the new line table fields.

I think it would be nice if we could make these functions return a
Dwarf_Line * and a const char * to make them future proof in case a
standard extension encodes these differently.

Cheers,

Mark

Re: [PATCH] read inlining info in an NVIDIA extended line map (was: Extension ...)

Reply via email to