Hi Aleksei,
On Fri, Nov 17, 2023 at 10:35:40PM +0000, [email protected] wrote:
> When archive is processed in process_archive (libdwfl/offline.c), it
> creates an Elf object for each archive member. Then in
> process_archive_member it calls process_file to create a Dwfl_Module
> through __libdwfl_report_elf.
>
> The ownership of the Elf object is expected to be:
>
> * either transfered to the Dwfl_Module, if __libdwfl_report_elf returns
> not NULL;
>
> * or handled at the end of process_archive_member by calling elf_end.
>
> Moreover, Elf object is expected to be alive, if __libdwfl_report_elf
> returns not NULL, because at the end of process_archive_member it
> advances to the next member through the elf_next call.
>
> The problem happens when __libdwfl_report_elf encounters Elf with the
> same name and content as it seen before. In that case dwfl_report_module
> will reuse existing Dwfl_Module object. This leads to a codepath that
> calls elf_end on the Elf object, while returning not NULL, breaking the
> elf_next call to the next member.
>
> The fix is to destroy m->main.elf instead and put the new Elf object in
> the already existing Dwfl_Module.
>
> * libdwfl/dwfl_report_elf.c (__libdwfl_report_elf): Replace Elf in
> the Dwfl_Module in case of overlapping or duplicate modules to
> prolong its lifetime for subsequent processing.
Thanks for that analysis and proposed solution. The ownership
reasoning makes sense. But I do have one question.
> diff --git a/libdwfl/dwfl_report_elf.c b/libdwfl/dwfl_report_elf.c
> index 581f4079..58b06aea 100644
> --- a/libdwfl/dwfl_report_elf.c
> +++ b/libdwfl/dwfl_report_elf.c
> @@ -276,7 +276,8 @@ __libdwfl_report_elf (Dwfl *dwfl, const char *name, const
> char *file_name,
> }
> else
> {
> - elf_end (elf);
> + elf_end (m->main.elf);
> + m->main.elf = elf;
> if (m->main_bias != bias
> || m->main.vaddr != vaddr || m->main.address_sync != address_sync)
> goto overlap;
If we goto overlap here don't we still have a problem? overlap will
set m->gc = true; and return NULL. So the caller will think they
still owns the elf handle and will probably close it. But then when
the module is GCed in dwfl_report_end it will close the elf handle
again.
Should we instead move the elf_end and reassignment of main.elf to
after this if statement?
Thanks,
Mark