On 15/10/25 17:45, Jan Hubicka wrote:
External email: Use caution opening links or attachments
Hi Honza,
I think in this case we need to be sure that instance for a.c:foo is
saved with filename_index == 0 (or -1, since you seem to use it for
<unknown>) while internal instances such as b.c:foo is saved with
filename_index != 0.
For example indirect call in c.c may call both a.c:foo and b.c:foo so we
can not really distinguish them by notion of current file.
The assumption that I have made in this patch is that all symbols, regardless
of being internal or external, have filenames associated with them in the
DWARF info and in the auto-profile pass. This allows me to compare them
without having to worry about them being internal or external.
In that case both a.c:foo and b.c:foo will have unique filename indices
associated with them. This will also mean that c.c:foo will have a unique
filename index as well. As long as the compiler marks DECL_SOURCE_FILE with
the correct filename to look up for each call, this should work in theory.
I'm not sure how well this holds up in practice given that we have been
seeing cases where symbols have filenames completely missing and instances
are not getting marked correctly, so I think we do need to take a look at
this assumption again.
Thinking of it, if we make an assumption that single
<symbol_name>:<source_file> pair is either always public or always
internal in the whole program, then we can do merging correctly, since
we always only care about symbols defined in current translation unit so
we know if it is public or private.
However in that case offline pass can simply take away the filenames.
It already handles renaming for suffixes and dwarf names and merges
duplicate instances. While doing so it can simply take away file info
for public symbols (where symbol_name is unique) wihtout any need to
introduce extra maps. The symbol name multimap we already have should be
enough. But still I think autofdo tool should simply look for
DW_AT_external and not straeam out.
Can you, please, separate the streaming code form the patch and send it
as a separate patch? (so function instances are read with both symbol
names and file names, but file names remains unused). I can then quite
easily implement the renaming part on the top of it if needed.
Concering the missing file names, I think I fixed that bug in autofdo at
my side. It was incorrectly reading dwarf5 abbreviations.
Pairs <symbol_name>:<source_file> are not necesarily unique either,
while <symbol_name>:<translation_unit_name> are.
Jakub, I wonder if we can easily look up the translation unit name and
arrange it to be stable across different build environments?
Honza
Hi Honza,
I have split up the patch into a patch series and sent it at
https://gcc.gnu.org/pipermail/gcc-patches/2025-October/697767.html.
I have refactored the code to try and use symbol_name/filename(/file_name)
more uniformly across the entire pass. This does make the diff quite a bit
larger but I hope the code is easier to read now. I have also switched to
a new class function_instance::descriptor to be the key of the map in
source_profile instead of the two-level map that was there before.
I have also updated the AutoFDO tools patch to write out the entire file path
to the GCOV file (https://github.com/google/autofdo/pull/244), which is then
normalized based on param_profile_func_internal_id.
--
Regards,
Dhruv