On 1/21/21 2:46 PM, Jan Hubicka wrote:
I think easy way to get users of this option is to make profile not
reproducible by default and modify packages to use right reproducibility
option when reproducible builds are intended.  It is not feature that
comes for free and I think most users of PGO does not care, so I think
it should be opt in.

I agree that most users really don't care.


In general getting profile reroducible one needs to make train
reproducible that is hard when you look at details (such as/tmp/  file
name generation issue in gcc) and may lead to need for user to annotate
such code.

Yes, right now I'm testing both patches and I still see difference in GCC PGO
bootstrap (with -fprofile-reproducible=parallel-runs):

$ objfolderdiff.py /dev/shm/objdir2/gcc /dev/shm/objdir3/gcc
   138/   649: cgraphunit.o: different
   230/   649: dwarf2out.o: different
   356/   649: ipa-cp.o: different
   357/   649: ipa-devirt.o: different
   360/   649: ipa-icf.o: different
   533/   649: tree-affine.o: different
   574/   649: tree-ssa-loop-im.o: different
   632/   649: var-tracking.o: different

Most of the changes are in known contexts:

;; Function hash_table<hash_map<im_mem_ref*, sm_aux*>::hash_entry, false, 
xcallocator>::find_with_hash 
(_ZN10hash_tableIN8hash_mapIP10im_mem_refP6sm_aux21simple_hashmap_traitsI19default_hash_traitsIS2_ES4_EE10hash_entryELb0E11xcallocatorE14find_with_hashERKS2_j,
 funcdef_no=3431, decl_uid=106116, cgraph_uid=2554, symbol_order=2712)

;; Function hash_table<default_hash_traits<void*>, false, 
xcallocator>::find_empty_slot_for_expand 
(_ZN10hash_tableI19default_hash_traitsIPvELb0E11xcallocatorE26find_empty_slot_for_expandEj, 
funcdef_no=4875, decl_uid=134656, cgraph_uid=3901, symbol_order=4074)

;; Function hash_table_mod2 (_Z15hash_table_mod2jj, funcdef_no=1047, 
decl_uid=32691, cgraph_uid=345, symbol_order=356)

;; Function hash_table<hash_map<tree_node*, name_expansion*>::hash_entry, false, 
xcallocator>::find_empty_slot_for_expand 
(_ZN10hash_tableIN8hash_mapIP9tree_nodeP14name_expansion21simple_hashmap_traitsI19default_hash_traitsIS2_ES4_EE10hash_entryELb0E11xcallocatorE26find_empty_slot_for_expandEj,
 funcdef_no=3639, decl_uid=102478, cgraph_uid=2760, symbol_order=2916)

So likely hashing related functions where we hash some pointers :/ 
Unfortunately, that's enough for final binary
divergence.


This will become more common problem for multithreaded profiles where
one needs to annotate locking and busy waiting loops in them for example
(or the scheduler responsible for executing paralle tasks).

I can see this to be practically achievable but we probably want to
produce some guidelines for doing that and probably teach gcov-tool to
compare profiles and say to which degree they match (i.e. which
functions match for each of levels of reproducibility).

The problem is that profiles are continuous and the errors too, but
optimizaitons looks for certain thresholds, so small errors may lead to
code changes, so I think our current method of looking at relatively few
packages and patching errors when they appear is not very good long term
strategy... Especially if it makes us to drop useful transformations by
default with -fprofile-use and no additional option.

To be honest we have very few packages that use PGO in openSUSE:Factory.

Anyway, are you fine with the suggested?

Thanks for discussion,
Martin

Reply via email to