https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121038
            Bug ID: 121038
           Summary: autoprofiledbootstrap is broken in few ways
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: bootstrap
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

Let's track the problems here.  Currently:

1) autoprofiledbootstrap fails for me on a 256-core machine because perf runs
out of memory.  The workaround is:

diff --git a/Makefile.tpl b/Makefile.tpl
index 21cd95e1bc8..cbf80544cc1 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -416,7 +416,7 @@ MAKEINFO = @MAKEINFO@
 EXPECT = @EXPECT@
 RUNTEST = @RUNTEST@
 
-AUTO_PROFILE = gcc-auto-profile --all -c 10000000 -o perf.data
+AUTO_PROFILE = gcc-auto-profile --all -c 10000000 -o perf.data -m10
 
 # This just becomes part of the MAKEINFO definition passed down to
 # sub-makes.  It lets flags be given on the command line while still

2) If --with-build-config=bootstrap-lto is used, the middle-end is not trained
at all, since we do not produce an auto-profile of lto1.  A fix is posted at
https://gcc.gnu.org/pipermail/gcc-patches/2025-July/689329.html

3) autoprofiledbootstrap seems to build stage1 with the host compiler, then
build stage2 under perf, and use the resulting profile for stage3.  This means
the profile mixes the profile of an unoptimized binary built by the host
compiler (the stage1 compiler) with the profile of the stage2 compiler, which
is used to build the runtime afterwards.  This is wrong; we should always train
on the compiler built by the stage1 compiler, not on the stage1 compiler
itself.  The workaround is to always build with a previously built compiler and
use -O2 for the stage1 flags.

4) There is no debug compare for autoprofiledbootstrap and profiledbootstrap.

5) I think passing --all -c 10000000 to gcc-auto-profile is kind of wrong.  I
do not see why we need to profile the kernel (--all), and I also think the
count should be a prime number.  It also seems to be a relatively low count for
the size of the train run.
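
If the suggestion in item 5 is followed, a prime sampling count close to the
current value has to be picked.  The report does not say why a prime is
preferred; a commonly given reason is that a prime period is less likely to
sample in lockstep with loops in the profiled program, and that reasoning is an
assumption here.  Below is a minimal Python sketch of one way to find such a
value.  The helper names is_prime and next_prime are hypothetical, and the
printed command line only reuses the -c and -o options already present in
Makefile.tpl (dropping --all, as item 5 proposes); it is not a tested
replacement for AUTO_PROFILE.

```python
def is_prime(n: int) -> bool:
    """Return True if n is prime, using trial division up to sqrt(n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def next_prime(n: int) -> int:
    """Return the smallest prime that is greater than or equal to n."""
    candidate = max(n, 2)
    while not is_prime(candidate):
        candidate += 1
    return candidate


if __name__ == "__main__":
    # 10000000 is the count currently passed via -c in AUTO_PROFILE.
    period = next_prime(10_000_000)
    # Hypothetical command line: same -c/-o options as the Makefile,
    # without --all, as item 5 suggests.
    print(f"gcc-auto-profile -c {period} -o perf.data")
```

Trial division is enough here because only a handful of candidates around 10^7
need to be checked.  Whether the count should instead be raised, given the
remark that it is low for the size of the train run, is a separate question
that this sketch does not address.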