> Hi.
>
> I'm sending v3 of the patch where I changed:
> - function.cold sections are properly put into .text.unlikely and
>   not into a .text.sorted.XYZ section
>
> I've just finished measurements and I still have the original speed up
> for tramp3d:
> Total runs: 10, before: 13.92, after: 13.82, cmp: 99.219%

Hi,
I have updated binutils to current head on the Firefox testing patch and ran an FDO build with tp first run ordering and call chain clustering.
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=1313e6a4d74ebff702afa7594684beb83c01d95f&newProject=try&newRevision=1c2d53b10b042aaaac15edbe7bd26e2740641840&framework=1

It seems there are no differences in performance. The two binaries can be downloaded at:

w/o patch:
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/MK-7DC3FQcevZC_Nvlnq8Q/runs/0/artifacts/public/build/target.tar.bz2
with call chain clustering:
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/UVh6iNILT-qb8sYM5vxVCQ/runs/0/artifacts/public/build/target.tar.bz2

Since Firefox is quite sensitive to code size, I would expect to be able to measure some benefits here. Any idea what may have gone wrong? I checked that the binaries seem generally sane: out of the 58MB text segment, 34MB is the cold section. It is possible that the system ld is used instead of the provided one, but that would be weird. I will try to find a way to double-check that updating binutils really updated them for GCC.

Honza
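
P.S. One possible way to do the double-check mentioned above is to ask the GCC driver which ld it would invoke and print that linker's version banner. This is only a rough sketch: it assumes a driver named gcc is on PATH, and the helper names first_line and linker_used_by are made up for illustration. It relies on the GCC option -print-prog-name=ld and on ld --version.

```python
import shutil
import subprocess


def first_line(text):
    """Return the first non-empty line of text, stripped, or '' if there is none."""
    for line in text.splitlines():
        if line.strip():
            return line.strip()
    return ""


def linker_used_by(gcc="gcc"):
    """Return (path, version banner) of the ld that the given GCC driver reports.

    Hypothetical helper: 'gcc' is an assumed driver name, adjust as needed.
    """
    out = subprocess.run(
        [gcc, "-print-prog-name=ld"],
        capture_output=True, text=True, check=True,
    ).stdout
    ld = first_line(out)
    # GCC prints a bare "ld" when it would fall back to a PATH lookup,
    # so resolve that case the same way the shell would.
    resolved = shutil.which(ld) or ld
    banner = subprocess.run(
        [resolved, "--version"],
        capture_output=True, text=True, check=True,
    ).stdout
    return resolved, first_line(banner)


if __name__ == "__main__":
    if shutil.which("gcc"):
        path, version = linker_used_by()
        print(path)
        print(version)
```

If the build passes options such as -B or -fuse-ld to the driver, the same options would have to be added to the query for the answer to match what the build really uses.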