llvmbot wrote:
@llvm/pr-subscribers-clang

Author: Tim Creech (tcreech-intel)

<details>
<summary>Changes</summary>

This change:
- Updates the existing Clang User's Manual section on SPGO so that it describes how to use llvm-profgen to perform SPGO on Windows. This is new functionality implemented in #83972.
- Fixes a minor typo in the existing llvm-profgen invocation example.
- Adds an LLVM release note on this new functionality in llvm-profgen.

---
Full diff: https://github.com/llvm/llvm-project/pull/84864.diff

2 Files Affected:

- (modified) clang/docs/UsersManual.rst (+47-13)
- (modified) llvm/docs/ReleaseNotes.rst (+5)

``````````diff
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 7391e4cf3a9aeb..b83e6deac75e3f 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -2410,20 +2410,39 @@ usual build cycle when using sample profilers for optimization:
 
 1. Build the code with source line table information. You can use all the
    usual build flags that you always build your application with. The only
-   requirement is that you add ``-gline-tables-only`` or ``-g`` to the
-   command line. This is important for the profiler to be able to map
-   instructions back to source line locations.
+   requirement is that DWARF debug info including source line information is
+   generated. This DWARF information is important for the profiler to be able
+   to map instructions back to source line locations.
+
+   On Linux, ``-g`` or just ``-gline-tables-only`` is sufficient:
 
    .. code-block:: console
 
      $ clang++ -O2 -gline-tables-only code.cc -o code
 
+   While MSVC-style targets default to CodeView debug information, DWARF debug
+   information is required to generate source-level LLVM profiles. Use
+   ``-gdwarf`` to include DWARF debug information:
+
+   .. code-block:: console
+
+     $ clang-cl -O2 -gdwarf -gline-tables-only coff-profile.cpp -fuse-ld=lld -link -debug:dwarf
+
 2. Run the executable under a sampling profiler. The specific profiler
    you use does not really matter, as long as its output can be converted
-   into the format that the LLVM optimizer understands. Currently, there
-   exists a conversion tool for the Linux Perf profiler
-   (https://perf.wiki.kernel.org/), so these examples assume that you
-   are using Linux Perf to profile your code.
+   into the format that the LLVM optimizer understands.
+
+   Two such profilers are the the Linux Perf profiler
+   (https://perf.wiki.kernel.org/) and Intel's Sampling Enabling Product (SEP),
+   available as part of `Intel VTune
+   <https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/vtune-profiler.html>`_.
+   While Perf is Linux-specific, SEP can be used on Linux, Windows, and FreeBSD.
+
+   The LLVM tool ``llvm-profgen`` can convert output of either Perf or SEP. An
+   external project, `AutoFDO <https://github.com/google/autofdo>`_, also
+   provides a ``create_llvm_prof`` tool which supports Linux Perf output.
+
+   When using Perf:
 
    .. code-block:: console
 
@@ -2434,11 +2453,19 @@ usual build cycle when using sample profilers for optimization:
    it provides better call information, which improves the accuracy of
    the profile data.
 
-3. Convert the collected profile data to LLVM's sample profile format.
-   This is currently supported via the AutoFDO converter ``create_llvm_prof``.
-   It is available at https://github.com/google/autofdo. Once built and
-   installed, you can convert the ``perf.data`` file to LLVM using
-   the command:
+   When using SEP:
+
+   .. code-block:: console
+
+     $ sep -start -out code.tb7 -ec BR_INST_RETIRED.NEAR_TAKEN:precise=yes:pdir -lbr no_filter:usr -perf-script brstack -app ./code
+
+   This produces a ``code.perf.data.script`` output which can be used with
+   ``llvm-profgen``'s ``--perfscript`` input option.
+
+3. Convert the collected profile data to LLVM's sample profile format. This is
+   currently supported via the `AutoFDO <https://github.com/google/autofdo>`_
+   converter ``create_llvm_prof``. Once built and installed, you can convert
+   the ``perf.data`` file to LLVM using the command:
 
    .. code-block:: console
 
@@ -2454,7 +2481,14 @@ usual build cycle when using sample profilers for optimization:
 
    .. code-block:: console
 
-     $ llvm-profgen --binary=./code --output=code.prof--perfdata=perf.data
+     $ llvm-profgen --binary=./code --output=code.prof --perfdata=perf.data
+
+   When using SEP the output is in the textual format corresponding to
+   ``llvm-profgen --perfscript``. For example:
+
+   .. code-block:: console
+
+     $ llvm-profgen --binary=./code --output=code.prof --perfscript=perf.data.script
 
 
 4. Build the code again using the collected profile. This step feeds
diff --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index b34a5f31c5eb0a..c2bbc647bc18e6 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -157,6 +157,11 @@ Changes to the LLVM tools
   ``--set-symbols-visibility`` options for ELF input to change the
   visibility of symbols.
 
+* llvm-profgen now supports COFF+DWARF binaries. This enables Sample-based PGO
+  on Windows using Intel VTune's SEP. For details on usage, see the `end-user
+  documentation for SPGO
+  <https://clang.llvm.org/docs/UsersManual.html#using-sampling-profilers>`_.
+
 Changes to LLDB
 ---------------------------------
 
``````````

</details>

https://github.com/llvm/llvm-project/pull/84864