Author: Kyungwoo Lee
Date: 2024-09-15T16:04:42-07:00
New Revision: 713a2029578eb36a29793105948d8e4fe965da18
URL: https://github.com/llvm/llvm-project/commit/713a2029578eb36a29793105948d8e4fe965da18
DIFF: https://github.com/llvm/llvm-project/commit/713a2029578eb36a29793105948d8e4fe965da18.diff

LOG: [CGData] Clang Options (#90304)

This adds new Clang flags to support codegen (CG) data:
- `-fcodegen-data-generate{=path}`: This flag passes `-codegen-data-generate` as a boolean to the LLVM backend, causing the raw CG data to be emitted into a custom section. Currently, for LLD MachO only, it also passes `--codegen-data-generate-path=<path>` so that the indexed CG data file can be automatically produced at link time. For linkers that do not yet support this feature, `llvm-cgdata` can be used manually to merge this CG data in object files.
- `-fcodegen-data-use{=path}`: This flag passes `-codegen-data-use-path=<path>` to the LLVM backend, enabling the use of specified CG data to optimistically outline functions.
- The default `<path>` is set to `default.cgdata` when not specified.

This depends on https://github.com/llvm/llvm-project/pull/108733.
This is a patch for https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.

Added:
    clang/test/Driver/codegen-data.c

Modified:
    clang/docs/UsersManual.rst
    clang/include/clang/Driver/Options.td
    clang/lib/Driver/ToolChains/CommonArgs.cpp
    clang/lib/Driver/ToolChains/Darwin.cpp

Removed:


################################################################################
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index f27fa4ace917ea..57d78f867bab6e 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -2410,6 +2410,39 @@ are listed below.
   link-time optimizations like whole program inter-procedural basic block
   reordering.
 
+.. option:: -fcodegen-data-generate[=<path>]
+
+  Emit the raw codegen (CG) data into custom sections in the object file.
+  Currently, this option also combines the raw CG data from the object files
+  into an indexed CG data file specified by the <path>, for LLD MachO only.
+  When the <path> is not specified, `default.cgdata` is created.
+  The CG data file combines all the outlining instances that occurred locally
+  in each object file.
+
+  .. code-block:: console
+
+    $ clang -fuse-ld=lld -Oz -fcodegen-data-generate code.cc
+
+  For linkers that do not yet support this feature, `llvm-cgdata` can be used
+  manually to merge this CG data in object files.
+
+  .. code-block:: console
+
+    $ clang -c -fuse-ld=lld -Oz -fcodegen-data-generate code.cc
+    $ llvm-cgdata --merge -o default.cgdata code.o
+
+.. option:: -fcodegen-data-use[=<path>]
+
+  Read the codegen data from the specified path to more effectively outline
+  functions across compilation units. When the <path> is not specified,
+  `default.cgdata` is used. This option can create many identically outlined
+  functions that can be optimized by the conventional linker’s identical code
+  folding (ICF).
+
+  .. code-block:: console
+
+    $ clang -fuse-ld=lld -Oz -Wl,--icf=safe -fcodegen-data-use code.cc
+
 Profile Guided Optimization
 ---------------------------
 
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index dc8bfc69e9889b..7f123335ce8cfa 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1894,6 +1894,18 @@ def fprofile_selected_function_group :
   Visibility<[ClangOption, CC1Option]>, MetaVarName<"<i>">,
   HelpText<"Partition functions into N groups using -fprofile-function-groups and select only functions in group i to be instrumented. The valid range is 0 to N-1 inclusive">,
   MarshallingInfoInt<CodeGenOpts<"ProfileSelectedFunctionGroup">>;
+def fcodegen_data_generate_EQ : Joined<["-"], "fcodegen-data-generate=">,
+    Group<f_Group>, Visibility<[ClangOption, CLOption]>, MetaVarName<"<path>">,
+    HelpText<"Emit codegen data into the object file. LLD for MachO (currently) merges them into the specified <path>.">;
+def fcodegen_data_generate : Flag<["-"], "fcodegen-data-generate">,
+    Group<f_Group>, Visibility<[ClangOption, CLOption]>, Alias<fcodegen_data_generate_EQ>, AliasArgs<["default.cgdata"]>,
+    HelpText<"Emit codegen data into the object file. LLD for MachO (currently) merges them into default.cgdata.">;
+def fcodegen_data_use_EQ : Joined<["-"], "fcodegen-data-use=">,
+    Group<f_Group>, Visibility<[ClangOption, CLOption]>, MetaVarName<"<path>">,
+    HelpText<"Use codegen data read from the specified <path>.">;
+def fcodegen_data_use : Flag<["-"], "fcodegen-data-use">,
+    Group<f_Group>, Visibility<[ClangOption, CLOption]>, Alias<fcodegen_data_use_EQ>, AliasArgs<["default.cgdata"]>,
+    HelpText<"Use codegen data read from default.cgdata to optimize the binary">;
 def fswift_async_fp_EQ : Joined<["-"], "fswift-async-fp=">,
     Group<f_Group>,
     Visibility<[ClangOption, CC1Option, CC1AsOption, CLOption]>,
diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index f58b816a9709dd..502aba2ce4aa9c 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -2753,6 +2753,25 @@ void tools::addMachineOutlinerArgs(const Driver &D,
       addArg(Twine("-enable-machine-outliner=never"));
     }
   }
+
+  auto *CodeGenDataGenArg =
+      Args.getLastArg(options::OPT_fcodegen_data_generate_EQ);
+  auto *CodeGenDataUseArg = Args.getLastArg(options::OPT_fcodegen_data_use_EQ);
+
+  // We only allow one of them to be specified.
+  if (CodeGenDataGenArg && CodeGenDataUseArg)
+    D.Diag(diag::err_drv_argument_not_allowed_with)
+        << CodeGenDataGenArg->getAsString(Args)
+        << CodeGenDataUseArg->getAsString(Args);
+
+  // For codegen data gen, the output file is passed to the linker
+  // while a boolean flag is passed to the LLVM backend.
+  if (CodeGenDataGenArg)
+    addArg(Twine("-codegen-data-generate"));
+
+  // For codegen data use, the input file is passed to the LLVM backend.
+  if (CodeGenDataUseArg)
+    addArg(Twine("-codegen-data-use-path=") + CodeGenDataUseArg->getValue());
 }
 
 void tools::addOpenMPDeviceRTL(const Driver &D,
diff --git a/clang/lib/Driver/ToolChains/Darwin.cpp b/clang/lib/Driver/ToolChains/Darwin.cpp
index 5e7f9290e2009d..ebc9ed1aadb0ab 100644
--- a/clang/lib/Driver/ToolChains/Darwin.cpp
+++ b/clang/lib/Driver/ToolChains/Darwin.cpp
@@ -476,6 +476,13 @@ void darwin::Linker::AddLinkArgs(Compilation &C, const ArgList &Args,
       llvm::sys::path::append(Path, "default.profdata");
       CmdArgs.push_back(Args.MakeArgString(Twine("--cs-profile-path=") + Path));
     }
+
+    auto *CodeGenDataGenArg =
+        Args.getLastArg(options::OPT_fcodegen_data_generate_EQ);
+    if (CodeGenDataGenArg)
+      CmdArgs.push_back(
+          Args.MakeArgString(Twine("--codegen-data-generate-path=") +
+                             CodeGenDataGenArg->getValue()));
   }
 }
 
@@ -633,6 +640,32 @@ void darwin::Linker::ConstructJob(Compilation &C, const JobAction &JA,
   CmdArgs.push_back("-mllvm");
   CmdArgs.push_back("-enable-linkonceodr-outlining");
 
+  // Propagate codegen data flags to the linker for the LLVM backend.
+  auto *CodeGenDataGenArg =
+      Args.getLastArg(options::OPT_fcodegen_data_generate_EQ);
+  auto *CodeGenDataUseArg = Args.getLastArg(options::OPT_fcodegen_data_use_EQ);
+
+  // We only allow one of them to be specified.
+  const Driver &D = getToolChain().getDriver();
+  if (CodeGenDataGenArg && CodeGenDataUseArg)
+    D.Diag(diag::err_drv_argument_not_allowed_with)
+        << CodeGenDataGenArg->getAsString(Args)
+        << CodeGenDataUseArg->getAsString(Args);
+
+  // For codegen data gen, the output file is passed to the linker
+  // while a boolean flag is passed to the LLVM backend.
+  if (CodeGenDataGenArg) {
+    CmdArgs.push_back("-mllvm");
+    CmdArgs.push_back("-codegen-data-generate");
+  }
+
+  // For codegen data use, the input file is passed to the LLVM backend.
+  if (CodeGenDataUseArg) {
+    CmdArgs.push_back("-mllvm");
+    CmdArgs.push_back(Args.MakeArgString(Twine("-codegen-data-use-path=") +
+                                         CodeGenDataUseArg->getValue()));
+  }
+
   // Setup statistics file output.
   SmallString<128> StatsFile =
       getStatsFileName(Args, Output, Inputs[0], getToolChain().getDriver());
diff --git a/clang/test/Driver/codegen-data.c b/clang/test/Driver/codegen-data.c
new file mode 100644
index 00000000000000..28638f61d641c5
--- /dev/null
+++ b/clang/test/Driver/codegen-data.c
@@ -0,0 +1,38 @@
+// Verify only one of codegen-data flag is passed.
+// RUN: not %clang -### -S --target=aarch64-linux-gnu -fcodegen-data-generate -fcodegen-data-use %s 2>&1 | FileCheck %s --check-prefix=CONFLICT
+// RUN: not %clang -### -S --target=arm64-apple-darwin -fcodegen-data-generate -fcodegen-data-use %s 2>&1 | FileCheck %s --check-prefix=CONFLICT
+// CONFLICT: error: invalid argument '-fcodegen-data-generate' not allowed with '-fcodegen-data-use'
+
+// Verify the codegen-data-generate (boolean) flag is passed to LLVM
+// RUN: %clang -### -S --target=aarch64-linux-gnu -fcodegen-data-generate %s 2>&1| FileCheck %s --check-prefix=GENERATE
+// RUN: %clang -### -S --target=arm64-apple-darwin -fcodegen-data-generate %s 2>&1| FileCheck %s --check-prefix=GENERATE
+// GENERATE: "-mllvm" "-codegen-data-generate"
+
+// Verify the codegen-data-use-path flag (with a default value) is passed to LLVM.
+// RUN: %clang -### -S --target=aarch64-linux-gnu -fcodegen-data-use %s 2>&1| FileCheck %s --check-prefix=USE
+// RUN: %clang -### -S --target=arm64-apple-darwin -fcodegen-data-use %s 2>&1| FileCheck %s --check-prefix=USE
+// RUN: %clang -### -S --target=aarch64-linux-gnu -fcodegen-data-use=file %s 2>&1 | FileCheck %s --check-prefix=USE-FILE
+// RUN: %clang -### -S --target=arm64-apple-darwin -fcodegen-data-use=file %s 2>&1 | FileCheck %s --check-prefix=USE-FILE
+// USE: "-mllvm" "-codegen-data-use-path=default.cgdata"
+// USE-FILE: "-mllvm" "-codegen-data-use-path=file"
+
+// Verify the codegen-data-generate (boolean) flag with a LTO.
+// RUN: %clang -### -flto --target=aarch64-linux-gnu -fcodegen-data-generate %s 2>&1 | FileCheck %s --check-prefix=GENERATE-LTO
+// GENERATE-LTO: {{ld(.exe)?"}}
+// GENERATE-LTO-SAME: "-plugin-opt=-codegen-data-generate"
+// RUN: %clang -### -flto --target=arm64-apple-darwin -fcodegen-data-generate %s 2>&1 | FileCheck %s --check-prefix=GENERATE-LTO-DARWIN
+// GENERATE-LTO-DARWIN: {{ld(.exe)?"}}
+// GENERATE-LTO-DARWIN-SAME: "-mllvm" "-codegen-data-generate"
+
+// Verify the codegen-data-use-path flag with a LTO is passed to LLVM.
+// RUN: %clang -### -flto=thin --target=aarch64-linux-gnu -fcodegen-data-use %s 2>&1 | FileCheck %s --check-prefix=USE-LTO
+// USE-LTO: {{ld(.exe)?"}}
+// USE-LTO-SAME: "-plugin-opt=-codegen-data-use-path=default.cgdata"
+// RUN: %clang -### -flto=thin --target=arm64-apple-darwin -fcodegen-data-use %s 2>&1 | FileCheck %s --check-prefix=USE-LTO-DARWIN
+// USE-LTO-DARWIN: {{ld(.exe)?"}}
+// USE-LTO-DARWIN-SAME: "-mllvm" "-codegen-data-use-path=default.cgdata"
+
+// For now, LLD MachO supports for generating the codegen data at link time.
+// RUN: %clang -### -fuse-ld=lld -B%S/Inputs/lld --target=arm64-apple-darwin -fcodegen-data-generate %s 2>&1 | FileCheck %s --check-prefix=GENERATE-LLD-DARWIN
+// GENERATE-LLD-DARWIN: {{ld(.exe)?"}}
+// GENERATE-LLD-DARWIN-SAME: "--codegen-data-generate-path=default.cgdata"
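
Note: the flag translation in this patch can be summarized with a small standalone model. The sketch below is not Clang code. `CGDataArgs`, `backendArgs`, and `lldMachOLinkArgs` are hypothetical names introduced only for illustration, and throwing an exception is a simplification (the driver reports `err_drv_argument_not_allowed_with` through `D.Diag` and then continues). It assumes only what the diff shows: the two flags are mutually exclusive, generate becomes a boolean backend flag, use forwards its path, and only the LLD MachO link step receives `--codegen-data-generate-path=`.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the parsed driver options; not a Clang type.
struct CGDataArgs {
  std::optional<std::string> GeneratePath; // -fcodegen-data-generate[=<path>]
  std::optional<std::string> UsePath;      // -fcodegen-data-use[=<path>]
};

// Models the arguments forwarded to the LLVM backend.
// Throws when both flags are given (the driver emits a diagnostic instead).
std::vector<std::string> backendArgs(const CGDataArgs &A) {
  if (A.GeneratePath && A.UsePath)
    throw std::invalid_argument(
        "-fcodegen-data-generate not allowed with -fcodegen-data-use");
  std::vector<std::string> Out;
  // Generate mode: the backend only receives a boolean flag, not the path.
  if (A.GeneratePath)
    Out.push_back("-codegen-data-generate");
  // Use mode: the backend receives the input file path.
  if (A.UsePath)
    Out.push_back("-codegen-data-use-path=" + *A.UsePath);
  return Out;
}

// Models the extra linker argument added for LLD MachO in generate mode.
std::vector<std::string> lldMachOLinkArgs(const CGDataArgs &A) {
  std::vector<std::string> Out;
  if (A.GeneratePath)
    Out.push_back("--codegen-data-generate-path=" + *A.GeneratePath);
  return Out;
}
```

With `GeneratePath` set to `default.cgdata`, the model yields `-codegen-data-generate` for the backend and `--codegen-data-generate-path=default.cgdata` for the linker, which matches the GENERATE and GENERATE-LLD-DARWIN checks in the test above.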