https://github.com/jhuber6 created 
https://github.com/llvm/llvm-project/pull/132037

Summary:
Offloading tends to have a bound architecture that directly correponds
to the `-mcpu` argument for that embedded job. This is currently handled
by the GPU offloading toolchains, but is ignored by the CPU ones. This
is problematic for languages like SYCL or OpenMP which permit
'offloading' to non-GPU targets. This patch handles this by putting
generic handling in the GCC toolchain for those languages.

I would've made this fully generic but it regressed some HIP sanitizer
tests because their use-case is really weird. This also only goes for
the languages that inherit from 'generic_gcc`. I could've made it in the
base class, but I felt like it wasn't necessary as we only support
offloading based off of this toolchain. In the future if we need it we
can move it around.


>From 181315ce22b47a9b75cd60f584dfbcf359659a11 Mon Sep 17 00:00:00 2001
From: Joseph Huber <hube...@outlook.com>
Date: Wed, 19 Mar 2025 09:12:55 -0500
Subject: [PATCH] [Offload] Handle `BoundArchitecture` for non-GPU targets

Summary:
Offloading tends to have a bound architecture that directly correponds
to the `-mcpu` argument for that embedded job. This is currently handled
by the GPU offloading toolchains, but is ignored by the CPU ones. This
is problematic for languages like SYCL or OpenMP which permit
'offloading' to non-GPU targets. This patch handles this by putting
generic handling in the GCC toolchain for those languages.

I would've made this fully generic but it regressed some HIP sanitizer
tests because their use-case is really weird. This also only goes for
the languages that inherit from 'generic_gcc`. I could've made it in the
base class, but I felt like it wasn't necessary as we only support
offloading based off of this toolchain. In the future if we need it we
can move it around.
---
 clang/lib/Driver/ToolChains/Gnu.cpp | 64 ++++++++++++++---------------
 clang/test/Driver/openmp-offload.c  | 13 ++++++
 2 files changed, 44 insertions(+), 33 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/Gnu.cpp 
b/clang/lib/Driver/ToolChains/Gnu.cpp
index f56eeda3cb5f6..3302ae978ca5a 100644
--- a/clang/lib/Driver/ToolChains/Gnu.cpp
+++ b/clang/lib/Driver/ToolChains/Gnu.cpp
@@ -3388,46 +3388,44 @@ Generic_GCC::addLibStdCxxIncludePaths(const 
llvm::opt::ArgList &DriverArgs,
 }
 
 llvm::opt::DerivedArgList *
-Generic_GCC::TranslateArgs(const llvm::opt::DerivedArgList &Args, StringRef,
+Generic_GCC::TranslateArgs(const llvm::opt::DerivedArgList &Args,
+                           StringRef BoundArch,
                            Action::OffloadKind DeviceOffloadKind) const {
+  if (DeviceOffloadKind != Action::OFK_SYCL &&
+      DeviceOffloadKind != Action::OFK_OpenMP)
+    return nullptr;
 
-  // If this tool chain is used for an OpenMP offloading device we have to make
-  // sure we always generate a shared library regardless of the commands the
-  // user passed to the host. This is required because the runtime library
-  // is required to load the device image dynamically at run time.
+  DerivedArgList *DAL = new DerivedArgList(Args.getBaseArgs());
+
+  // Filter all the arguments we don't care passing to the offloading
+  // toolchain as they can mess up with the creation of a shared library.
+  const llvm::DenseSet<unsigned> OpenMPFiltered{
+      options::OPT_shared, options::OPT_dynamic, options::OPT_static,
+      options::OPT_fPIE,   options::OPT_fno_PIE, options::OPT_fpie,
+      options::OPT_fno_pie};
+  for (auto *A : Args)
+    if (DeviceOffloadKind != Action::OFK_OpenMP ||
+        !OpenMPFiltered.contains(A->getOption().getID()))
+      DAL->append(A);
+
+  // Request the shared library for CPU offloading. Given that these options
+  // are decided implicitly, they do not refer to any base argument.
+  const OptTable &Opts = getDriver().getOpts();
   if (DeviceOffloadKind == Action::OFK_OpenMP) {
-    DerivedArgList *DAL = new DerivedArgList(Args.getBaseArgs());
-    const OptTable &Opts = getDriver().getOpts();
-
-    // Request the shared library. Given that these options are decided
-    // implicitly, they do not refer to any base argument.
     DAL->AddFlagArg(/*BaseArg=*/nullptr, Opts.getOption(options::OPT_shared));
     DAL->AddFlagArg(/*BaseArg=*/nullptr, Opts.getOption(options::OPT_fPIC));
+  }
 
-    // Filter all the arguments we don't care passing to the offloading
-    // toolchain as they can mess up with the creation of a shared library.
-    for (auto *A : Args) {
-      switch ((options::ID)A->getOption().getID()) {
-      default:
-        DAL->append(A);
-        break;
-      case options::OPT_shared:
-      case options::OPT_dynamic:
-      case options::OPT_static:
-      case options::OPT_fPIC:
-      case options::OPT_fno_PIC:
-      case options::OPT_fpic:
-      case options::OPT_fno_pic:
-      case options::OPT_fPIE:
-      case options::OPT_fno_PIE:
-      case options::OPT_fpie:
-      case options::OPT_fno_pie:
-        break;
-      }
-    }
-    return DAL;
+  // Add the bound architecture to the arguments list if present.
+  if (!BoundArch.empty()) {
+    options::ID Opt =
+        getTriple().isARM() || getTriple().isPPC() || getTriple().isAArch64()
+            ? options::OPT_mcpu_EQ
+            : options::OPT_march_EQ;
+    DAL->eraseArg(Opt);
+    DAL->AddJoinedArg(nullptr, Opts.getOption(Opt), BoundArch);
   }
-  return nullptr;
+  return DAL;
 }
 
 void Generic_ELF::anchor() {}
diff --git a/clang/test/Driver/openmp-offload.c 
b/clang/test/Driver/openmp-offload.c
index 2cf2643af6c15..162ff20a9745a 100644
--- a/clang/test/Driver/openmp-offload.c
+++ b/clang/test/Driver/openmp-offload.c
@@ -218,3 +218,16 @@
 // RUN:   %clang -### --target=powerpc64le-linux -fopenmp=libomp 
-fopenmp-targets=powerpc64le-ibm-linux-gnu \
 // RUN:     -foffload-lto=thin %s 2>&1 | FileCheck 
-check-prefix=CHK-DEVICE-LTO-THIN %s
 // CHK-DEVICE-LTO-THIN: clang-linker-wrapper{{.*}} 
"--device-compiler=powerpc64le-ibm-linux-gnu=-flto=thin"
+
+//
+// Check forwarding architectures to non-GPU targets
+//
+// RUN:   %clang -### --target=aarch64-unknown-linux-gnu -fopenmp=libomp 
-fopenmp-targets=aarch64-unknown-linux-gnu \
+// RUN:     --offload-arch=a64fx %s 2>&1 | FileCheck 
-check-prefix=CHK-CPU-ARCH-A %s
+// CHK-CPU-ARCH-A: "-cc1" "-triple" "aarch64-unknown-linux-gnu" {{.*}} 
"-target-cpu" "generic"
+// CHK-CPU-ARCH-A: "-cc1" "-triple" "aarch64-unknown-linux-gnu" {{.*}} 
"-target-cpu" "a64fx"
+//
+// RUN:   %clang -### --target=x86_64-unknown-linux-gnu -fopenmp=libomp 
-fopenmp-targets=x86_64-unknown-linux-gnu \
+// RUN:     --offload-arch=znver4 %s 2>&1 | FileCheck 
-check-prefix=CHK-CPU-ARCH-X %s
+// CHK-CPU-ARCH-X: "-cc1" "-triple" "x86_64-unknown-linux-gnu" {{.*}} 
"-target-cpu" "x86-64"
+// CHK-CPU-ARCH-X: "-cc1" "-triple" "x86_64-unknown-linux-gnu" {{.*}} 
"-target-cpu" "znver4"

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to