[Lldb-commits] [flang] [clang] [lldb] [libc] [compiler-rt] [clang-tools-extra] [lld] [llvm] [libcxx] [openmp] Gcc 75 libomptarget type convert (PR #75562)

2023-12-15 Thread Joseph Huber via lldb-commits


@@ -47,7 +47,9 @@ PluginAdaptorTy::create(const std::string &Name) {
   new PluginAdaptorTy(Name, std::move(LibraryHandler)));
   if (auto Err = PluginAdaptor->init())
 return Err;
-  return PluginAdaptor;

jhuber6 wrote:

Does putting `std::move` here not work?
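
For readers following along, here is a minimal sketch of the pattern in question. It assumes the build failure comes from an older GCC not treating a returned local as an rvalue when its type differs from the function's return type. `Wrapper` and `Plugin` are hypothetical stand-ins, not the real `llvm::Expected` or `PluginAdaptorTy`.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for llvm::Expected<T>: constructible from a T rvalue.
template <typename T> class Wrapper {
public:
  Wrapper(T &&V) : Value(std::move(V)) {}
  T &get() { return Value; }

private:
  T Value;
};

// Hypothetical payload type for the sketch.
struct Plugin {
  int Id;
};

Wrapper<std::unique_ptr<Plugin>> create(int Id) {
  auto P = std::make_unique<Plugin>();
  P->Id = Id;
  // The local's type differs from the return type, so a converting
  // constructor runs. The explicit std::move does not depend on the
  // compiler implicitly treating P as an rvalue in this return statement.
  return std::move(P);
}
```

When the local and the return type are the same, the explicit `std::move` is usually unnecessary and can inhibit copy elision, so this form is only worth using for the converting case.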

https://github.com/llvm/llvm-project/pull/75562
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [lld] [libc] [clang-tools-extra] [compiler-rt] [lldb] [llvm] [flang] [libcxx] [openmp] [clang] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)

2024-01-04 Thread Joseph Huber via lldb-commits


@@ -163,3 +163,87 @@ Error 
GenericGlobalHandlerTy::readGlobalFromImage(GenericDeviceTy &Device,
 
   return Plugin::success();
 }
+
+bool GenericGlobalHandlerTy::hasProfilingGlobals(GenericDeviceTy &Device,
+ DeviceImageTy &Image) {
+  GlobalTy global(getInstrProfNamesVarName().str(), 0);
+  if (auto Err = getGlobalMetadataFromImage(Device, Image, global)) {
+consumeError(std::move(Err));
+return false;
+  }
+  return true;
+}
+
+Expected<GPUProfGlobals>
+GenericGlobalHandlerTy::readProfilingGlobals(GenericDeviceTy &Device,
+ DeviceImageTy &Image) {
+  GPUProfGlobals profdata;
+  auto ELFObj = getELFObjectFile(Image);
+  if (!ELFObj)
+return ELFObj.takeError();
+  profdata.targetTriple = ELFObj->makeTriple();

jhuber6 wrote:

Made a patch in https://github.com/llvm/llvm-project/pull/76992 and 
https://github.com/llvm/llvm-project/pull/76970 to make this actually work.

https://github.com/llvm/llvm-project/pull/76587


[Lldb-commits] [libc] [clang] [lld] [clang-tools-extra] [compiler-rt] [flang] [lldb] [libcxx] [llvm] [openmp] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)

2024-01-04 Thread Joseph Huber via lldb-commits


@@ -163,3 +163,87 @@ Error 
GenericGlobalHandlerTy::readGlobalFromImage(GenericDeviceTy &Device,
 
   return Plugin::success();
 }
+
+bool GenericGlobalHandlerTy::hasProfilingGlobals(GenericDeviceTy &Device,
+ DeviceImageTy &Image) {
+  GlobalTy global(getInstrProfNamesVarName().str(), 0);
+  if (auto Err = getGlobalMetadataFromImage(Device, Image, global)) {
+consumeError(std::move(Err));
+return false;
+  }
+  return true;
+}
+
+Expected<GPUProfGlobals>
+GenericGlobalHandlerTy::readProfilingGlobals(GenericDeviceTy &Device,
+ DeviceImageTy &Image) {
+  GPUProfGlobals profdata;
+  auto ELFObj = getELFObjectFile(Image);
+  if (!ELFObj)
+return ELFObj.takeError();
+  profdata.targetTriple = ELFObj->makeTriple();
+  // Iterate through elf symbols
+  for (auto &sym : ELFObj->symbols()) {
+if (auto name = sym.getName()) {
+  // Check if given current global is a profiling global based
+  // on name
+  if (name->equals(getInstrProfNamesVarName())) {
+// Read in profiled function names
+std::vector<char> chars(sym.getSize() / sizeof(char), ' ');

jhuber6 wrote:

Why are we turning this into a vector of chars? Also isn't `sizeof(char)` 
pretty much always going to be `1`?
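
A small sketch of the point being made, under the assumption that the names blob is just a run of bytes. `readBytes` is a hypothetical stand-in for the device read and is not part of the plugin interface.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// sizeof(char) is 1 by definition in C++, so dividing by it is a no-op.
static_assert(sizeof(char) == 1, "guaranteed by the language");

// Hypothetical stand-in for a device read: copies Size bytes into Dst.
void readBytes(void *Dst, const void *Src, std::size_t Size) {
  std::memcpy(Dst, Src, Size);
}

// Read Size bytes straight into a std::string, skipping the temporary
// vector of chars and the extra copy out of it.
std::string readNames(const void *Src, std::size_t Size) {
  std::string Names(Size, '\0');
  readBytes(&Names[0], Src, Size);
  return Names;
}
```

Embedded null bytes survive because the string is sized up front instead of being built from a C string.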

https://github.com/llvm/llvm-project/pull/76587


[Lldb-commits] [compiler-rt] [lldb] [openmp] [llvm] [clang-tools-extra] [lld] [flang] [clang] [libcxx] [libc] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)

2024-01-04 Thread Joseph Huber via lldb-commits


@@ -163,3 +163,87 @@ Error 
GenericGlobalHandlerTy::readGlobalFromImage(GenericDeviceTy &Device,
 
   return Plugin::success();
 }
+
+bool GenericGlobalHandlerTy::hasProfilingGlobals(GenericDeviceTy &Device,
+ DeviceImageTy &Image) {
+  GlobalTy global(getInstrProfNamesVarName().str(), 0);
+  if (auto Err = getGlobalMetadataFromImage(Device, Image, global)) {
+consumeError(std::move(Err));
+return false;
+  }
+  return true;
+}
+
+Expected<GPUProfGlobals>
+GenericGlobalHandlerTy::readProfilingGlobals(GenericDeviceTy &Device,
+ DeviceImageTy &Image) {
+  GPUProfGlobals profdata;
+  auto ELFObj = getELFObjectFile(Image);
+  if (!ELFObj)
+return ELFObj.takeError();
+  profdata.targetTriple = ELFObj->makeTriple();
+  // Iterate through elf symbols
+  for (auto &sym : ELFObj->symbols()) {
+if (auto name = sym.getName()) {
+  // Check if given current global is a profiling global based
+  // on name
+  if (name->equals(getInstrProfNamesVarName())) {
+// Read in profiled function names
+std::vector<char> chars(sym.getSize() / sizeof(char), ' ');
+GlobalTy NamesGlobal(name->str(), sym.getSize(), chars.data());
+if (auto Err = readGlobalFromDevice(Device, Image, NamesGlobal))
+  return Err;
+std::string names(chars.begin(), chars.end());
+profdata.names = std::move(names);
+  } else if (name->starts_with(getInstrProfCountersVarPrefix())) {
+// Read global variable profiling counts
+std::vector<int64_t> counts(sym.getSize() / sizeof(int64_t), 0);
+GlobalTy CountGlobal(name->str(), sym.getSize(), counts.data());
+if (auto Err = readGlobalFromDevice(Device, Image, CountGlobal))
+  return Err;
+profdata.counts.push_back(std::move(counts));
+  } else if (name->starts_with(getInstrProfDataVarPrefix())) {
+// Read profiling data for this global variable
+__llvm_profile_data data{};
+GlobalTy DataGlobal(name->str(), sym.getSize(), &data);
+if (auto Err = readGlobalFromDevice(Device, Image, DataGlobal))
+  return Err;
+profdata.data.push_back(std::move(data));
+  }
+}
+  }
+  return profdata;
+}

jhuber6 wrote:

LLVM style for everything here.

https://github.com/llvm/llvm-project/pull/76587


[Lldb-commits] [compiler-rt] [clang] [clang-tools-extra] [flang] [llvm] [libcxx] [lld] [lldb] [libc] [openmp] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)

2024-01-04 Thread Joseph Huber via lldb-commits


@@ -58,6 +60,22 @@ class GlobalTy {
   void setPtr(void *P) { Ptr = P; }
 };
 
+typedef void *IntPtrT;

jhuber6 wrote:

What's the utility of this?

https://github.com/llvm/llvm-project/pull/76587


[Lldb-commits] [lld] [clang-tools-extra] [openmp] [flang] [libc] [libcxx] [llvm] [lldb] [compiler-rt] [clang] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)

2024-01-04 Thread Joseph Huber via lldb-commits


@@ -58,6 +60,22 @@ class GlobalTy {
   void setPtr(void *P) { Ptr = P; }
 };
 
+typedef void *IntPtrT;
+struct __llvm_profile_data {
+#define INSTR_PROF_DATA(Type, LLVMType, Name, Initializer) Type Name;
+#include "llvm/ProfileData/InstrProfData.inc"
+};
+
+/// PGO profiling data extracted from a GPU device
+struct GPUProfGlobals {
+  std::string names;
+  std::vector<std::vector<int64_t>> counts;
+  std::vector<__llvm_profile_data> data;
+  Triple targetTriple;
+

jhuber6 wrote:

These should probably use LLVM structs. E.g. `StringRef` if the name is a 
constant string with stable storage, and `SmallVector`.

I'd really appreciate some descriptions of how this is supposed to look and how 
it interacts with the existing profile data.

https://github.com/llvm/llvm-project/pull/76587


[Lldb-commits] [lld] [libcxx] [clang-tools-extra] [compiler-rt] [clang] [flang] [llvm] [libc] [openmp] [lldb] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)

2024-01-04 Thread Joseph Huber via lldb-commits


@@ -163,3 +163,87 @@ Error 
GenericGlobalHandlerTy::readGlobalFromImage(GenericDeviceTy &Device,
 
   return Plugin::success();
 }
+
+bool GenericGlobalHandlerTy::hasProfilingGlobals(GenericDeviceTy &Device,
+ DeviceImageTy &Image) {
+  GlobalTy global(getInstrProfNamesVarName().str(), 0);
+  if (auto Err = getGlobalMetadataFromImage(Device, Image, global)) {
+consumeError(std::move(Err));
+return false;
+  }
+  return true;
+}
+
+Expected<GPUProfGlobals>
+GenericGlobalHandlerTy::readProfilingGlobals(GenericDeviceTy &Device,
+ DeviceImageTy &Image) {
+  GPUProfGlobals profdata;

jhuber6 wrote:

```suggestion
  GPUProfGlobals ProfData;
```
LLVM style. Also not a fan of the name. Maybe `DeviceProfileData` or something.

https://github.com/llvm/llvm-project/pull/76587


[Lldb-commits] [libc] [openmp] [compiler-rt] [libcxx] [clang-tools-extra] [lld] [llvm] [clang] [flang] [lldb] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)

2024-01-04 Thread Joseph Huber via lldb-commits


@@ -163,3 +163,87 @@ Error 
GenericGlobalHandlerTy::readGlobalFromImage(GenericDeviceTy &Device,
 
   return Plugin::success();
 }
+
+bool GenericGlobalHandlerTy::hasProfilingGlobals(GenericDeviceTy &Device,
+ DeviceImageTy &Image) {
+  GlobalTy global(getInstrProfNamesVarName().str(), 0);
+  if (auto Err = getGlobalMetadataFromImage(Device, Image, global)) {
+consumeError(std::move(Err));
+return false;
+  }
+  return true;
+}
+
+Expected<GPUProfGlobals>
+GenericGlobalHandlerTy::readProfilingGlobals(GenericDeviceTy &Device,
+ DeviceImageTy &Image) {
+  GPUProfGlobals profdata;
+  auto ELFObj = getELFObjectFile(Image);
+  if (!ELFObj)
+return ELFObj.takeError();
+  profdata.targetTriple = ELFObj->makeTriple();
+  // Iterate through elf symbols
+  for (auto &sym : ELFObj->symbols()) {
+if (auto name = sym.getName()) {
+  // Check if given current global is a profiling global based
+  // on name
+  if (name->equals(getInstrProfNamesVarName())) {
+// Read in profiled function names
+std::vector<char> chars(sym.getSize() / sizeof(char), ' ');
+GlobalTy NamesGlobal(name->str(), sym.getSize(), chars.data());
+if (auto Err = readGlobalFromDevice(Device, Image, NamesGlobal))
+  return Err;
+std::string names(chars.begin(), chars.end());
+profdata.names = std::move(names);
+  } else if (name->starts_with(getInstrProfCountersVarPrefix())) {

jhuber6 wrote:

Are functions like `getInstrProfCountersVarPrefix` preexisting? I don't see them 
defined in this patch set.

https://github.com/llvm/llvm-project/pull/76587


[Lldb-commits] [lld] [lldb] [clang-tools-extra] [compiler-rt] [flang] [llvm] [clang] [libcxx] [openmp] [libc] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)

2024-01-04 Thread Joseph Huber via lldb-commits


@@ -163,3 +163,87 @@ Error 
GenericGlobalHandlerTy::readGlobalFromImage(GenericDeviceTy &Device,
 
   return Plugin::success();
 }
+
+bool GenericGlobalHandlerTy::hasProfilingGlobals(GenericDeviceTy &Device,
+ DeviceImageTy &Image) {
+  GlobalTy global(getInstrProfNamesVarName().str(), 0);
+  if (auto Err = getGlobalMetadataFromImage(Device, Image, global)) {
+consumeError(std::move(Err));
+return false;
+  }
+  return true;
+}
+
+Expected<GPUProfGlobals>
+GenericGlobalHandlerTy::readProfilingGlobals(GenericDeviceTy &Device,
+ DeviceImageTy &Image) {
+  GPUProfGlobals profdata;
+  auto ELFObj = getELFObjectFile(Image);
+  if (!ELFObj)
+return ELFObj.takeError();
+  profdata.targetTriple = ELFObj->makeTriple();
+  // Iterate through elf symbols
+  for (auto &sym : ELFObj->symbols()) {
+if (auto name = sym.getName()) {
+  // Check if given current global is a profiling global based
+  // on name
+  if (name->equals(getInstrProfNamesVarName())) {
+// Read in profiled function names
+std::vector<char> chars(sym.getSize() / sizeof(char), ' ');
+GlobalTy NamesGlobal(name->str(), sym.getSize(), chars.data());

jhuber6 wrote:

Okay, we're reading a string back from the device? What's the purpose of that? 
Also, just so you know, the ELF will only contain the correct size if it's 
emitted as an array. E.g.
```
const char a[] = "a"; // strlen("a") + 1 in ELF
const char *b = "b"; // sizeof(char *) in ELF
```
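
The same distinction can be checked at compile time. This is a sketch that assumes the symbol size recorded in the ELF matches `sizeof` of the object, as the comment above describes.

```cpp
#include <cstddef>

const char A[] = "a"; // array object: holds the characters themselves
const char *B = "b";  // pointer object: holds only an address

// The array's size covers the character plus the null terminator.
static_assert(sizeof(A) == 2, "strlen(\"a\") + 1");
// The pointer's size is independent of the string it points to.
static_assert(sizeof(B) == sizeof(const char *), "size of a pointer");

// Helpers so the sizes can also be inspected at run time.
std::size_t sizeOfArray() { return sizeof(A); }
std::size_t sizeOfPointer() { return sizeof(B); }
```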

https://github.com/llvm/llvm-project/pull/76587


[Lldb-commits] [openmp] [lld] [clang-tools-extra] [libcxx] [llvm] [flang] [libc] [clang] [lldb] [compiler-rt] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)

2024-01-04 Thread Joseph Huber via lldb-commits


@@ -163,3 +163,87 @@ Error 
GenericGlobalHandlerTy::readGlobalFromImage(GenericDeviceTy &Device,
 
   return Plugin::success();
 }
+
+bool GenericGlobalHandlerTy::hasProfilingGlobals(GenericDeviceTy &Device,
+ DeviceImageTy &Image) {
+  GlobalTy global(getInstrProfNamesVarName().str(), 0);
+  if (auto Err = getGlobalMetadataFromImage(Device, Image, global)) {
+consumeError(std::move(Err));
+return false;
+  }
+  return true;
+}
+
+Expected<GPUProfGlobals>
+GenericGlobalHandlerTy::readProfilingGlobals(GenericDeviceTy &Device,
+ DeviceImageTy &Image) {
+  GPUProfGlobals profdata;
+  auto ELFObj = getELFObjectFile(Image);
+  if (!ELFObj)
+return ELFObj.takeError();
+  profdata.targetTriple = ELFObj->makeTriple();
+  // Iterate through elf symbols
+  for (auto &sym : ELFObj->symbols()) {
+if (auto name = sym.getName()) {

jhuber6 wrote:

This is incorrect. If this returns an error it will exit the if, call the 
destructor, and then crash the program because the error was not handled.
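
A self-contained sketch of the rule being described. `MiniExpected`, `ErrorTag`, `lookup`, and `nameOrEmpty` are hypothetical and heavily simplified; the real `llvm::Expected<T>` lives in `llvm/Support/Error.h` and differs in detail. The idea modeled here is that testing a failed result does not count as handling it, so the error must be taken before the object is destroyed.

```cpp
#include <cstdlib>
#include <string>
#include <utility>

struct ErrorTag {};

// Hypothetical model of a result type that must be checked before it dies.
template <typename T> class MiniExpected {
public:
  MiniExpected(T V) : Value(std::move(V)), HasError(false), Unchecked(true) {}
  MiniExpected(ErrorTag, std::string Msg)
      : Message(std::move(Msg)), HasError(true), Unchecked(true) {}
  MiniExpected(MiniExpected &&O)
      : Value(std::move(O.Value)), Message(std::move(O.Message)),
        HasError(O.HasError), Unchecked(O.Unchecked) {
    O.Unchecked = false; // the moved-from object no longer owns the state
  }
  ~MiniExpected() {
    if (Unchecked)
      std::abort(); // an unhandled result terminates the program
  }
  // Testing a success value marks it checked; an error stays unchecked.
  explicit operator bool() {
    Unchecked = HasError;
    return !HasError;
  }
  T &operator*() { return Value; }
  // Taking the error transfers responsibility for it to the caller.
  std::string takeError() {
    Unchecked = false;
    return std::move(Message);
  }

private:
  T Value{};
  std::string Message;
  bool HasError;
  bool Unchecked;
};

// Hypothetical lookup that can fail.
MiniExpected<std::string> lookup(bool Fail) {
  if (Fail)
    return MiniExpected<std::string>(ErrorTag{}, "no name");
  return MiniExpected<std::string>(std::string("sym"));
}

// Handled pattern: test the result and, on failure, take the error.
// Writing `if (auto Name = lookup(true)) { ... }` with no else branch would
// leave the error unchecked and abort in the destructor of this model.
std::string nameOrEmpty(bool Fail) {
  auto NameOrErr = lookup(Fail);
  if (!NameOrErr) {
    std::string Msg = NameOrErr.takeError();
    (void)Msg; // a real caller would return or report the error
    return "";
  }
  return *NameOrErr;
}
```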

https://github.com/llvm/llvm-project/pull/76587


[Lldb-commits] [clang-tools-extra] [libc] [lldb] [openmp] [clang] [llvm] [flang] [compiler-rt] [libcxx] [lld] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)

2024-01-05 Thread Joseph Huber via lldb-commits


@@ -58,6 +60,22 @@ class GlobalTy {
   void setPtr(void *P) { Ptr = P; }
 };
 
+typedef void *IntPtrT;
+struct __llvm_profile_data {
+#define INSTR_PROF_DATA(Type, LLVMType, Name, Initializer) Type Name;
+#include "llvm/ProfileData/InstrProfData.inc"
+};
+
+/// PGO profiling data extracted from a GPU device
+struct GPUProfGlobals {
+  std::string names;
+  std::vector<std::vector<int64_t>> counts;
+  std::vector<__llvm_profile_data> data;
+  Triple targetTriple;
+

jhuber6 wrote:

All of them. `SmallVector` is basically a `std::vector` with a small-size 
optimization, like `std::string`.

https://github.com/llvm/llvm-project/pull/76587


[Lldb-commits] [clang-tools-extra] [libc] [lldb] [openmp] [clang] [llvm] [flang] [compiler-rt] [libcxx] [lld] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)

2024-01-05 Thread Joseph Huber via lldb-commits


@@ -58,6 +60,22 @@ class GlobalTy {
   void setPtr(void *P) { Ptr = P; }
 };
 
+typedef void *IntPtrT;

jhuber6 wrote:

Okay. You should use the C++ `using` keyword instead of C's `typedef`.
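
A short sketch of the equivalence. `IntPtrOld` and `PtrTo` are names made up for this illustration.

```cpp
#include <type_traits>

typedef void *IntPtrOld; // C-style spelling, as in the patch
using IntPtrT = void *;  // equivalent C++11 alias declaration

static_assert(std::is_same<IntPtrOld, IntPtrT>::value,
              "both spellings name the same type");

// Alias templates can only be written with `using`.
template <typename T> using PtrTo = T *;
static_assert(std::is_same<PtrTo<void>, IntPtrT>::value,
              "PtrTo<void> is void *");
```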

https://github.com/llvm/llvm-project/pull/76587


[Lldb-commits] [clang] [libc] [lld] [lldb] [clang-tools-extra] [llvm] [compiler-rt] [flang] [libcxx] [openmp] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)

2024-01-05 Thread Joseph Huber via lldb-commits

https://github.com/jhuber6 edited 
https://github.com/llvm/llvm-project/pull/76587


[Lldb-commits] [clang] [libc] [lld] [lldb] [clang-tools-extra] [llvm] [compiler-rt] [flang] [libcxx] [openmp] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)

2024-01-05 Thread Joseph Huber via lldb-commits


@@ -163,3 +163,87 @@ Error 
GenericGlobalHandlerTy::readGlobalFromImage(GenericDeviceTy &Device,
 
   return Plugin::success();
 }
+
+bool GenericGlobalHandlerTy::hasProfilingGlobals(GenericDeviceTy &Device,
+ DeviceImageTy &Image) {
+  GlobalTy global(getInstrProfNamesVarName().str(), 0);
+  if (auto Err = getGlobalMetadataFromImage(Device, Image, global)) {
+consumeError(std::move(Err));
+return false;
+  }
+  return true;
+}
+
+Expected<GPUProfGlobals>
+GenericGlobalHandlerTy::readProfilingGlobals(GenericDeviceTy &Device,
+ DeviceImageTy &Image) {
+  GPUProfGlobals profdata;
+  auto ELFObj = getELFObjectFile(Image);
+  if (!ELFObj)
+return ELFObj.takeError();
+  profdata.targetTriple = ELFObj->makeTriple();
+  // Iterate through elf symbols
+  for (auto &sym : ELFObj->symbols()) {
+if (auto name = sym.getName()) {
+  // Check if given current global is a profiling global based
+  // on name
+  if (name->equals(getInstrProfNamesVarName())) {
+// Read in profiled function names
+std::vector<char> chars(sym.getSize() / sizeof(char), ' ');
+GlobalTy NamesGlobal(name->str(), sym.getSize(), chars.data());

jhuber6 wrote:

Okay, this should use `SmallVector` as well. Don't bother dividing by the size, 
because the one reported from the ELF is absolute. Then just make the element 
type `uint8_t`.

https://github.com/llvm/llvm-project/pull/76587


[Lldb-commits] [clang-tools-extra] [flang] [libcxx] [lld] [compiler-rt] [lldb] [clang] [llvm] [libc] [openmp] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)

2024-01-08 Thread Joseph Huber via lldb-commits


@@ -58,6 +60,22 @@ class GlobalTy {
   void setPtr(void *P) { Ptr = P; }
 };
 
+typedef void *IntPtrT;
+struct __llvm_profile_data {
+#define INSTR_PROF_DATA(Type, LLVMType, Name, Initializer) Type Name;
+#include "llvm/ProfileData/InstrProfData.inc"
+};
+
+/// PGO profiling data extracted from a GPU device
+struct GPUProfGlobals {
+  std::string names;
+  std::vector<std::vector<int64_t>> counts;
+  std::vector<__llvm_profile_data> data;
+  Triple targetTriple;
+

jhuber6 wrote:

That's confusing, how would using a `std::vector` not have that problem as 
well? I'll need to look into that.

https://github.com/llvm/llvm-project/pull/76587


[Lldb-commits] [clang] [libc] [lldb] [llvm] [mlir] [compiler-rt] [lld] [libcxx] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)

2024-01-23 Thread Joseph Huber via lldb-commits


@@ -0,0 +1,7 @@
+/// Some target-specific options are ignored for GPU, so %clang exits with code 0.
+// DEFINE: %{gpu_opts} = --cuda-gpu-arch=sm_60 --cuda-path=%S/Inputs/CUDA/usr/local/cuda --no-cuda-version-check
+// DEFINE: %{check} = %clang -### -c %{gpu_opts} -mcmodel=medium %s
+// RUN: %{check} -fbasic-block-sections=all

jhuber6 wrote:

Offloading compilation for these single-source languages pretty much just 
combines one "host" compilation job with N "Device" compilation jobs. Doing 
`--offload-device-only` and `--offload-host-only` simply does one part of that. 
There's probably some flags that behave differently depending on which end 
you're compiling on, so maybe it would be useful for separating that behavior 
if needed.

https://github.com/llvm/llvm-project/pull/79222


[Lldb-commits] [llvm] [lldb] [lld] [compiler-rt] [clang] [mlir] [libc] [libcxx] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)

2024-01-23 Thread Joseph Huber via lldb-commits


@@ -0,0 +1,7 @@
+/// Some target-specific options are ignored for GPU, so %clang exits with code 0.
+// DEFINE: %{gpu_opts} = --cuda-gpu-arch=sm_60 --cuda-path=%S/Inputs/CUDA/usr/local/cuda --no-cuda-version-check
+// DEFINE: %{check} = %clang -### -c %{gpu_opts} -mcmodel=medium %s
+// RUN: %{check} -fbasic-block-sections=all
+
+// REDEFINE: %{gpu_opts} = -x hip --rocm-path=%S/Inputs/rocm -nogpulib

jhuber6 wrote:

Should probably include `-nogpuinc` as well. Best way to avoid spurious 
failures due to lack of a local CUDA / ROCm installation. Maybe in the future 
LLVM based offloading won't depend on so much external stuff.

https://github.com/llvm/llvm-project/pull/79222


[Lldb-commits] [libc] [clang] [openmp] [lld] [clang-tools-extra] [lldb] [libcxx] [compiler-rt] [mlir] [llvm] [pstl] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)

2024-01-23 Thread Joseph Huber via lldb-commits

https://github.com/jhuber6 approved this pull request.


https://github.com/llvm/llvm-project/pull/79222


[Lldb-commits] [lldb] [pstl] [llvm] [mlir] [libc] [compiler-rt] [libcxx] [openmp] [clang-tools-extra] [clang] [lld] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)

2024-01-24 Thread Joseph Huber via lldb-commits


@@ -0,0 +1,5 @@
+/// Some target-specific options are ignored for GPU, so %clang exits with code 0.
+// DEFINE: %{check} = %clang -### -c -mcmodel=medium

jhuber6 wrote:

Probably depends on the option we're testing. We could do both.

https://github.com/llvm/llvm-project/pull/79222


[Lldb-commits] [lldb] [clang] [openmp] [compiler-rt] [lld] [llvm] [libc] [libcxx] [clang-tools-extra] [mlir] [pstl] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)

2024-01-24 Thread Joseph Huber via lldb-commits

jhuber6 wrote:

Maybe need to specify `--target=x86_64-unknown-linux-gnu` in the test?

https://github.com/llvm/llvm-project/pull/79222


[Lldb-commits] [compiler-rt] [flang] [libcxx] [clang] [llvm] [clang-tools-extra] [lldb] [lld] [libc] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via lldb-commits

https://github.com/jhuber6 updated 
https://github.com/llvm/llvm-project/pull/79373

From 145b7bc932ce3ffa46545cd7af29b1c93981429c Mon Sep 17 00:00:00 2001
From: Joseph Huber 
Date: Wed, 24 Jan 2024 15:34:00 -0600
Subject: [PATCH 1/3] [NVPTX] Add support for -march=native in standalone NVPTX

Summary:
We support `--target=nvptx64-nvidia-cuda` as a way to target the NVPTX
architecture from standard CPU. This patch simply uses the existing
support for handling `--offload-arch=native` to also apply to the
standalone toolchain.
---
 clang/lib/Driver/ToolChains/Cuda.cpp   | 61 +-
 clang/lib/Driver/ToolChains/Cuda.h | 10 ++--
 clang/test/Driver/nvptx-cuda-system-arch.c |  5 ++
 3 files changed, 45 insertions(+), 31 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/Cuda.cpp 
b/clang/lib/Driver/ToolChains/Cuda.cpp
index 1462576ca870e6f..6215c43b5fc96bd 100644
--- a/clang/lib/Driver/ToolChains/Cuda.cpp
+++ b/clang/lib/Driver/ToolChains/Cuda.cpp
@@ -738,9 +738,18 @@ NVPTXToolChain::TranslateArgs(const 
llvm::opt::DerivedArgList &Args,
 if (!llvm::is_contained(*DAL, A))
   DAL->append(A);
 
-  if (!DAL->hasArg(options::OPT_march_EQ))
+  if (!DAL->hasArg(options::OPT_march_EQ)) {
 DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_march_EQ),
   CudaArchToString(CudaArch::CudaDefault));
+  } else if (DAL->getLastArgValue(options::OPT_march_EQ) == "native") {
+auto GPUsOrErr = getSystemGPUArchs(Args);
+if (!GPUsOrErr)
+  getDriver().Diag(diag::err_drv_undetermined_gpu_arch)
+  << getArchName() << llvm::toString(GPUsOrErr.takeError()) << 
"-march";
+else
+  DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_march_EQ),
+Args.MakeArgString(GPUsOrErr->front()));
+  }
 
   return DAL;
 }
@@ -783,6 +792,31 @@ void NVPTXToolChain::adjustDebugInfoKind(
   }
 }
 
+Expected<SmallVector<std::string>>
+NVPTXToolChain::getSystemGPUArchs(const ArgList &Args) const {
+  // Detect NVIDIA GPUs availible on the system.
+  std::string Program;
+  if (Arg *A = Args.getLastArg(options::OPT_nvptx_arch_tool_EQ))
+Program = A->getValue();
+  else
+Program = GetProgramPath("nvptx-arch");
+
+  auto StdoutOrErr = executeToolChainProgram(Program);
+  if (!StdoutOrErr)
+return StdoutOrErr.takeError();
+
+  SmallVector<std::string, 1> GPUArchs;
+  for (StringRef Arch : llvm::split((*StdoutOrErr)->getBuffer(), "\n"))
+if (!Arch.empty())
+  GPUArchs.push_back(Arch.str());
+
+  if (GPUArchs.empty())
+return llvm::createStringError(std::error_code(),
+   "No NVIDIA GPU detected in the system");
+
+  return std::move(GPUArchs);
+}
+
 /// CUDA toolchain.  Our assembler is ptxas, and our "linker" is fatbinary,
 /// which isn't properly a linker but nonetheless performs the step of 
stitching
 /// together object files from the assembler into a single blob.
@@ -948,31 +982,6 @@ CudaToolChain::TranslateArgs(const 
llvm::opt::DerivedArgList &Args,
   return DAL;
 }
 
-Expected<SmallVector<std::string>>
-CudaToolChain::getSystemGPUArchs(const ArgList &Args) const {
-  // Detect NVIDIA GPUs availible on the system.
-  std::string Program;
-  if (Arg *A = Args.getLastArg(options::OPT_nvptx_arch_tool_EQ))
-Program = A->getValue();
-  else
-Program = GetProgramPath("nvptx-arch");
-
-  auto StdoutOrErr = executeToolChainProgram(Program);
-  if (!StdoutOrErr)
-return StdoutOrErr.takeError();
-
-  SmallVector<std::string, 1> GPUArchs;
-  for (StringRef Arch : llvm::split((*StdoutOrErr)->getBuffer(), "\n"))
-if (!Arch.empty())
-  GPUArchs.push_back(Arch.str());
-
-  if (GPUArchs.empty())
-return llvm::createStringError(std::error_code(),
-   "No NVIDIA GPU detected in the system");
-
-  return std::move(GPUArchs);
-}
-
 Tool *NVPTXToolChain::buildAssembler() const {
   return new tools::NVPTX::Assembler(*this);
 }
diff --git a/clang/lib/Driver/ToolChains/Cuda.h 
b/clang/lib/Driver/ToolChains/Cuda.h
index 8a053f3393e1206..43c17ba7c0ba03d 100644
--- a/clang/lib/Driver/ToolChains/Cuda.h
+++ b/clang/lib/Driver/ToolChains/Cuda.h
@@ -168,6 +168,11 @@ class LLVM_LIBRARY_VISIBILITY NVPTXToolChain : public 
ToolChain {
   unsigned GetDefaultDwarfVersion() const override { return 2; }
   unsigned getMaxDwarfVersion() const override { return 2; }
 
+  /// Uses nvptx-arch tool to get arch of the system GPU. Will return error
+  /// if unable to find one.
+  virtual Expected<SmallVector<std::string>>
+  getSystemGPUArchs(const llvm::opt::ArgList &Args) const override;
+
   CudaInstallationDetector CudaInstallation;
 
 protected:
@@ -223,11 +228,6 @@ class LLVM_LIBRARY_VISIBILITY CudaToolChain : public 
NVPTXToolChain {
 
   const ToolChain &HostTC;
 
-  /// Uses nvptx-arch tool to get arch of the system GPU. Will return error
-  /// if unable to find one.
-  virtual Expected<SmallVector<std::string>>
-  getSystemGPUArchs(const llvm::opt::ArgList &Args) const override;
-
 protected:
   Tool *buildAssembler() const override; // ptxas
   Tool *buildLinker() const override;  

[Lldb-commits] [lld] [lldb] [llvm] [compiler-rt] [clang-tools-extra] [libc] [clang] [flang] [libcxx] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via lldb-commits

jhuber6 wrote:

> On the other hand, I'd be OK with providing --offload-arch=native translating 
> into "compile for all present GPU variants", with a possibility to further 
> adjust the selected set with the usual --no-offload-arch-foo, if the user 
> needs to. This will at least produce code that will run on the machine where 
> it's built, be somewhat consistent and is still adjustable by the user when 
> the default choice will inevitably be wrong.

This is what we already do, but this is somewhat tangential. I've updated this 
patch to present the warning in the case of multiple GPUs being detected, so I 
don't think there's a concern here with the user being confused. If they have 
two GPUs, the warning will tell them which one it's using with the correct 
`sm_` value to specify it manually if they so wish. If there is only one GPU on 
the system, it should be obvious that it's going to be targeted.

https://github.com/llvm/llvm-project/pull/79373


[Lldb-commits] [clang] [lld] [libcxx] [flang] [compiler-rt] [libc] [clang-tools-extra] [llvm] [lldb] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via lldb-commits

jhuber6 wrote:

> User confusion is only part of the issue here. With any single GPU choice we 
> would still potentially produce a nonworking binary, if our GPU choice does 
> not match what the user wants.
>
> "all GPUs" has the advantage of always producing the binary that's guaranteed 
> to work. Granted, in the case of multiple GPUs it comes with the compilation 
> time overhead, but I think it's a better trade-off than compiling faster, but 
> not working. If the overhead is unacceptable, then we can tweak the build, 
> but in that case, the user may as well just specify the desired architectures 
> explicitly.

I think the semantics of `native` on other architectures are clear enough here. 
This combined with the fact that using `-march=native` will error out in the 
case of no GPUs available, or give a warning if more than one GPU is available, 
should be sufficiently clear what it's doing. This obviously falls apart if you 
compile with `-march=native` and then move it off of the system you compiled it 
for, but the same applies for standard x64 binaries I feel.

Realistically, very, very few casual users are going to be using direct NVPTX 
targeting. The current use-case is for building tests directly for the GPU 
without needing to handle calling `amdgpu-arch` and `nvptx-arch` manually in 
CMake. If I had this in, then I could simplify a lot of CMake code in my `libc` 
project by just letting the compiler handle the autodetection. That removes one 
more random program dependency from the build process. AMDGPU already has 
`-mcpu=native` so I'd like NVPTX to match if possible.

https://github.com/llvm/llvm-project/pull/79373


[Lldb-commits] [clang] [clang-tools-extra] [lldb] [libc] [libcxx] [lld] [llvm] [flang] [compiler-rt] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via lldb-commits

jhuber6 wrote:

> > I think the semantics of native on other architectures are clear enough 
> > here.
> 
> I don't think we have the same idea about that. Let's spell it out, so 
> there's no confusion.
> 
> [GCC 
> manual](https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html#index-march-16) 
> says:
> 
> > Using -march=native enables all instruction subsets supported by the local 
> > machine (hence the result might not run on different machines)
> 
> The way I read it "all instruction subsets supported by the local machine" 
> would be what all-GPUs strategy would do. The binary is expected to run on 
> all GPU architecture variants available on the machine.
> 
> Granted, gcc was not written with GPUs in mind, but it's a good baseline for 
> establishing existing conventions for the meaning of `-march=native`.

This more or less depends on what your definition of "local machine" is when it 
comes to a system augmented with GPUs. The verbiage of "**The** local machine" 
implies an assumption that there is only one, which I personally find 
consistent with just selecting the first GPU found on the system. There is 
ambiguity in how we should treat this in the case of multiple GPUs, but that's 
what the warning message is for. It informs the user that the "native" 
architecture is somewhat ambiguous and that the first one was selected.

Further, our current default makes sense, because it corresponds to Device ID 
zero in CUDA, which means that unless you change the environment via 
`CUDA_VISIBLE_DEVICES` or something, it will work on the default device.

So, in the case there is one device, the behavior is consistent with 
`-march=native`. In the case where there are two, we make an implicit decision 
to target the first GPU and inform the user. This method of compilation is not 
like CUDA, so we can't target all the GPUs at the same time. This will be 
useful in cases where we want to write code that simply targets a GPU that will 
"work". We have CMake code around LLVM already to do this, so it would be nice 
to get rid of that.
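
As a rough sketch of the selection rule described above (this is not the actual driver code; `selectNativeArch` and `NativeArchChoice` are made-up names, and the input list is assumed to be ordered by device ID):

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical result type: the chosen architecture plus a flag telling the
// caller whether a "native is ambiguous" warning should be emitted.
struct NativeArchChoice {
  std::string Arch; // architecture of the first detected device
  bool Ambiguous;   // true when another device reports a different arch
};

// Hypothetical helper: `Detected` stands in for the architectures reported
// by a device query, with device ID zero first.
std::optional<NativeArchChoice>
selectNativeArch(const std::vector<std::string> &Detected) {
  if (Detected.empty())
    return std::nullopt; // no GPU found, nothing to select

  NativeArchChoice Choice{Detected.front(), false};
  for (const std::string &Arch : Detected)
    if (Arch != Choice.Arch)
      Choice.Ambiguous = true; // more than one distinct arch on this machine
  return Choice;
}
```

With a single device the result is unambiguous; with mixed devices the first one is still returned and the caller is expected to warn.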

https://github.com/llvm/llvm-project/pull/79373


[Lldb-commits] [flang] [clang] [clang-tools-extra] [llvm] [compiler-rt] [libcxx] [libc] [lldb] [lld] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via lldb-commits

jhuber6 wrote:

> > This method of compilation is not like CUDA, so we can't target all the 
> > GPUs at the same time.
> 
> I think this is the key fact I was missing. If the patch is only for a 
> standalone compilation which does not do multi-GPU compilation in principle, 
> then your approach makes sense.
> 
> I was arguing from the normal offloading which does have ability to target 
> multiple GPUs.

Yes, this is more similar to OpenCL or just regular CPU compilation where we 
have a single job that creates a simple executable, terminal application style. 
So given a single target, the desire is to "pick me the one that will work on 
the default CUDA device without me needing to check." type thing.

https://github.com/llvm/llvm-project/pull/79373


[Lldb-commits] [lld] [lldb] [libcxx] [compiler-rt] [clang-tools-extra] [llvm] [libc] [clang] [flang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via lldb-commits

jhuber6 wrote:

> > This method of compilation is not like CUDA, so we can't target all the 
> > GPUs at the same time.
> 
> Can you clarify for me -- what are you compiling where it's impossible to 
> target multiple GPUs in the binary? I'm confused because Art is understanding 
> that it's not CUDA, but we're modifying the CUDA driver here?

The idea is to simply compile C / C++ code directly targeting NVPTX rather than 
going through offloading languages like CUDA or OpenMP. This is more or less 
what cross-compiling is. We specify `--target=nvptx64-nvidia-cuda` which 
instructs the compiler to cross-compile the C / C++ targeting NVPTX. This 
results in a workflow that is very close to compiling a standard executable by 
design. This is mostly related to my work on the LLVM C library for GPUs, [which 
I gave a talk on that goes into more 
detail](https://www.youtube.com/watch?v=_LLGc48GYHc).

Right now, with the LLVM `libc` infrastructure I can do the following on my AMD 
GPU.

```
#include <stdio.h>
int main() { puts("Hello World!"); }
```
And compile it and run it more or less.
```
$ clang hello.c --target=amdgcn-amd-amdhsa -mcpu=native -flto -lc crt1.o
$ amdhsa_loader a.out
Hello World!
```
This works with AMD currently, and I want it to work for NVPTX so I can remove 
some ugly, annoying code in the `libc` project. This is how I'm running unit 
tests targeting the GPU in that project, which needs to run on the user's GPU. 
I'd rather just use `-march=native` than detect it manually in CMake.

https://github.com/llvm/llvm-project/pull/79373


[Lldb-commits] [lld] [lldb] [libcxx] [compiler-rt] [clang-tools-extra] [llvm] [libc] [clang] [flang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via lldb-commits

jhuber6 wrote:

> I...think I understand.
> 
> Is the output of this compilation step a cubin, then?

Yes, it will spit out a simple `cubin` instead of a fatbinary. The NVIDIA 
toolchain is much worse about this stuff than the AMD one, but in general it 
works. You can check with `-###` or whatever like in 
https://godbolt.org/z/zWf5jezYP.

https://github.com/llvm/llvm-project/pull/79373


[Lldb-commits] [flang] [clang] [libc] [compiler-rt] [clang-tools-extra] [llvm] [lld] [lldb] [libcxx] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via lldb-commits

jhuber6 wrote:

> Got it, okay, thanks.
> 
> Since this change only applies to `--target=nvptx64-nvidia-cuda`, fine by me. 
> Thanks for putting up with our scrutiny. :)

No problem, I probably should have been clearer in my commit messages.

https://github.com/llvm/llvm-project/pull/79373


[Lldb-commits] [clang-tools-extra] [llvm] [libc] [clang] [libcxx] [lldb] [lld] [flang] [compiler-rt] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via lldb-commits

https://github.com/jhuber6 closed 
https://github.com/llvm/llvm-project/pull/79373


[Lldb-commits] [lldb] [libc][libm][GPU] Add missing vendor entrypoints to the GPU version of `libm` (PR #66034)

2023-10-19 Thread Joseph Huber via lldb-commits


@@ -0,0 +1,17 @@
+//===-- Implementation of the GPU lroundf function

jhuber6 wrote:

This and some other headers need to be reformatted.

https://github.com/llvm/llvm-project/pull/66034


[Lldb-commits] [lldb] [libc][libm][GPU] Add missing vendor entrypoints to the GPU version of `libm` (PR #66034)

2023-10-19 Thread Joseph Huber via lldb-commits


@@ -0,0 +1,21 @@
+//===-- Implementation of the GPU atan2f function

jhuber6 wrote:

Fix header

https://github.com/llvm/llvm-project/pull/66034


[Lldb-commits] [lldb] [libc][libm][GPU] Add missing vendor entrypoints to the GPU version of `libm` (PR #66034)

2023-10-19 Thread Joseph Huber via lldb-commits

https://github.com/jhuber6 approved this pull request.

LG after fixing the header formatting.

https://github.com/llvm/llvm-project/pull/66034


[Lldb-commits] [lldb] [libc][libm][GPU] Add missing vendor entrypoints to the GPU version of `libm` (PR #66034)

2023-10-19 Thread Joseph Huber via lldb-commits


@@ -0,0 +1,19 @@
+//===-- Implementation of the GPU logbf function

jhuber6 wrote:

Fix header

https://github.com/llvm/llvm-project/pull/66034


[Lldb-commits] [lldb] [libc][libm][GPU] Add missing vendor entrypoints to the GPU version of `libm` (PR #66034)

2023-10-19 Thread Joseph Huber via lldb-commits

https://github.com/jhuber6 edited 
https://github.com/llvm/llvm-project/pull/66034


[Lldb-commits] [clang] [clang-tools-extra] [flang] [lldb] [llvm] [mlir] [polly] Add static to command line option (cl::opt) (PR #126243)

2025-02-07 Thread Joseph Huber via lldb-commits

jhuber6 wrote:

> So I should do like open 5 branches and make a PR for each?
> 
> As for the cases mentioned, when I tried to compile all projects (compile with 
> -DLLVM_ENABLE_PROJECTS=all), I did encounter such cases when compiling and 
> I've fixed them all. Now all projects can be successfully built, so I think 
> it should be fine?

Yes, one PR for each project please.

Also refer to the failing CI for what's broken, e.g.
```
/var/lib/buildkite-agent/builds/linux-56-59b8f5d88-96x6s-1/llvm-project/github-pull-requests/flang/lib/Lower/OpenMP/Utils.cpp:30:28:
 error: static declaration of 'treatIndexAsSection' follows non-static 
declaration
static llvm::cl::opt<bool> treatIndexAsSection(
   ^
/var/lib/buildkite-agent/builds/linux-56-59b8f5d88-96x6s-1/llvm-project/github-pull-requests/flang/lib/Lower/OpenMP/Utils.h:19:28:
 note: previous declaration is here
extern llvm::cl::opt<bool> treatIndexAsSection;
   ^
```

https://github.com/llvm/llvm-project/pull/126243


[Lldb-commits] [clang] [clang-tools-extra] [flang] [lldb] [llvm] [mlir] [polly] Add static to command line option (cl::opt) (PR #126243)

2025-02-07 Thread Joseph Huber via lldb-commits

https://github.com/jhuber6 edited 
https://github.com/llvm/llvm-project/pull/126243


[Lldb-commits] [clang] [clang-tools-extra] [flang] [lldb] [llvm] [mlir] [polly] Add static to command line option (cl::opt) (PR #126243)

2025-02-07 Thread Joseph Huber via lldb-commits

https://github.com/jhuber6 commented:

This should definitely be split up. Also some options are referenced in 
multiple places, i.e.
```
// foo.h
extern cl::opt <...>

// foo.cpp
cl::opt <...>
use opt

//bar.cpp
use opt.
```
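
To make the linkage problem concrete, here is a minimal single-file sketch. `Opt` is a hypothetical stand-in for `llvm::cl::opt` so it builds without LLVM, and the option names are invented; the comments mark which file each piece would live in.

```cpp
#include <string>

// Hypothetical stand-in for llvm::cl::opt, only to keep the sketch standalone.
template <typename T> struct Opt {
  std::string Name;
  T Value;
};

// --- foo.h: declaration that other translation units include.
extern Opt<bool> SharedFlag;

// --- foo.cpp: the definition has to keep external linkage to match the
// `extern` declaration above. Writing `static` here is rejected with
// "static declaration follows non-static declaration".
Opt<bool> SharedFlag{"shared-flag", true};

// An option that only one file uses has no `extern` declaration anywhere,
// so marking it `static` is safe.
static Opt<int> LocalLimit{"local-limit", 8};

int fooUsesLocalLimit() { return LocalLimit.Value; }

// --- bar.cpp: reaches the shared option through the declaration in foo.h.
bool barUsesSharedFlag() { return SharedFlag.Value; }
```

So only options without an `extern` declaration in a header are candidates for `static`.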

https://github.com/llvm/llvm-project/pull/126243


[Lldb-commits] [clang] [clang-tools-extra] [compiler-rt] [flang] [libc] [libcxx] [lldb] [llvm] [mlir] [openmp] Fix typos and spelling errors across codebase (PR #156270)

2025-09-02 Thread Joseph Huber via lldb-commits

https://github.com/jhuber6 approved this pull request.

libc and openmp LG

https://github.com/llvm/llvm-project/pull/156270


[Lldb-commits] [lldb] [llvm] [ELF][LLDB] Add an nvsass triple (PR #159459)

2025-09-20 Thread Joseph Huber via lldb-commits

jhuber6 wrote:

> Interesting. Thanks for the feedback and thanks for reverting this change.

If you can make a PR for just the lldb stuff I can review it. I appreciate 
NVIDIA trying to upstream more support, so I'm not trying to dissuade you.

https://github.com/llvm/llvm-project/pull/159459


[Lldb-commits] [lldb] [llvm] Revert "[ELF][LLDB] Add an nvsass triple (#159459)" (PR #159879)

2025-09-20 Thread Joseph Huber via lldb-commits

https://github.com/jhuber6 closed 
https://github.com/llvm/llvm-project/pull/159879
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [lldb] [llvm] [ELF][LLDB] Add an nvsass triple (PR #159459)

2025-09-20 Thread Joseph Huber via lldb-commits

jhuber6 wrote:

I'm assuming the `lldb` changes are fine but should probably just use 
`nvptx64`. We could probably entirely remove `nvptx32` since NVIDIA's removed 
it AFAIK.

https://github.com/llvm/llvm-project/pull/159459


[Lldb-commits] [lldb] [llvm] Revert "[ELF][LLDB] Add an nvsass triple (#159459)" (PR #159879)

2025-09-20 Thread Joseph Huber via lldb-commits

https://github.com/jhuber6 created 
https://github.com/llvm/llvm-project/pull/159879

Summary:
This patch has broken the `libc` build bot. I could work around that but
the changes seem unnecessary.

This reverts commit 9ba844eb3a21d461c3adc7add7691a076c6992fc.


From f0829fd54d9b9b90a73a4229fbca77d17443d4a2 Mon Sep 17 00:00:00 2001
From: Joseph Huber 
Date: Fri, 19 Sep 2025 18:44:31 -0500
Subject: [PATCH] Revert "[ELF][LLDB] Add an nvsass triple (#159459)"

Summary:
This patch has broken the `libc` build bot. I could work around that but
the changes seem unnecessary.

This reverts commit 9ba844eb3a21d461c3adc7add7691a076c6992fc.
---
 lldb/include/lldb/Utility/ArchSpec.h|  6 +-
 lldb/source/Utility/ArchSpec.cpp|  4 
 llvm/include/llvm/Object/ELFObjectFile.h|  6 --
 llvm/include/llvm/TargetParser/Triple.h |  7 +--
 llvm/lib/Object/ELFObjectFile.cpp   |  4 ++--
 llvm/lib/Object/ObjectFile.cpp  |  2 +-
 llvm/lib/TargetParser/TargetDataLayout.cpp  |  1 -
 llvm/lib/TargetParser/Triple.cpp| 15 ---
 llvm/unittests/Object/ELFObjectFileTest.cpp |  2 +-
 9 files changed, 10 insertions(+), 37 deletions(-)

diff --git a/lldb/include/lldb/Utility/ArchSpec.h 
b/lldb/include/lldb/Utility/ArchSpec.h
index 8415eca96ea69..361108fd8f0e7 100644
--- a/lldb/include/lldb/Utility/ArchSpec.h
+++ b/lldb/include/lldb/Utility/ArchSpec.h
@@ -236,8 +236,6 @@ class ArchSpec {
 
 eCore_wasm32,
 
-eCore_nvsass,
-
 kNumCores,
 
 kCore_invalid,
@@ -284,10 +282,8 @@ class ArchSpec {
 kCore_mips64el_last = eCore_mips64r6el,
 
 kCore_mips_first = eCore_mips32,
-kCore_mips_last = eCore_mips64r6el,
+kCore_mips_last = eCore_mips64r6el
 
-kCore_nvsass_first = eCore_nvsass,
-kCore_nvsass_last = eCore_nvsass,
   };
 
   /// Default constructor.
diff --git a/lldb/source/Utility/ArchSpec.cpp b/lldb/source/Utility/ArchSpec.cpp
index 062e74af7d7aa..dfe4351f0c45b 100644
--- a/lldb/source/Utility/ArchSpec.cpp
+++ b/lldb/source/Utility/ArchSpec.cpp
@@ -248,9 +248,6 @@ static constexpr const CoreDefinition g_core_definitions[] 
= {
 
 {eByteOrderLittle, 4, 1, 4, llvm::Triple::wasm32, ArchSpec::eCore_wasm32,
  "wasm32"},
-
-{eByteOrderLittle, 8, 4, 4, llvm::Triple::nvsass, ArchSpec::eCore_nvsass,
- "nvsass"},
 };
 
 // Ensure that we have an entry in the g_core_definitions for each core. If you
@@ -415,7 +412,6 @@ static const ArchDefinitionEntry g_elf_arch_entries[] = {
 {ArchSpec::eCore_riscv64, llvm::ELF::EM_RISCV,  
ArchSpec::eRISCVSubType_riscv64}, // riscv64
 {ArchSpec::eCore_loongarch32, llvm::ELF::EM_LOONGARCH,  
ArchSpec::eLoongArchSubType_loongarch32}, // loongarch32
 {ArchSpec::eCore_loongarch64, llvm::ELF::EM_LOONGARCH,  
ArchSpec::eLoongArchSubType_loongarch64}, // loongarch64
-{ArchSpec::eCore_nvsass,  llvm::ELF::EM_CUDA,   }, // nvsass
 };
 // clang-format on
 
diff --git a/llvm/include/llvm/Object/ELFObjectFile.h 
b/llvm/include/llvm/Object/ELFObjectFile.h
index 7f8f4a2a01fe4..ced1afdd4cc6a 100644
--- a/llvm/include/llvm/Object/ELFObjectFile.h
+++ b/llvm/include/llvm/Object/ELFObjectFile.h
@@ -69,7 +69,7 @@ class LLVM_ABI ELFObjectFileBase : public ObjectFile {
   SubtargetFeatures getLoongArchFeatures() const;
 
   StringRef getAMDGPUCPUName() const;
-  StringRef getCUDACPUName() const;
+  StringRef getNVPTXCPUName() const;
 
 protected:
   ELFObjectFileBase(unsigned int Type, MemoryBufferRef Source);
@@ -1431,7 +1431,9 @@ template <class ELFT> Triple::ArchType 
ELFObjectFile<ELFT>::getArch() const {
   }
 
   case ELF::EM_CUDA: {
-return Triple::nvsass;
+if (EF.getHeader().e_ident[ELF::EI_CLASS] == ELF::ELFCLASS32)
+  return Triple::nvptx;
+return Triple::nvptx64;
   }
 
   case ELF::EM_BPF:
diff --git a/llvm/include/llvm/TargetParser/Triple.h 
b/llvm/include/llvm/TargetParser/Triple.h
index 3b9f55ef9f6ea..ed2e01ccb1e26 100644
--- a/llvm/include/llvm/TargetParser/Triple.h
+++ b/llvm/include/llvm/TargetParser/Triple.h
@@ -110,7 +110,6 @@ class Triple {
 renderscript32, // 32-bit RenderScript
 renderscript64, // 64-bit RenderScript
 ve, // NEC SX-Aurora Vector Engine
-nvsass, // NVIDIA SASS
 LastArchType = ve
   };
   enum SubArchType {
@@ -906,8 +905,6 @@ class Triple {
 
   bool isAMDGPU() const { return getArch() == Triple::r600 || isAMDGCN(); }
 
-  bool isNVSASS() const { return getArch() == Triple::nvsass; }
-
   /// Tests whether the target is Thumb (little and big endian).
   bool isThumb() const {
 return getArch() == Triple::thumb || getArch() == Triple::thumbeb;
@@ -1276,9 +1273,7 @@ class Triple {
   LLVM_ABI bool isCompatibleWith(const Triple &Other) const;
 
   /// Test whether the target triple is for a GPU.
-  bool isGPU() const {
-return isSPIRV() || isNVPTX() || isAMDGPU() || isNVSASS();
-  }
+  bool isGPU() const { return isSPIRV() || isNVPTX() || isAMDGPU(); }
 
   /// Merge target triple

[Lldb-commits] [lldb] [llvm] [ELF][LLDB] Add an nvsass triple (PR #159459)

2025-09-20 Thread Joseph Huber via lldb-commits

jhuber6 wrote:

> @jhuber6 , something I care about for LLDB is being able to use different 
> disassemblers for ptx and sass. The unique situation here is that sass is not 
> part of LLVM code generation, and it seems that's why LLVM is unaware of 
> sass. LLVM can only generate ptx and folks rely on an nvidia proprietary 
> compiler to go from ptx to sass. But at runtime, LLDB sees both sass and 
> nvptx.
> 
> I guess that a simpler patch that would exist only within LLDB is to add two 
> flavors to the nvptx architecture, one for nvptx and for sass in the ArchSpec 
> class in LLDB. I for sure can do that without doing major architectural 
> changes like this.

LLVM's backend can only emit PTX, but as a toolchain we simply export that 
handling to `ptxas` as our assembler. The toolchain handles the binaries as the 
`nvptx64` triple because they correspond to ELF files created with 
`--target=nvptx64-nvidia-cuda`. You can generate SASS yourself with something 
similar to https://godbolt.org/z/ofWE6baPT. This is as far as I understand it 
at least, because as far as I'm aware, if a file has ELF magic and the 
`EM_CUDA` machine, it is SASS.
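
A small sketch of that check, working on raw bytes rather than LLVM's object classes. `classifyImage` and the `k`-prefixed constants are made-up names; the numeric values are the standard ELF header layout and `EM_CUDA = 190`. The 32-bit versus 64-bit split mirrors what `getArch()` does for `EM_CUDA`.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// ELF header layout constants (hypothetical names, standard values).
constexpr std::size_t kClassIndex = 4;     // e_ident[EI_CLASS]
constexpr std::size_t kDataIndex = 5;      // e_ident[EI_DATA]
constexpr std::size_t kMachineOffset = 18; // e_machine follows e_ident and e_type
constexpr std::uint8_t kElfClass32 = 1;    // ELFCLASS32
constexpr std::uint8_t kElfDataMsb = 2;    // ELFDATA2MSB (big endian)
constexpr std::uint16_t kEmCuda = 190;     // EM_CUDA

// Returns "nvptx" or "nvptx64" for an ELF image whose machine is EM_CUDA,
// and "unknown" for anything else (for example textual PTX).
std::string classifyImage(const std::vector<std::uint8_t> &Bytes) {
  if (Bytes.size() < kMachineOffset + 2)
    return "unknown";
  if (Bytes[0] != 0x7f || Bytes[1] != 'E' || Bytes[2] != 'L' || Bytes[3] != 'F')
    return "unknown"; // no ELF magic

  // e_machine is a 16-bit field stored in the file's own byte order.
  const std::uint16_t Lo = Bytes[kMachineOffset];
  const std::uint16_t Hi = Bytes[kMachineOffset + 1];
  const std::uint16_t Machine =
      Bytes[kDataIndex] == kElfDataMsb
          ? static_cast<std::uint16_t>((Lo << 8) | Hi)
          : static_cast<std::uint16_t>((Hi << 8) | Lo);
  if (Machine != kEmCuda)
    return "unknown";

  return Bytes[kClassIndex] == kElfClass32 ? "nvptx" : "nvptx64";
}
```

Under this view a debugger could tell SASS apart from PTX by container (ELF with `EM_CUDA` versus text) without needing a separate triple.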

https://github.com/llvm/llvm-project/pull/159459


[Lldb-commits] [lldb] [llvm] [ELF][LLDB] Add an nvsass triple (PR #159459)

2025-09-19 Thread Joseph Huber via lldb-commits

https://github.com/jhuber6 edited 
https://github.com/llvm/llvm-project/pull/159459