https://github.com/jdoerfert updated
https://github.com/llvm/llvm-project/pull/95371
>From 34c8bf739040b9d3d0bf625cdadf12b282249ccf Mon Sep 17 00:00:00 2001
From: Johannes Doerfert
Date: Fri, 7 Jun 2024 17:06:02 -0700
Subject: [PATCH 1/2] [Offload][CUDA] Add initial cuda_runtime.h overlay
This
https://github.com/jdoerfert updated
https://github.com/llvm/llvm-project/pull/95371
>From d06585044bd6d2dd76d6110bce933e01fd4b333e Mon Sep 17 00:00:00 2001
From: Johannes Doerfert
Date: Mon, 3 Jun 2024 19:52:12 -0700
Subject: [PATCH 1/3] [Offload][CUDA] Allow CUDA kernels to use LLVM/Offload
M
https://github.com/jdoerfert updated
https://github.com/llvm/llvm-project/pull/95371
>From d06585044bd6d2dd76d6110bce933e01fd4b333e Mon Sep 17 00:00:00 2001
From: Johannes Doerfert
Date: Mon, 3 Jun 2024 19:52:12 -0700
Subject: [PATCH 1/3] [Offload][CUDA] Allow CUDA kernels to use LLVM/Offload
M
github-actions[bot] wrote:
:warning: C/C++ code formatter, clang-format found issues in your code.
:warning:
You can test this locally with the following command:
``bash
git-clang-format --diff 7adb7aa494247f2492f6207289ad90cb48807517
705de43498aec79565d6469a00a54e65e988faf8 --
https://github.com/jdoerfert created
https://github.com/llvm/llvm-project/pull/95371
The offload APIs, and the CUDA wrappers in clang, now support "default
streams" per thread (and per device). It should be per context but we
don't really expose that concept yet. The KernelArguments allow an
LLV
@@ -1125,6 +1125,22 @@ void Clang::AddPreprocessingOptions(Compilation &C,
const JobAction &JA,
CmdArgs.push_back("__clang_openmp_device_functions.h");
}
+ if (Args.hasArg(options::OPT_foffload_via_llvm)) {
+// Add llvm_wrappers/* to our system include path. This
@@ -1125,6 +1125,22 @@ void Clang::AddPreprocessingOptions(Compilation &C,
const JobAction &JA,
CmdArgs.push_back("__clang_openmp_device_functions.h");
}
+ if (Args.hasArg(options::OPT_foffload_via_llvm)) {
+// Add llvm_wrappers/* to our system include path. This
@@ -1125,6 +1125,22 @@ void Clang::AddPreprocessingOptions(Compilation &C,
const JobAction &JA,
CmdArgs.push_back("__clang_openmp_device_functions.h");
}
+ if (Args.hasArg(options::OPT_foffload_via_llvm)) {
+// Add llvm_wrappers/* to our system include path. This
llvmbot wrote:
@llvm/pr-subscribers-offload
@llvm/pr-subscribers-clang-driver
Author: Johannes Doerfert (jdoerfert)
Changes
The offload APIs, and the CUDA wrappers in clang, now support "default
streams" per thread (and per device). It should be per context but we
don't really expose that