@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
IsImplicit, Mapper, VarRef, ForDeviceAddr);
};
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;
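For readers skimming the hunk above, here is a minimal sketch of the ordering its comment describes: clauses containing array sections are moved behind all others. `MapClause` and `sortMapClauses` are hypothetical stand-ins (the real code works on Clang AST nodes), and `std::stable_partition` is my choice here, not necessarily what the patch uses.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a map clause; not a Clang type.
struct MapClause {
  std::string Name;
  bool HasArraySection;
};

// Returns a copy of Clauses in which every clause containing an array section
// comes after all clauses without one. Relative order inside each group is
// preserved because the partition is stable.
std::vector<MapClause> sortMapClauses(std::vector<MapClause> Clauses) {
  std::stable_partition(
      Clauses.begin(), Clauses.end(),
      [](const MapClause &C) { return !C.HasArraySection; });
  return Clauses;
}
```

A stable partition keeps the source order within each group, which matters if later processing depends on the order in which the user wrote the clauses.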
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/72697
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
@@ -88,7 +88,7 @@ class TargetOptions {
COV_5 = 500,
};
/// \brief Code object version for AMDGPU.
- CodeObjectVersionKind CodeObjectVersion = CodeObjectVersionKind::COV_None;
+ CodeObjectVersionKind CodeObjectVersion = CodeObjectVersionKind::COV_5;
jhuber6 wrote:
> We're already assigning names to the different scopes; we're just doing it in
> the `__MEMORY_SCOPE_*` preprocessor macros. Replacing `_MEMORY_SCOPE_SYSTEM`
> in the code with `"system"` doesn't really change the nature of how we assign
> names to scopes.
>
> If these are sup
https://github.com/jhuber6 approved this pull request.
Assuming all the tests pass, this LG and has been a long time coming. Thanks
for working on this.
https://github.com/llvm/llvm-project/pull/73000
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/73030
Summary:
We support standalone compilation for the NVPTX architecture using
'nvlink' as our linker. Because of the special handling required to
transform input files to cubins, as nvlink expects for some reason, w
jhuber6 wrote:
I have a review up to fix the issue I was observing in CMake when building
the `libc` project https://github.com/llvm/llvm-project/pull/73030. That is
required for this to work when compiling the test suite.
https://github.com/llvm/llvm-project/pull/73030
@@ -75,8 +75,8 @@ bb.2:
store volatile i32 0, ptr addrspace(1) undef
ret void
}
-; DEFAULTSIZE: .amdhsa_private_segment_fixed_size 4112
-; DEFAULTSIZE: ; ScratchSize: 4112
+; DEFAULTSIZE: .amdhsa_private_segment_fixed_size 16
jhuber6 wrote:
My understandin
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/73030
From ee43e8f9ae90bcd70d46b17cfecb854711a4b1ce Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Tue, 21 Nov 2023 13:45:10 -0600
Subject: [PATCH] [Clang][NVPTX] Allow passing arguments to the linker while
standa
jhuber6 wrote:
> Missing change to clang/docs/LanguageExtensions.rst describing the new
> builtins.
>
Will do.
> Are there any other projects that we might want to coordinate with here? gcc,
> maybe?
Unknown, I've never collaborated with anyone outside of LLVM. I know they have
handling of G
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/72280
From b244d36e78cf3e496a41369855e294a6e5765c6d Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Mon, 6 Nov 2023 07:08:18 -0600
Subject: [PATCH] [Clang] Introduce scoped variants of GNU atomic functions
Summary:
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/72889
From d06171561581d9d15c14f756c8999b478e1d769e Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Mon, 20 Nov 2023 10:12:04 -0600
Subject: [PATCH 1/2] [LinkerWrapper] Accept some needed COFF linker argument
Summar
jhuber6 wrote:
Do COFF / Windows libraries have a special file magic / handling? I may need to
update the extraction code to handle those in a later patch.
https://github.com/llvm/llvm-project/pull/72889
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/73177
Summary:
This patch adds support for registering texture / surface variables from
CUDA / HIP. Additionally, we now properly track the `extern` and `const`
flags that are also used in these runtime functions.
This
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/72889
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/70462
@@ -4078,8 +4092,20 @@ OpenMPIRBuilder::createTargetInit(const
LocationDescription &Loc, bool IsSPMD) {
Constant *DebugIndentionLevelVal = ConstantInt::getSigned(Int16, 0);
Function *Kernel = Builder.GetInsertBlock()->getParent();
- auto [MinThreadsVal, MaxThreadsVal] =
@@ -339,9 +339,33 @@ Error GenericKernelTy::init(GenericDeviceTy &GenericDevice,
ImagePtr = &Image;
- PreferredNumThreads = GenericDevice.getDefaultNumThreads();
+ // Retrieve kernel environment object for the kernel.
+ GlobalTy KernelEnv(std::string(Name) + "_kernel_env
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/70383
https://github.com/jhuber6 approved this pull request.
Lots of churn but looks straightforward enough. Few nits.
https://github.com/llvm/llvm-project/pull/70383
jhuber6 wrote:
> > This is not being handled for AMDGPU Targets.
>
> I'm assuming this is an artifact of passing all arguments to both the host
> target and the offload target? @jhuber6 what's the correct way of filtering
> out irrelevant codegen options?
I don't know what the desired behavior i
https://github.com/jhuber6 approved this pull request.
https://github.com/llvm/llvm-project/pull/70760
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/70799
Summary:
This patch changes the code generation to not emit the stack protector
metadata on unsupported architectures. The issue was caused by system
toolchains emitting stack protector option by default which wou
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/70799
From c791e527ee388659b35707816c0a67bee66dd0da Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Tue, 31 Oct 2023 08:12:01 -0500
Subject: [PATCH] [StackProtector] Do not emit the stack protector on GPU
architect
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/70799
jhuber6 wrote:
> Is there some reason stack protectors don't make sense on GPU targets? Or is
> the issue just that the GPU targets in question don't have the necessary
> runtime support?
It's more of the latter as GPUs don't really have much of a runtime. That's why
I didn't want to complete
jhuber6 wrote:
Small change that affects a lot of tests. LG once these are fixed:
```
MLIR :: Target/LLVMIR/omptarget-region-device-llvm.mlir
MLIR :: Target/LLVMIR/omptarget-byref-bycopy-generation-device.mlir
MLIR :: Target/LLVMIR/omptarget-declare-target-llvm-device.mlir
```
https://github.com
https://github.com/jhuber6 approved this pull request.
LG
https://github.com/llvm/llvm-project/pull/70401
Author: Joseph Huber
Date: 2023-11-01T07:47:25-05:00
New Revision: 47d9fbc04b91fb03b6da294e82c2fb4bca6b6343
URL:
https://github.com/llvm/llvm-project/commit/47d9fbc04b91fb03b6da294e82c2fb4bca6b6343
DIFF:
https://github.com/llvm/llvm-project/commit/47d9fbc04b91fb03b6da294e82c2fb4bca6b6343.diff
@@ -3086,10 +3139,14 @@ Error AMDGPUKernelTy::launchImpl(GenericDeviceTy
&GenericDevice,
// Only COV5 implicitargs needs to be set. COV4 implicitargs are not used.
if (getImplicitArgsSize() == sizeof(utils::AMDGPUImplicitArgsTy)) {
ImplArgs->BlockCountX = NumBlocks;
+
@@ -17468,19 +17468,19 @@ Value *EmitAMDGPUImplicitArgPtr(CodeGenFunction &CGF)
{
/// Emit code based on Code Object ABI version.
/// COV_4: Emit code to use dispatch ptr
/// COV_5: Emit code to use implicitarg ptr
-/// COV_NONE : Emit code to load a global variable "l
@@ -88,7 +88,7 @@ class TargetOptions {
COV_5 = 500,
};
/// \brief Code object version for AMDGPU.
- CodeObjectVersionKind CodeObjectVersion = CodeObjectVersionKind::COV_None;
+ CodeObjectVersionKind CodeObjectVersion = CodeObjectVersionKind::COV_4;
jhuber6 wrote:
Did this change anything for the `scoped_atomic_compare_exchange_n` variant I
added recently?
https://github.com/llvm/llvm-project/pull/74959
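For context on the builtin mentioned above, a hedged sketch of how a scoped compare-exchange might be called from plain C++. As far as I understand the Clang language extensions, `__scoped_atomic_compare_exchange_n` takes the same arguments as `__atomic_compare_exchange_n` plus a trailing memory scope. The helper name `tryClaim` and the fallback path are assumptions for illustration, and the sketch presumes a GCC- or Clang-compatible compiler.

```cpp
#include <cstdint>

// Use the scoped builtin where the compiler provides it; otherwise fall back
// to the plain GNU builtin, which has no scope argument.
#if defined(__has_builtin)
#if __has_builtin(__scoped_atomic_compare_exchange_n)
#define HAVE_SCOPED_CMPXCHG 1
#endif
#endif

// Atomically replaces *Ptr with Desired if it currently equals Expected.
// Returns true when the swap happened. (Hypothetical helper.)
inline bool tryClaim(std::uint32_t *Ptr, std::uint32_t Expected,
                     std::uint32_t Desired) {
#ifdef HAVE_SCOPED_CMPXCHG
  return __scoped_atomic_compare_exchange_n(Ptr, &Expected, Desired,
                                            /*weak=*/false, __ATOMIC_ACQ_REL,
                                            __ATOMIC_ACQUIRE,
                                            __MEMORY_SCOPE_SYSTEM);
#else
  return __atomic_compare_exchange_n(Ptr, &Expected, Desired, /*weak=*/false,
                                     __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
#endif
}
```

On a host target the scope is expected to make no observable difference; it only becomes interesting on GPU targets, where narrower scopes such as `__MEMORY_SCOPE_WRKGRP` can be cheaper.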
jhuber6 wrote:
Is generic the best name here? I feel like that's going to be heavily
overloaded.
https://github.com/llvm/llvm-project/pull/75357
jhuber6 wrote:
> Is generic the best name here? I feel like that's going to be heavily
> overloaded. I'd much prefer a new architecture that just treats "SPIR-V" as a
> single architecture. E.g. `--offload-arch=spirv` or something.
https://github.com/llvm/llvm-project/pull/75357
jhuber6 wrote:
I feel like we should treat `spirv` in the same way we handle stuff like
`sm_90` in the `CudaArch` enum. (We should probably also rename that as it's
used for generic offloading now). OpenMP infers the triple from the arch, so in
the future when OpenMP can handle SPIR-V we can s
jhuber6 wrote:
> Perhaps we should consider prefixing it in some way (e.g. `hip-spirv` or
> `amd-spirv`) that leaves the door open for some special handling (enable a
> particular set of extensions only for amdgpu targeting SPIRV, try to deal
> with missing builtins etc.) / flexibility?
Unsur
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/75363
From 2700151916b0fd91c793930127412af5690c9e41 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Wed, 13 Dec 2023 11:35:13 -0600
Subject: [PATCH 1/2] [LLVM] Add file magic detection for SPIR-V files.
Summary:
Mo
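To illustrate what file magic detection for SPIR-V can look like, a small standalone sketch follows. SPIR-V modules start with the 32-bit magic number `0x07230203`, written in the producer's byte order. The function name `looksLikeSPIRV` is hypothetical and this is not the code from the patch or from LLVM's `Magic.cpp`.

```cpp
#include <cstddef>

// Returns true when the buffer starts with the SPIR-V magic number
// 0x07230203 in either byte order. (Hypothetical helper, not LLVM's API.)
bool looksLikeSPIRV(const unsigned char *Data, std::size_t Size) {
  if (Size < 4)
    return false;
  // Little-endian layout: least significant byte first.
  const bool Little = Data[0] == 0x03 && Data[1] == 0x02 &&
                      Data[2] == 0x23 && Data[3] == 0x07;
  // Big-endian layout: most significant byte first.
  const bool Big = Data[0] == 0x07 && Data[1] == 0x23 &&
                   Data[2] == 0x02 && Data[3] == 0x03;
  return Little || Big;
}
```

Checking both layouts is a design choice in this sketch; a tool that only ever consumes little-endian modules could drop the second comparison.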
jhuber6 wrote:
Added a test, for whatever reason I had to do a completely clean build to get
the test to correctly pick up my changes to `Magic.cpp`, don't know why.
https://github.com/llvm/llvm-project/pull/75363
@@ -209,6 +210,13 @@ void AMDGCN::Linker::ConstructJob(Compilation &C, const
JobAction &JA,
if (JA.getType() == types::TY_LLVM_BC)
return constructLlvmLinkCommand(C, JA, Inputs, Output, Args);
+ if (Args.getLastArgValue(options::OPT_mcpu_EQ) == "generic") {
+llvm::
@@ -129,6 +129,22 @@ AMDGPUOpenMPToolChain::GetCXXStdlibType(const ArgList
&Args) const {
void AMDGPUOpenMPToolChain::AddClangSystemIncludeArgs(
const ArgList &DriverArgs, ArgStringList &CC1Args) const {
HostTC.AddClangSystemIncludeArgs(DriverArgs, CC1Args);
+
+ CC1Args
@@ -3381,6 +3381,8 @@ def fopenmp_cuda_blocks_per_sm_EQ : Joined<["-"],
"fopenmp-cuda-blocks-per-sm=">
Flags<[NoArgumentUnused, HelpHidden]>, Visibility<[ClangOption, CC1Option]>;
def fopenmp_cuda_teams_reduction_recs_num_EQ : Joined<["-"],
"fopenmp-cuda-teams-reduction-recs
@@ -47,7 +47,9 @@ PluginAdaptorTy::create(const std::string &Name) {
new PluginAdaptorTy(Name, std::move(LibraryHandler)));
if (auto Err = PluginAdaptor->init())
return Err;
- return PluginAdaptor;
jhuber6 wrote:
Does putting `std::move` here not
https://github.com/jhuber6 approved this pull request.
https://github.com/llvm/llvm-project/pull/75528
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/75528
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/75757
Summary:
The CPU target currently inherits all the libraries from the normal link
job to ensure that it has access to the same environment that the host
does. However, this previously was not respecting argument l
@@ -458,44 +459,39 @@ void createRegisterFatbinFunction(Module &M,
GlobalVariable *FatbinDesc,
DtorFunc->setSection(".text.startup");
// Get the __cudaRegisterFatBinary function declaration.
- auto *RegFatTy = FunctionType::get(PointerType::getUnqual(C)->getPointerTo(),
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/73374
@@ -457,45 +457,42 @@ void createRegisterFatbinFunction(Module &M,
GlobalVariable *FatbinDesc,
IsHIP ? ".hip.fatbin_unreg" : ".cuda.fatbin_unreg", &M);
DtorFunc->setSection(".text.startup");
+ auto *PtrTy = PointerType::getUnqual(C);
+
// Get the
https://github.com/jhuber6 approved this pull request.
Thanks, this was never properly cleaned up after moving to opaque pointers.
https://github.com/llvm/llvm-project/pull/73374
jhuber6 wrote:
Ping
https://github.com/llvm/llvm-project/pull/73177
jhuber6 wrote:
Ping
https://github.com/llvm/llvm-project/pull/73030
https://github.com/jhuber6 approved this pull request.
AFAIK this is the correct way to set debug information for something that
doesn't have a valid source location like a lot of generated OpenMP calls.
https://github.com/llvm/llvm-project/pull/73856
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/73856
jhuber6 wrote:
> My primary concerns here are:
>
> * It being likely these builtins will be superseded by something else
> once someone else tries to standardize this. Maybe this isn't a big deal...
> but maybe we want to choose names that are less likely to overlap with stuff
> anyone e
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/72280
From ce494cd3f50720b6ba2b8a689f30272c09c06d00 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Mon, 6 Nov 2023 07:08:18 -0600
Subject: [PATCH] [Clang] Introduce scoped variants of GNU atomic functions
Summary:
https://github.com/jhuber6 commented:
Why couldn't you have put this logic in `addLTOOptions`? Seems like it's
copy-pasted verbatim at every site now.
AMD should handle very similarly to Linux here. They both compile down to
LLVM-IR and get sent to `ld.lld`.
https://github.com/llvm/llvm-proje
jhuber6 wrote:
> This fails for me on the host and the AMD GPU:
>
> GPU:
>
> ```
> # | :217:1: note: possible intended match here
> # | dat.datum[dat.arr[0][0]] = 5
> ```
>
> X86:
>
> ```
> # | :134:1: note: possible intended match here
> # | dat.datum[dat.arr[0][0]] = 5461
> ```
>
> The location
https://github.com/jhuber6 commented:
Needs a test. There should be some difference in codegen we can key off of.
https://github.com/llvm/llvm-project/pull/76571
jhuber6 wrote:
> Is the approach taken in this PR acceptable as opposed to the header
> solution I put up earlier?
Yes, it's pretty much exactly what I had in mind from my suggestion in the last
PR. Thanks.
https://github.com/llvm/llvm-project/pull/76571
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/76587
@@ -428,13 +428,22 @@ std::string getPGOFuncNameVarName(StringRef FuncName,
return VarName;
}
+bool isGPUProfTarget(const Module &M) {
+ const auto &triple = M.getTargetTriple();
+ return triple.rfind("nvptx", 0) == 0 || triple.rfind("amdgcn", 0) == 0 ||
+ triple.r
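The `rfind(Prefix, 0) == 0` idiom in the hunk above is a pre-C++20 way to spell "starts with". A minimal standalone sketch, using `std::string` rather than LLVM types: `startsWith` and `isGPUTripleSketch` are hypothetical names, and only the two prefixes visible before the hunk is cut off are checked.

```cpp
#include <string>

// True when Str begins with Prefix. With a start position of 0, rfind can
// only report a match at index 0, so the rest of the string is never scanned.
bool startsWith(const std::string &Str, const std::string &Prefix) {
  return Str.rfind(Prefix, 0) == 0;
}

// Illustrative only: the remaining prefixes are truncated in the hunk, so
// this checks just "nvptx" and "amdgcn".
bool isGPUTripleSketch(const std::string &Triple) {
  return startsWith(Triple, "nvptx") || startsWith(Triple, "amdgcn");
}
```

Inside LLVM itself, constructing an `llvm::Triple` and querying it would probably be the more idiomatic route than string prefix checks, which may be worth raising in review.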
@@ -959,8 +959,14 @@ void CodeGenPGO::emitCounterIncrement(CGBuilderTy
&Builder, const Stmt *S,
unsigned Counter = (*RegionCounterMap)[S];
- llvm::Value *Args[] = {FuncNameVar,
- Builder.getInt64(FunctionHash),
+ // Make sure that pointer to globa
@@ -0,0 +1,21 @@
+//=== Profiling.h - OpenMP interface -- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apa
https://github.com/jhuber6 commented:
Some quick nits, will look more later.
https://github.com/llvm/llvm-project/pull/76587
jhuber6 wrote:
>
> How about using `--offload=` which can take a target triple? E.g.
>
> * `--offload=spirv64-amd` or something like that: pick HIPAMD tool chain.
>
> * `--offload=spirv64`: pick HIPSPV tool chain.
>
>
> And also remove this
> [limitation](https://github.com/llvm/llv
jhuber6 wrote:
Test should probably show that IR is equivalent to `#pragma omp requires
unified_shared_memory` or however that's spelled. Basic documentation should be
provided by the help text in the new flag, but we probably have somewhere in
the OpenMP docs you could add it to if desired.
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/77003
Summary:
We use the OffloadBinary to contain bundled offloading objects used to
support many images / targets at the same time. The `__tgt_device_info`
struct used to contain a pointer to this underlying binary fo
@@ -163,3 +163,87 @@ Error
GenericGlobalHandlerTy::readGlobalFromImage(GenericDeviceTy &Device,
return Plugin::success();
}
+
+bool GenericGlobalHandlerTy::hasProfilingGlobals(GenericDeviceTy &Device,
+ DeviceImageTy &Image) {
@@ -58,6 +60,22 @@ class GlobalTy {
void setPtr(void *P) { Ptr = P; }
};
+typedef void *IntPtrT;
jhuber6 wrote:
What's the utility of this?
https://github.com/llvm/llvm-project/pull/76587
@@ -58,6 +60,22 @@ class GlobalTy {
void setPtr(void *P) { Ptr = P; }
};
+typedef void *IntPtrT;
+struct __llvm_profile_data {
+#define INSTR_PROF_DATA(Type, LLVMType, Name, Initializer) Type Name;
+#include "llvm/ProfileData/InstrProfData.inc"
+};
+
+/// PGO profiling data
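The `#define INSTR_PROF_DATA` followed by `#include` in the hunk above is the X-macro pattern: one field list, expanded differently depending on how the macro is defined at the point of inclusion. A self-contained sketch of the same idea follows. `DEMO_FIELDS` and its three fields are hypothetical and stand in for `InstrProfData.inc`, whose real contents are not reproduced here.

```cpp
#include <cstdint>

// Hypothetical field list standing in for InstrProfData.inc.
#define DEMO_FIELDS(X)                                                         \
  X(std::uint64_t, NameRef)                                                    \
  X(std::uint64_t, FuncHash)                                                   \
  X(std::uint32_t, NumCounters)

// Expansion 1: declare one struct member per list entry.
struct DemoProfileData {
#define DECLARE_FIELD(Type, Name) Type Name;
  DEMO_FIELDS(DECLARE_FIELD)
#undef DECLARE_FIELD
};

// Expansion 2: count the entries of the same list.
constexpr int demoFieldCount() {
#define COUNT_FIELD(Type, Name) +1
  return 0 DEMO_FIELDS(COUNT_FIELD);
#undef COUNT_FIELD
}
```

The benefit is that the struct layout and any code that walks the fields stay in sync, since both are generated from a single list.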
https://github.com/jhuber6 approved this pull request.
TYVM for fixing this. There's a lot of hacky stuff we need to do here to make
it work, but it is what it is.
Guessing the other wrapped files are fine? I remember having problems with
`ctype` and `string` but I hopefully resolved a lot of
@@ -58,6 +60,22 @@ class GlobalTy {
void setPtr(void *P) { Ptr = P; }
};
+typedef void *IntPtrT;
jhuber6 wrote:
Okay, you should use the C++ `using` keyword instead of C's `typedef`.
https://github.com/llvm/llvm-project/pull/76587
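As a quick illustration of the suggestion, the two spellings below name the same type. The alias names `IntPtrOld`, `IntPtrNew`, and `PtrTo` are made up for this sketch so that both forms can coexist in one file.

```cpp
#include <type_traits>

// C-style spelling, as in the hunk under review.
typedef void *IntPtrOld;

// C++11 alias declaration naming the same type.
using IntPtrNew = void *;

// Alias templates have no typedef equivalent, which is one reason the
// `using` form is usually preferred in C++ code.
template <typename T> using PtrTo = T *;

static_assert(std::is_same<IntPtrOld, IntPtrNew>::value, "both are void *");
static_assert(std::is_same<PtrTo<void>, IntPtrNew>::value,
              "PtrTo<void> is void *");
```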
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/76587
https://github.com/jhuber6 approved this pull request.
Accepting this with Fortran makes sense. This option basically controls whether
or not the GPU toolchain will implicitly include the `libcgpu.a` static library
via `-lcgpu`. It defaults to on if it finds the `libc` wrapper headers in the
`
jhuber6 wrote:
> Makes sense to me, though this is not my area of expertise. Could you add a
> bit more elaborate test? Perhaps something that would check the linker
> invocation?
I'm not familiar with how Fortran handles stuff here. It's tested in the
`clang` portion at least. The handling
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/77003
jhuber6 wrote:
> I am gonna sign off for the weekend as it's quite late here, so I'll reply in
> a little more detail on Monday and update the PR further. but I'd be happy to
> add a further flang test, although not too sure what it'd be, so suggestions
> are welcome.
>
> I tested this with a
@@ -19,8 +19,8 @@
// OPENMP-COFF: @__start_omp_offloading_entries = hidden constant [0 x
%struct.__tgt_offload_entry] zeroinitializer, section
"omp_offloading_entries$OA"
// OPENMP-COFF-NEXT: @__stop_omp_offloading_entries = hidden constant [0 x
%struct.__tgt_offload_ent
Author: Joseph Huber
Date: 2024-01-07T08:38:50-06:00
New Revision: 8f76f1816ea63b7cc28e150ba319ffbfe6351f9e
URL:
https://github.com/llvm/llvm-project/commit/8f76f1816ea63b7cc28e150ba319ffbfe6351f9e
DIFF:
https://github.com/llvm/llvm-project/commit/8f76f1816ea63b7cc28e150ba319ffbfe6351f9e.diff
jhuber6 wrote:
My use-case is more to be able to write functions like `is_wavefrontsize64()`
in regular C++ code. This would require some way to emit builtins for these.
I believe the use-case here is a workaround for the issues caused by library
ordering? I'm guessing this is related to the p
jhuber6 wrote:
> I was thinking of implementing libm/libc for nvptx, which would produce an IR
> library . We'll still need to keep the functions around if they are not used
> explicitly, because we may need them to fulfill libcalls later in the
> compilation pipeline. Sort of a libdevice repl
@@ -2011,6 +2011,13 @@ def AMDGPUNumVGPR : InheritableAttr {
let Subjects = SubjectList<[Function], ErrorDiag, "kernel functions">;
}
+def AMDGPULibFun : InheritableAttr {
jhuber6 wrote:
Why isn't this a `TargetSpecificAttr`? We should have one for AMDGPU.
@@ -2693,6 +2693,17 @@ An error will be given if:
}];
}
+def AMDGPULibFunDocs : Documentation {
+ let Category = DocCatAMDGPUAttributes;
+ let Content = [{
+The ``amdgpu_lib_fun`` attribute can be applied to a function for AMDGPU target
+to indicate it is a library functio
jhuber6 wrote:
> > An AMDGPU library function is not internalized and can be used to fulfill
> > calls generated by LLVM passes or instruction selection.
>
> I am confused by the description of "internalized". Do you refer to LTO
> internalization? You can leverage `llvm.used` to disable LTO
Author: Joseph Huber
Date: 2022-06-29T09:34:09-04:00
New Revision: 56ab966a04dd22570fcb18276e2409c94e82c571
URL:
https://github.com/llvm/llvm-project/commit/56ab966a04dd22570fcb18276e2409c94e82c571
DIFF:
https://github.com/llvm/llvm-project/commit/56ab966a04dd22570fcb18276e2409c94e82c571.diff
Author: Joseph Huber
Date: 2022-06-29T14:48:39-04:00
New Revision: 34fc1db9a8b22300a90e71fe7285501e7bcdc90e
URL:
https://github.com/llvm/llvm-project/commit/34fc1db9a8b22300a90e71fe7285501e7bcdc90e
DIFF:
https://github.com/llvm/llvm-project/commit/34fc1db9a8b22300a90e71fe7285501e7bcdc90e.diff
Author: Joseph Huber
Date: 2022-06-29T15:04:26-04:00
New Revision: f892ddb3be640f477fc9acef55e7bd613fc27acf
URL:
https://github.com/llvm/llvm-project/commit/f892ddb3be640f477fc9acef55e7bd613fc27acf
DIFF:
https://github.com/llvm/llvm-project/commit/f892ddb3be640f477fc9acef55e7bd613fc27acf.diff