saiislam added inline comments.
================ Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17143-17145 + llvm::LoadInst *LD; + Constant *Offset, *Offset1; + Value *DP, *DP1; ---------------- arsenm wrote: > Move down to define and initialize There are multiple uses of the same identifier. Defining them four times looks odd. ================ Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:2550-2551 + Error Err = retrieveAllMemoryPools(); + if (Err) + return Plugin::error("Unable to retieve all memmory pools"); + ---------------- jhuber6 wrote: > This and below isn't correct. You can't discard an `llvm::Error` value like > this without either doing `consumeError(std::move(Err))` or > `toString(std::move(Err))`. However, you don't need to consume these in the > first place, they already contain the error message from the callee and > should just be forwarded. Removed the logic for preallocatedheap. ================ Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp:1752 + // if (auto Err = preAllocateDeviceMemoryPool()) + // return Err; + ---------------- jhuber6 wrote: > saiislam wrote: > > jhuber6 wrote: > > > Leftoever? > > No, it is not a left over. > > One of the fields in cov5 implicitikernarg is heap_v1 ptr. It should point > > to a 128KB zero-initialized block of coarse-grained memory on each device > > before launching the kernel. This code was working a while ago, but right > > now it is failing most likely due to some latest change in devicertl memory > > handling mechanism. > > I need to debug it with this patch, otherwise it will cause all target > > region code calling device-malloc to fail. > > I will try to fix it before the next revision. > Do we really need that? We only use a fraction of the existing implicit > arguments. My understanding is that most of these are more for runtime > handling for HIP and OpenCL while we would most likely want our own solution. > I'm assuming that the 128KB is not required for anything we use? I have removed the preallocatedheap work from this patch. ================ Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/utils/UtilitiesRTL.h:36 -// The implicit arguments of AMDGPU kernels. -struct AMDGPUImplicitArgsTy { - uint64_t OffsetX; - uint64_t OffsetY; - uint64_t OffsetZ; - uint64_t HostcallPtr; - uint64_t Unused0; - uint64_t Unused1; - uint64_t Unused2; +enum IMPLICITARGS : uint32_t { + COV4_SIZE = 56, ---------------- jhuber6 wrote: > saiislam wrote: > > jhuber6 wrote: > > > arsenm wrote: > > > > This is getting duplicated a few places, should it move to a support > > > > header? > > > > > > > > I don't love the existing APIs for this, I think a struct definition > > > > makes more sense > > > The other user here is my custom loader, @JonChesterfield has talked > > > about wanting a common HSA helper header for awhile now. > > > > > > I agree that the struct definition is much better. Being able to simply > > > allocate this size and then zero fill it is much cleaner. > > Defining a struct for whole 256 byte of implicitargs in cov5 was becoming a > > little difficult due to different sizes of various fields (2, 4, 6, 8, 48, > > 72 bytes) along with multiple reserved fields in between. It made sense for > > cov4 because it only had 7 fields of 8 bytes each, where we needed only 4th > > field in OpenMP runtime (for hostcall_buffer). > > > > Offset based lookups like the following allows handling/exposing only > > required fields across generations of ABI. > If we don't use it, just put it as `unused`. It's really hard to read as-is > and it makes it more difficult to just zero fill. I have reduced the fields to bare minimum required for OpenMP. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D139730/new/ https://reviews.llvm.org/D139730 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits