jdoerfert accepted this revision. jdoerfert added a comment. This revision is now accepted and ready to land.
LG, with some things to address before the merge though. Didn't we have a pass to expand shared memory (and such)? ================ Comment at: clang/lib/Basic/TargetInfo.cpp:155 + + if (Triple.getVendor() == llvm::Triple::OpenMP_VGPU) + AddrSpaceMap = &llvm::omp::OpenMPVGPUAddrSpaceMap; ---------------- use isOpenMPVGPU ================ Comment at: clang/lib/Basic/Targets/X86.h:395 + return llvm::omp::VirtualGpuGridValues; + } }; ---------------- Do we need the changes in this file at all? I couldn't see why. ================ Comment at: clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp:1125 + llvm::GlobalValue::LinkageTypes Linkage) { + if (CGM.getTarget().getTriple().getVendor() == llvm::Triple::OpenMP_VGPU) + return CGOpenMPRuntime::createOffloadEntry(ID, Addr, Size, Flags, Linkage); ---------------- isOpenMPVGPU ================ Comment at: clang/lib/CodeGen/CodeGenModule.cpp:252 default: - if (LangOpts.OpenMPSimd) + if (getTriple().getVendor() == llvm::Triple::OpenMP_VGPU) + OpenMPRuntime.reset(new CGOpenMPRuntimeGPU(*this)); ---------------- isOpenMPVGPU ================ Comment at: clang/lib/Driver/ToolChains/Gnu.cpp:3076 + + if (getTriple().getVendor() == llvm::Triple::OpenMP_VGPU) { + std::string BitcodeSuffix = getTripleString() + "-openmp_vgpu"; ---------------- isOpenMPVGPU ================ Comment at: openmp/libomptarget/DeviceRTL/src/Synchronization.cpp:323 +constexpr uint32_t UNSET = 0; +constexpr uint32_t SET = 1; + ---------------- Remove these. Also the TODO below (copied from somewhere) ================ Comment at: openmp/libomptarget/plugins/vgpu/src/ThreadEnvironmentImpl.cpp:85 + CTAEnvironmentTy *CTAE) + : ThreadIdInWarp(Idx++ % WE->getNumThreads()), + ThreadIdInBlock(WE->getWarpId() * WE->getNumThreads() + ThreadIdInWarp), ---------------- This is racy, I think. Can we use atomic_add for all these Idx updates or pass the Id from the outside? ================ Comment at: openmp/libomptarget/plugins/vgpu/src/ThreadEnvironmentImpl.h:118 + + // FIXME: This is wrong + LaneMaskTy getActiveMask() const; ---------------- at least add more information what the problem and potential solutions are. ================ Comment at: openmp/libomptarget/plugins/vgpu/src/rtl.cpp:271 + ThreadIdx++) { + Threads.emplace_back([this, GlobalThreadIdx, CTAEnv, WarpEnv]() { + ThreadEnvironment = new ThreadEnvironmentTy(WarpEnv, CTAEnv); ---------------- Move the lambda into a helper function. indention of 12 is too much. ================ Comment at: openmp/libomptarget/plugins/vgpu/src/rtl.cpp:313 + }); + GlobalThreadIdx = (GlobalThreadIdx + 1) % NumThreads; + } ---------------- When do we have more threads than NumThreads? ================ Comment at: openmp/libomptarget/plugins/vgpu/src/rtl.cpp:554 + +int32_t __tgt_rtl_data_delete(int32_t device_id, void *tgt_ptr) { + free(tgt_ptr); ---------------- if we need for submit/retrieve, I'd assume to wait here too. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D113359/new/ https://reviews.llvm.org/D113359 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits