================
@@ -568,32 +590,45 @@ void createRegisterFatbinFunction(Module &M, GlobalVariable *FatbinDesc,
 } // namespace
 
-Error wrapOpenMPBinaries(Module &M, ArrayRef<ArrayRef<char>> Images) {
-  GlobalVariable *Desc = createBinDesc(M, Images);
+Error OffloadWrapper::wrapOpenMPBinaries(
+    Module &M, ArrayRef<ArrayRef<char>> Images,
+    std::optional<EntryArrayTy> EntryArray) const {
+  GlobalVariable *Desc = createBinDesc(
+      M, Images,
+      EntryArray
+          ? *EntryArray
+          : offloading::getOffloadEntryArray(M, "omp_offloading_entries"),
----------------
fabianmcg wrote:

I see what you mean. First, some broader context: this patch is part of a patch series that will add GPU compilation for OMP operations in MLIR without the need for `flang` or `clang`, which is not currently possible. The series also enables JIT compilation of OMP operations in MLIR. The goal of the series is to make OMP target functional in standalone MLIR.

I allow passing a custom entry array because ORC JIT doesn't fully support the `__start` and `__stop` symbols used to group section data. My solution is to accept a custom entry array, so in MLIR I build the full entry array and never rely on sections. This applies to OMP, CUDA and HIP.

Thus, the following MLIR:
```
module attributes {gpu.container_module} {
  gpu.binary @binary <#gpu.offload_embedding<cuda>> [#gpu.object<#nvvm.target, bin = "BLOB">]
  llvm.func @func() {
    %1 = llvm.mlir.constant(1 : index) : i64
    gpu.launch_func @binary::@hello blocks in (%1, %1, %1) threads in (%1, %1, %1) : i64
    gpu.launch_func @binary::@world blocks in (%1, %1, %1) threads in (%1, %1, %1) : i64
    llvm.return
  }
}
```
Produces:
```
@__begin_offload_binary = internal constant [2 x %struct.__tgt_offload_entry] [%struct.__tgt_offload_entry { ptr @binary_Khello, ptr @.omp_offloading.entry_name, i64 0, i32 0, i32 0 }, %struct.__tgt_offload_entry { ptr @binary_Kworld, ptr @.omp_offloading.entry_name.2, i64 0, i32 0, i32 0 }]
@__end_offload_binary = internal constant ptr getelementptr inbounds (%struct.__tgt_offload_entry, ptr @__begin_offload_binary, i64 2)
@.fatbin_image.binary = internal constant [4 x i8] c"BLOB", section ".nv_fatbin"
@.fatbin_wrapper.binary = internal constant %fatbin_wrapper { i32 1180844977, i32 1, ptr @.fatbin_image.binary, ptr null }, section ".nvFatBinSegment", align 8
@.cuda.binary_handle.binary = internal global ptr null
@llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 1, ptr @.cuda.fatbin_reg.binary, ptr null }]
@binary_Khello = weak constant i8 0
@.omp_offloading.entry_name = internal unnamed_addr constant [6 x i8] c"hello\00"
@binary_Kworld = weak constant i8 0
@.omp_offloading.entry_name.2 = internal unnamed_addr constant [6 x i8] c"world\00"
...
```
And this works.

https://github.com/llvm/llvm-project/pull/78057
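
To illustrate the idea of preferring an explicitly built entry array over section-delimited symbols, here is a minimal standalone sketch. It does not use the LLVM API: `OffloadEntry`, `EntryRange`, `selectEntries`, and `collectNames` are hypothetical names introduced only for this example, and the struct layout merely mirrors the fields visible in the `%struct.__tgt_offload_entry` initializers above.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in with the same field order as the
// %struct.__tgt_offload_entry initializers (ptr, ptr, i64, i32, i32).
struct OffloadEntry {
  const void *Addr;
  const char *Name;
  std::uint64_t Size;
  std::int32_t Flags;
  std::int32_t Reserved;
};

// A half-open [begin, end) range of entries, analogous to the
// __begin_offload_binary / __end_offload_binary pair.
using EntryRange = std::pair<const OffloadEntry *, const OffloadEntry *>;

// Assumption: this mirrors the `EntryArray ? *EntryArray : <section lookup>`
// choice in the diff. Use the caller-built array when one is supplied,
// otherwise fall back to the section-derived range.
EntryRange selectEntries(std::optional<EntryRange> Custom,
                         EntryRange SectionRange) {
  return Custom ? *Custom : SectionRange;
}

// Walk the range the way a runtime would walk the entries between the
// begin and end pointers, collecting each entry's name.
std::vector<std::string> collectNames(EntryRange R) {
  std::vector<std::string> Names;
  for (const OffloadEntry *E = R.first; E != R.second; ++E) {
    assert(E->Name && "every entry is expected to carry a name");
    Names.emplace_back(E->Name);
  }
  return Names;
}
```

With a two-element table for `hello` and `world` passed as the custom range, `collectNames(selectEntries(Custom, SectionRange))` yields those two names regardless of what the section-derived range contains, which is the property that lets the JIT path avoid section symbols.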