[Lldb-commits] [lldb] [MLIR] Enabling Intel GPU Integration. (PR #65539)
https://github.com/keryell commented:

Quite interesting! At some point it would be nice to have a design document or some documentation explaining how all these MLIR runners work, including this one. Overall, this PR adds a SYCL runner, but it is very specific to Intel Level Zero. It would be nice to have some generalization in the future, such as SYCL using the OpenCL interoperability interface to run the SPIR-V kernels, or even native kernels.

https://github.com/llvm/llvm-project/pull/65539
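For context, the generalization keryell suggests would swap the Level Zero interop used later in this thread for the standard SYCL–OpenCL interop path. A minimal, hedged sketch of what that could look like; the helper name is invented, error handling is omitted, and the exact interop return types and headers are assumptions about a DPC++-style implementation, not code from this PR:

```cpp
#include <CL/cl.h>
#include <sycl/sycl.hpp>

// Illustrative only: compile a SPIR-V blob through the OpenCL interop
// interface and hand the resulting cl_kernel back to SYCL as a sycl::kernel.
sycl::kernel makeKernelViaOpenCL(const sycl::context &ctx, const void *spirv,
                                 size_t spirvSize, const char *kernelName) {
  cl_context clCtx = sycl::get_native<sycl::backend::opencl>(ctx);
  cl_int err = CL_SUCCESS;
  // clCreateProgramWithIL accepts SPIR-V on OpenCL 2.1+ platforms.
  cl_program program = clCreateProgramWithIL(clCtx, spirv, spirvSize, &err);
  clBuildProgram(program, 0, nullptr, "", nullptr, nullptr);
  cl_kernel clKern = clCreateKernel(program, kernelName, &err);
  // Wrap the OpenCL kernel so it can be launched on any queue built from ctx,
  // independently of which vendor runtime sits underneath.
  return sycl::make_kernel<sycl::backend::opencl>(clKern, ctx);
}
```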
@@ -116,6 +116,7 @@ add_definitions(-DMLIR_ROCM_CONVERSIONS_ENABLED=${MLIR_ENABLE_ROCM_CONVERSIONS})
 set(MLIR_ENABLE_CUDA_RUNNER 0 CACHE BOOL "Enable building the mlir CUDA runner")
 set(MLIR_ENABLE_ROCM_RUNNER 0 CACHE BOOL "Enable building the mlir ROCm runner")
+set(MLIR_ENABLE_SYCL_RUNNER 0 CACHE BOOL "Enable building the mlir Sycl runner")

keryell wrote:

Please spell SYCL correctly.

```suggestion
set(MLIR_ENABLE_SYCL_RUNNER 0 CACHE BOOL "Enable building the mlir SYCL runner")
```

One could argue that `mlir` should be spelled `MLIR`, but that train seems to have left a long time ago. :-)
@@ -0,0 +1,68 @@
+# CMake find_package() module for SYCL Runtime
+#
+# Example usage:
+#
+# find_package(SyclRuntime)

keryell wrote:

Shouldn't it be

```suggestion
# find_package(SYCLRuntime)
```

everywhere?
@@ -0,0 +1,223 @@
+//===- SyclRuntimeWrappers.cpp - MLIR SYCL wrapper library ---------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Implements C wrappers around the sycl runtime library.
+//
+//===----------------------------------------------------------------------===//
+
+#include <cassert>
+#include <cstdint>
+#include <cstdio>
+#include <cstdlib>
+#include <stdexcept>
+#include <string>
+
+#include <level_zero/ze_api.h>
+#include <sycl/ext/oneapi/backend/level_zero.hpp>
+#include <sycl/sycl.hpp>
+
+#ifdef _WIN32
+#define SYCL_RUNTIME_EXPORT __declspec(dllexport)
+#else
+#define SYCL_RUNTIME_EXPORT
+#endif // _WIN32
+
+namespace {
+
+// Run `func`, turning any escaping exception into a diagnostic plus abort().
+template <typename F>
+auto catchAll(F &&func) {
+  try {
+    return func();
+  } catch (const std::exception &e) {
+    fprintf(stdout, "An exception was thrown: %s\n", e.what());
+    fflush(stdout);
+    abort();
+  } catch (...) {
+    fprintf(stdout, "An unknown exception was thrown\n");
+    fflush(stdout);
+    abort();
+  }
+}
+
+#define L0_SAFE_CALL(call)                                                     \
+  {                                                                            \
+    ze_result_t status = (call);                                               \
+    if (status != ZE_RESULT_SUCCESS) {                                         \
+      fprintf(stdout, "L0 error %d\n", status);                                \
+      fflush(stdout);                                                          \
+      abort();                                                                 \
+    }                                                                          \
+  }
+
+} // namespace
+
+static sycl::device getDefaultDevice() {
+  auto platformList = sycl::platform::get_platforms();
+  for (const auto &platform : platformList) {
+    auto platformName = platform.get_info<sycl::info::platform::name>();
+    bool isLevelZero = platformName.find("Level-Zero") != std::string::npos;
+    if (!isLevelZero)
+      continue;
+
+    return platform.get_devices()[0];
+  }
+  throw std::runtime_error("getDefaultDevice failed");
+}
+
+#pragma clang diagnostic push
+#pragma clang diagnostic ignored "-Wglobal-constructors"
+
+// Create global device and context
+sycl::device syclDevice = getDefaultDevice();
+sycl::context syclContext = sycl::context(syclDevice);
+
+#pragma clang diagnostic pop
+
+struct QUEUE {
+  sycl::queue syclQueue_;
+
+  QUEUE() { syclQueue_ = sycl::queue(syclContext, syclDevice); }
+};
+
+static void *allocDeviceMemory(QUEUE *queue, size_t size, bool isShared) {
+  void *memPtr = nullptr;
+  if (isShared) {
+    memPtr = sycl::aligned_alloc_shared(64, size, syclDevice, syclContext);
+  } else {
+    memPtr = sycl::aligned_alloc_device(64, size, syclDevice, syclContext);
+  }
+  if (memPtr == nullptr) {
+    throw std::runtime_error("mem allocation failed!");
+  }
+  return memPtr;
+}
+
+static void deallocDeviceMemory(QUEUE *queue, void *ptr) {
+  sycl::free(ptr, queue->syclQueue_);
+}
+
+static ze_module_handle_t loadModule(const void *data, size_t dataSize) {
+  assert(data);
+  ze_module_handle_t zeModule;
+  ze_module_desc_t desc = {ZE_STRUCTURE_TYPE_MODULE_DESC,
+                           nullptr,
+                           ZE_MODULE_FORMAT_IL_SPIRV,
+                           dataSize,
+                           (const uint8_t *)data,
+                           nullptr,
+                           nullptr};
+  auto zeDevice =
+      sycl::get_native<sycl::backend::ext_oneapi_level_zero>(syclDevice);
+  auto zeContext =
+      sycl::get_native<sycl::backend::ext_oneapi_level_zero>(syclContext);
+  L0_SAFE_CALL(zeModuleCreate(zeContext, zeDevice, &desc, &zeModule, nullptr));
+  return zeModule;
+}
+
+static sycl::kernel *getKernel(ze_module_handle_t zeModule, const char *name) {
+  assert(zeModule);
+  assert(name);
+  ze_kernel_handle_t zeKernel;
+  sycl::kernel *syclKernel;
+  ze_kernel_desc_t desc = {};
+  desc.pKernelName = name;
+
+  L0_SAFE_CALL(zeKernelCreate(zeModule, &desc, &zeKernel));
+  sycl::kernel_bundle<sycl::bundle_state::executable> kernelBundle =
+      sycl::make_kernel_bundle<sycl::backend::ext_oneapi_level_zero,
+                               sycl::bundle_state::executable>({zeModule},
+                                                               syclContext);
+
+  auto kernel = sycl::make_kernel<sycl::backend::ext_oneapi_level_zero>(
+      {kernelBundle, zeKernel}, syclContext);
+  syclKernel = new sycl::kernel(kernel);
+  return syclKernel;
+}
+
+static void launchKernel(QUEUE *queue, sycl::kernel *kernel, size_t gridX,
+                         size_t gridY, size_t gridZ, size_t blockX,
+                         size_t blockY, size_t blockZ, size_t sharedMemBytes,
+                         void **params, size_t paramsCount) {
+  auto syclGlobalRange =
+      ::sycl::range<3>(blockZ * gridZ, blockY * gridY, blockX * gridX);
+  auto syclLocalRange = ::sycl::range<3>(blockZ, blockY, blockX);
+  sycl::nd_range<3> syclNdRange(
+      sycl::nd_range<3>(syclGlobalRange, syclLocalRange));
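The archived message cuts the hunk off inside `launchKernel`. As a rough sketch of how such an interop launch is typically finished (an assumption based on common SYCL usage, reusing the `QUEUE` struct from the hunk above, not the PR's verbatim code):

```cpp
// Hypothetical completion, for illustration only: bind each kernel argument by
// index, then dispatch the interop kernel built by getKernel() on the
// wrapper's queue. The PR's actual argument marshalling may differ.
static sycl::event launchKernelSketch(QUEUE *queue, sycl::kernel *kernel,
                                      sycl::nd_range<3> ndRange, void **params,
                                      size_t paramsCount) {
  return queue->syclQueue_.submit([&](sycl::handler &cgh) {
    for (size_t i = 0; i < paramsCount; ++i) {
      // Each params[i] is assumed to point at a USM pointer coming from the
      // lowered memref descriptor.
      cgh.set_arg(static_cast<int>(i), *(static_cast<void **>(params[i])));
    }
    cgh.parallel_for(ndRange, *kernel);
  });
}
```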
@@ -811,8 +812,13 @@ LogicalResult ConvertAllocOpToGpuRuntimeCallPattern::matchAndRewrite(
   // descriptor.
   Type elementPtrType = this->getElementPtrType(memRefType);
   auto stream = adaptor.getAsyncDependencies().front();
+
+  auto isHostShared = rewriter.create<LLVM::ConstantOp>(
+      loc, llvmInt64Type, rewriter.getI64IntegerAttr(isShared));
+
   Value allocatedPtr =
-      allocCallBuilder.create(loc, rewriter, {sizeBytes, stream}).getResult();
+      allocCallBuilder.create(loc, rewriter, {sizeBytes, stream, isHostShared})
+          .getResult();

keryell wrote:

Technically, SYCL provides more abstract memory management with `sycl::buffer` and `sycl::accessor`, which define an implicit asynchronous task graph. The allocation details, including whether the allocation itself is asynchronous or synchronous, are left to the implementation. Here the lower-level synchronous USM memory-management API of SYCL is used instead, similar to CUDA/HIP memory management. So, should the `async` allocation in the example be synchronous instead?
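For readers less familiar with SYCL, a short sketch of the two memory models keryell contrasts may help. This is illustrative code, not part of the PR; the function names and sizes are arbitrary:

```cpp
#include <sycl/sycl.hpp>
#include <vector>

// Buffer/accessor model: the runtime derives an asynchronous task graph from
// accessor dependencies; allocation and data movement stay implicit.
void bufferModel(sycl::queue &q, std::vector<float> &host) {
  sycl::buffer<float, 1> buf(host.data(), sycl::range<1>(host.size()));
  q.submit([&](sycl::handler &cgh) {
    sycl::accessor acc(buf, cgh, sycl::write_only);
    cgh.parallel_for(sycl::range<1>(host.size()),
                     [=](sycl::id<1> i) { acc[i] = float(i[0]); });
  });
} // ~buffer waits for the kernel and copies the data back into `host`

// USM model, as used by the runner wrappers: explicit, synchronous allocation,
// much closer to cudaMalloc/hipMalloc.
void usmModel(sycl::queue &q, size_t n) {
  float *dev = sycl::malloc_device<float>(n, q);
  q.parallel_for(sycl::range<1>(n),
                 [=](sycl::id<1> i) { dev[i] = float(i[0]); })
      .wait();
  sycl::free(dev, q);
}
```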
keryell wrote:

I guess that if the runtime actually uses a synchronous allocation behind the scenes and produces an always-ready async token, it works, even if it is not optimal.
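A sketch of the pattern keryell is describing, building on the wrapper helpers quoted earlier in the thread (`QUEUE`, `catchAll`, `allocDeviceMemory`, `SYCL_RUNTIME_EXPORT`). The entry-point name and signature are assumptions for illustration, not the PR's exported symbol:

```cpp
// Hypothetical exported alloc wrapper: the allocation is synchronous, so the
// stream/async token that the GPU-to-LLVM lowering threads through is accepted
// but carries no real dependency -- it is trivially "always ready".
extern "C" SYCL_RUNTIME_EXPORT void *
mgpuMemAllocSketch(uint64_t sizeBytes, QUEUE *stream, bool isHostShared) {
  return catchAll([&]() {
    // Synchronous USM allocation; the pointer is usable as soon as this
    // returns, regardless of what the async token is later awaited on.
    return allocDeviceMemory(stream, static_cast<size_t>(sizeBytes),
                             isHostShared);
  });
}
```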