jdoerfert added a comment.

In D47849#1435770 <https://reviews.llvm.org/D47849#1435770>, @hfinkel wrote:

> We need to make progress on this, and I'd like to suggest a path forward...
>
> First, we have a fundamental problem here: Using host headers to declare
> functions for the device execution environment isn't sound. Those host
> headers can do anything, and while some platforms might provide a way to make
> the host headers more friendly (e.g., by defining __NO_MATH_INLINES), these
> mechanisms are neither robust nor portable. Thus, we should not rely on host
> headers to define functions that might be available on the device. However,
> even when compiling for the device, code meant only for host execution must
> be semantically analyzable. This, in general, requires the host headers. So
> we have a situation in which we must both use the host headers during device
> compilation (to keep the semantic analysis of the surrounding host code
> working) and also can't use the host headers to provide definitions for use
> for device code (e.g., because those host headers might provide definitions
> relying on host inline asm, intrinsics, using types not lowerable in device
> code, could provide declarations using linkage-affecting attributes not
> lowerable for the device, etc.).
>
> This is, or is very similar to, the problem that the host/device overloading
> addresses in CUDA. It is also the problem, or very similar to the problem,
> that the new OpenMP 5 `declare variant` directive is intended to address.
> Johannes and I discussed this earlier today, and I suggest that we:
>
> 1. Add a math.h wrapper to clang/lib/Headers, which generally just does an
> include_next of math.h, but provides us with the ability to customize this
> behavior. Writing a header for OpenMP on NVIDIA GPUs which is essentially
> identical to the math.h functions in __clang_cuda_device_functions.h would be
> unfortunate, and as CUDA does provide the underlying execution environment
> for OpenMP target offload on NVIDIA GPUs, duplicative even in principle.
> We don't need to alter the default global namespace, however, but can include
> this file from the wrapper math.h.

I imagine this to look something along the lines of:

  // File: clang/lib/Headers/math.h
  #ifdef CUDA
    #include "CUDA_INCLUDE_DIR/cuda_math.h"
  #elifdef ...
    ...
  #endif
  #include_next "math.h"

So a Clang-internal `math.h` wrapper which, depending on the target, includes
all "math.h" headers in the right order. Overload resolution should pick the
right version even if several are declared.

Repository:
  rC Clang

https://reviews.llvm.org/D47849
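
For reference, the OpenMP 5 `declare variant` directive mentioned in the quoted
text can be sketched roughly as follows. This is only an illustration of the
directive, not what the wrapper header would contain: `square` and
`square_device` are hypothetical names, and both versions deliberately compute
the same result so the behavior does not depend on which one is selected.

```c
/* Hypothetical device-side implementation; the name is illustrative only. */
double square_device(double x) { return x * x; }

/* Base version, used on the host. With OpenMP 5.0 support enabled, a call made
   in a non-host device context may be redirected to square_device. Without
   OpenMP the pragma is ignored and the base version is always called. */
#pragma omp declare variant(square_device) match(device={kind(nohost)})
double square(double x) { return x * x; }
```

Callers only ever write `square(x)`; choosing between the base function and
the variant is left to the compiler, which is the property a `math.h` wrapper
would want for the device versions of the math functions.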