jdoerfert added a comment.

In D47849#1435770 <https://reviews.llvm.org/D47849#1435770>, @hfinkel wrote:

> We need to make progress on this, and I'd like to suggest a path forward...
>
> First, we have a fundamental problem here: Using host headers to declare 
> functions for the device execution environment isn't sound. Those host 
> headers can do anything, and while some platforms might provide a way to make 
> the host headers more friendly (e.g., by defining __NO_MATH_INLINES), these 
> mechanisms are neither robust nor portable. Thus, we should not rely on host 
> headers to define functions that might be available on the device. However, 
> even when compiling for the device, code meant only for host execution must 
> be semantically analyzable. This, in general, requires the host headers. So 
> we have a situation in which we must both use the host headers during device 
> compilation (to keep the semantic analysis of the surrounding host code 
> working) and also can't use the host headers to provide definitions for use 
> for device code (e.g., because those host headers might provide definitions 
> relying on host inline asm, intrinsics, using types not lowerable in device 
> code, could provide declarations using linkage-affecting attributes not 
> lowerable for the device, etc.).
>
> This is, or is very similar to, the problem that the host/device overloading 
> addresses in CUDA. It is also the problem, or very similar to the problem, 
> that the new OpenMP 5 `declare variant` directive is intended to address. 
> Johannes and I discussed this earlier today, and I suggest that we:
>
> 1. Add a math.h wrapper to clang/lib/Headers, which generally just does an 
> include_next of math.h, but provides us with the ability to customize this 
> behavior. Writing a header for OpenMP on NVIDIA GPUs which is essentially 
> identical to the math.h functions in __clang_cuda_device_functions.h would be 
> unfortunate, and as CUDA does provide the underlying execution environment 
> for OpenMP target offload on NVIDIA GPUs, duplicative even in principle. We 
> don't need to alter the default global namespace, however, but can include 
> this file from the wrapper math.h.


I imagine this would look something along the lines of:

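  // File: clang/lib/Headers/math.h
  
  // Target-specific declarations come first.
  #ifdef CUDA
    #include "CUDA_INCLUDE_DIR/cuda_math.h"
  #elifdef ...
    ...
  #endif
  
  // Then continue to the next math.h on the include path (the host header).
  #include_next "math.h"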

So this would be a Clang-internal `math.h` wrapper which, depending on the 
target, includes all the relevant "math.h" headers in the right order.
Overload resolution should then pick the right version even if multiple 
versions are declared.
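
To make the layering a bit more concrete, here is a minimal single-file C++ sketch of the same idea. It only models the ordering (target-specific definitions first, host math library as the fallback), not target-based overload resolution, which plain C++ cannot express. The macro `SKETCH_TARGET_DEVICE` and everything in namespace `sketch` are hypothetical names made up for illustration; they are not Clang macros or part of any real header.

```cpp
// Sketch of the layering only; this is not the actual clang/lib/Headers/math.h.
// SKETCH_TARGET_DEVICE and the names in namespace sketch are hypothetical.
#include <cassert>
#include <cmath>  // stands in for the host header reached via #include_next

namespace sketch {

#if defined(SKETCH_TARGET_DEVICE)
// Stand-in for a target-specific header that the wrapper would include first.
constexpr bool uses_device_path() { return true; }
inline float rsqrt(float x) {
  assert(x > 0.0f && "rsqrt expects a positive argument");
  // A real device header would map this onto a device routine instead.
  return 1.0f / std::sqrt(x);
}
#else
// Fallback: build the function on top of the host math library.
constexpr bool uses_device_path() { return false; }
inline float rsqrt(float x) {
  assert(x > 0.0f && "rsqrt expects a positive argument");
  return 1.0f / std::sqrt(x);
}
#endif

}  // namespace sketch
```

Callers just use `sketch::rsqrt(x)` and never mention the target; which definition they get is decided entirely by what the wrapper layer made visible. Compiling with `-DSKETCH_TARGET_DEVICE` flips the selection without touching any call site.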


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D47849/new/

https://reviews.llvm.org/D47849


