yaxunl added a comment.

In D102507#2838981 <https://reviews.llvm.org/D102507#2838981>, @tra wrote:

> The key difference between C++ and CUDA/HIP, as implemented in clang, is that 
> `__host__` and `__device__` attributes are considered during function 
> overloading in CUDA and HIP, so `__host__ void foo()`, `__device__ void 
> foo()` and `__host__ __device__ void foo()` are three different functions and 
> not redeclarations of the same function. Details of the original proposal are 
> here:  https://goo.gl/EXnymm.
>
> In D102507#2838776 <https://reviews.llvm.org/D102507#2838776>, @yaxunl wrote:
>
>> 2. Some libc++ functions are mostly common for device or host with minor 
>> differences. For such functions, we can make them `__device__ __host__` and 
>> use `#if __HIP_DEVICE_COMPILE__` (indicating device compilation) for the 
>> minor difference in the function body.
>
> I think we should rely on target overloading when possible, instead of the 
> preprocessor. Minimizing the differences between the code seen by the 
> compiler during host-side and device-side compilation will minimize potential 
> issues. Which approach we end up using is an implementation detail.

Agree.
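
For illustration, here is a minimal sketch of the two approaches side by side. The names `example_abs_pp`, `example_abs_ov` and `EXAMPLE_TARGET_OVERLOADS` are hypothetical, and the sketch assumes clang defines `__HIP__` and makes `__host__`/`__device__` available when compiling HIP sources. It is not taken from the patch. With any other compiler the guards fall back to plain C++.

```cpp
#include <cstdlib>

// Assumption: __HIP__ is defined only when the file is compiled as HIP.
#if defined(__HIP__)
#define EXAMPLE_TARGET_OVERLOADS 1
#else
#define EXAMPLE_TARGET_OVERLOADS 0
#endif

// Variant A (preprocessor): a single __host__ __device__ function whose body
// differs slightly between host and device compilation.
#if EXAMPLE_TARGET_OVERLOADS
__host__ __device__
#endif
inline int example_abs_pp(int x) {
#if __HIP_DEVICE_COMPILE__
  return x < 0 ? -x : x;  // device compilation: avoid the host library call
#else
  return std::abs(x);     // host compilation, and plain C++
#endif
}

// Variant B (target overloading): host and device compilation see the same
// source text; the __host__ and __device__ versions are separate overloads.
#if EXAMPLE_TARGET_OVERLOADS
__device__ inline int example_abs_ov(int x) { return x < 0 ? -x : x; }
__host__ inline int example_abs_ov(int x) { return std::abs(x); }
#else
inline int example_abs_ov(int x) { return std::abs(x); }  // plain C++ fallback
#endif
```

In variant B the function bodies do not depend on `__HIP_DEVICE_COMPILE__`, which is the property argued for above.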

>> 3. Some libc++ functions have different implementations for device and host. 
>> We can leave these host functions as they are and add overloaded 
>> `__device__` functions.
>>
>> There are two ways to mark libc++ functions as `__device__ __host__`:
>>
>> 1. Define a macro which expands to empty for non-HIP programs and expands to 
>> `__device__ __host__` for HIP and add it to each libc++ function which is to 
>> be marked as `__device__ __host__`.
>
> One caveat of overloading based on target attributes is that we can't 
> re-declare a function with `__device__ __host__`, as the compiler will see 
> the attempted redeclaration as an overload of a function w/o attributes 
> (implicitly `__host__`).

If we keep all the declarations consistent we should be fine.
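
A minimal sketch of what "consistent" could look like with the macro approach (way 1 above). `EXAMPLE_HOST_DEVICE` and `example_clamp` are hypothetical names, not libc++ identifiers, and the `__HIP__` check is an assumption about how HIP compilation would be detected.

```cpp
// Hypothetical macro: empty for non-HIP programs, __host__ __device__ for HIP.
#if defined(__HIP__)
#define EXAMPLE_HOST_DEVICE __host__ __device__
#else
#define EXAMPLE_HOST_DEVICE
#endif

// Every declaration and the definition carry the same macro, so all of them
// refer to one function in both HIP and non-HIP compilation.
EXAMPLE_HOST_DEVICE int example_clamp(int v, int lo, int hi);

EXAMPLE_HOST_DEVICE int example_clamp(int v, int lo, int hi) {
  return v < lo ? lo : (v > hi ? hi : v);
}

// The pattern to avoid, per the caveat quoted above (comment only):
//   int example_clamp(int, int, int);                      // implicitly __host__
//   __host__ __device__ int example_clamp(int, int, int);  // not a redeclaration
```

The point is that the attribute set never changes between a declaration and a later redeclaration or definition.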

>> 2. Define macros which expand to empty for non-HIP programs and expand to 
>> `#pragma clang force_cuda_host_device begin/end` for HIP and put them at the 
>> beginning and end of a file where all the functions are to be marked as 
>> `__device__ __host__`.
>>
>> We plan to implement libc++ support in HIP device compilation 
>> progressively, header by header, and document the supported libc++ 
>> headers. We will prioritize libc++ headers to support based on 1) user 
>> requests 2) whether it has already been supported through clang wrapper 
>> headers (patching) 3) usefulness for device execution 4) availability of 
>> lower level support with HIP runtime.
>
> All of the above applies to CUDA, modulo the macro names and some differences 
> in the builtins and the functions provided (or not) by the runtime on the GPU 
> side.
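
For the pragma-based approach (way 2 above), a macro body cannot contain a `#pragma` directive, so a sketch would likely use the C++11 `_Pragma` operator instead. The macro names and the two functions below are hypothetical, and the sketch assumes clang accepts `force_cuda_host_device` through `_Pragma` when compiling HIP or CUDA.

```cpp
// Hypothetical begin/end macros: empty for non-HIP/non-CUDA programs.
#if defined(__HIP__) || defined(__CUDA__)
#define EXAMPLE_BEGIN_HOST_DEVICE _Pragma("clang force_cuda_host_device begin")
#define EXAMPLE_END_HOST_DEVICE _Pragma("clang force_cuda_host_device end")
#else
#define EXAMPLE_BEGIN_HOST_DEVICE
#define EXAMPLE_END_HOST_DEVICE
#endif

EXAMPLE_BEGIN_HOST_DEVICE

// Functions between the two macros need no per-function annotation; under
// HIP/CUDA the pragma is expected to treat them as __host__ __device__.
inline int example_max(int a, int b) { return a < b ? b : a; }
inline int example_min(int a, int b) { return b < a ? b : a; }

EXAMPLE_END_HOST_DEVICE
```

Compared with the per-function macro, this touches only the beginning and end of a header, at the cost of marking every function in between.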




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102507/new/

https://reviews.llvm.org/D102507
