tra added a comment.

In https://reviews.llvm.org/D42319#983377, @jlebar wrote:

> How does this affect e.g. calling memcpy()?  There isn't a standard library 
> implementation of this on nvptx, but we do want calls to memcpy() to be 
> lowered to llvm.memcpy so that they can be optimized.


We implement memcpy as a call to __builtin_memcpy(), which gets code-gen-ed as 
usual.  NVPTX also lowers all memcpy/memset/memmove to loads/stores, so those 
don't need an external library. This behavior is not affected by this patch.
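
For illustration, here is a minimal sketch of the kind of device-side code the 
memcpy case refers to. The Pair type, copy_pair() and copy_kernel() are 
hypothetical names made up for this example, and the macro fallback is only 
there so the snippet also builds as plain C++ on the host. The expectation, per 
the paragraph above, is that the fixed-size memcpy() becomes llvm.memcpy in IR 
and is then expanded to loads/stores for NVPTX; the exact lowering may vary by 
compiler version.

```cpp
#include <cstring>  // declares memcpy

// Fallback so this sketch also builds as plain C++ on the host.
// Assumption: __CUDACC__ is defined whenever the file is compiled as CUDA.
#ifndef __CUDACC__
#define __host__
#define __device__
#define __global__
#endif

// Hypothetical payload type, used only for this example.
struct Pair {
  float x;
  int y;
};

// A small fixed-size copy. On the device side this call is expected to be
// emitted as llvm.memcpy and later turned into plain loads/stores, so no
// external library routine should be required.
__host__ __device__ inline void copy_pair(Pair *dst, const Pair *src) {
  memcpy(dst, src, sizeof(Pair));
}

// Hypothetical kernel that exercises the copy in device code.
__global__ void copy_kernel(Pair *dst, const Pair *src) {
  copy_pair(dst, src);
}
```

Compiling something like this for the device and inspecting the generated IR 
should show no call to an external memcpy symbol.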

This patch's goal is to prevent clang from codegen-ing its own idea of the 
library builtin function while ignoring the implementation we've provided in 
the device-side headers.

The original issue I ran into was triggered by code roughly similar to this:

  extern "C" __device__ float logf(float a) { return __nv_logf(a); }
  __global__ void kernel() { logf(0.0f); }

In the AST, the kernel was calling the logf function above. However, when clang 
generated code, it treated logf as a library builtin with known semantics and 
happily codegen'ed a call to @llvm.log.f32, which the NVPTX back-end has no way 
to lower. The patch adds a safety net in clang so that it does not generate 
such code for builtins which we have disabled (or can't handle) in NVPTX.


https://reviews.llvm.org/D42319


