Hahnfeld added a comment.

So the scheme is: `pow` is defined in `__clang_openmp_math.h` to call
`__kmpc_pow`. That function lives in `libomptarget-nvptx` (both the bc and the
static lib) and just calls `pow`, which works because `nvcc` and Clang in CUDA
mode make sure that the call gets routed into `libdevice`?
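
To make sure we are talking about the same layering, here is a minimal host-side sketch of the forwarding chain as I understand it. The names `sketch_header_pow` and `sketch_kmpc_pow` are hypothetical stand-ins for the header-level `pow` and for `__kmpc_pow`; this is not the code from the patch. On the host the innermost call just resolves to the normal math library instead of `libdevice`.

```cpp
#include <cmath>

// Hypothetical stand-in for the runtime entry point (`__kmpc_pow` in the
// scheme described above). In the real setup this would live in the device
// runtime library and the inner call would be routed into libdevice; in this
// host-only sketch it simply uses the host math library.
extern "C" double sketch_kmpc_pow(double base, double exponent) {
  return std::pow(base, exponent);
}

// Hypothetical stand-in for what the math wrapper header would provide:
// user code calls this name and the call is forwarded to the runtime entry.
inline double sketch_header_pow(double base, double exponent) {
  return sketch_kmpc_pow(base, exponent);
}
```

The point of the sketch is only the indirection: user code sees one name, the runtime library sees another, and the actual math implementation sits one level further down.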

Did you test that something like `pow(d, 2)` is optimized by LLVM to `d * d`?
There's a pass doing so (can't recall the name), and in my previous attempts
it didn't work well if you hid the call behind a different function name
instead of the known `pow` one.
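
A small reproducer for the concern could look like the following sketch. `sketch_opaque_pow` is a hypothetical wrapper, marked `noinline` on the assumption that the optimizer cannot see through the runtime call at the point where the simplification would run. Both functions compute the same value; the question is only whether the optimized IR for the second one still contains a call instead of a multiply. Something like `clang++ -O2 -S -emit-llvm` on this file should be enough to compare the two.

```cpp
#include <cmath>

// Hypothetical wrapper that hides the well-known `pow` name from the caller.
// `noinline` models a runtime function the optimizer cannot look into.
__attribute__((noinline)) double sketch_opaque_pow(double base,
                                                   double exponent) {
  return std::pow(base, exponent);
}

// Calls the known library function directly, so a library-call
// simplification has the chance to turn this into `d * d`.
double square_direct(double d) { return std::pow(d, 2.0); }

// Calls through the wrapper: the optimizer only sees an unknown function,
// so the call may survive as-is.
double square_wrapped(double d) { return sketch_opaque_pow(d, 2.0); }
```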


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60907/new/

https://reviews.llvm.org/D60907



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
