[clang] [Cuda] Handle -fcuda-short-ptr even with -nocudalib (PR #111682)

Joseph Huber via cfe-commits Wed, 09 Oct 2024 07:20:44 -0700

jhuber6 wrote:

> > Seems reasonable, which architectures require this? I know that NVIDIA 
> > deprecated the 32-bit `nvptx` target in CUDA 12 or something.
> 
> I'm not an expert on CUDA but, AFAICT, even in 64-bit CUDA, certain pointers 
> such as those pointing to shared memory are 32 bit, because the size of 
> shared memory is somewhere in the kB range. This generates better code, fewer 
> registers, etc. I'm not sure why the option isn't enabled by default, 
> personally - it seems like `nvcc` is doing this by default.
> 
> I was just playing with the option downstream and noticed this issue.


I figured it was something like that, since it saves a register per address. I 
don't know the history of why this isn't the default; it's pretty much just a 
data layout modifier stating that certain address spaces are 32-bit. Maybe 
@Artem-B or @jlebar can comment.
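
As a rough illustration of the "data layout modifier" point, the sketch below 
reads the per-address-space pointer widths out of an LLVM data layout string. 
The two layout strings are assumptions modeled on what the NVPTX target is 
believed to use with and without -fcuda-short-ptr; check them against your 
clang version. The helper name `pointer_bits` is hypothetical and not part of 
any LLVM API.

```python
import re

# ASSUMPTION: layout strings modeled on the 64-bit NVPTX target; they may
# differ between clang versions, so treat them as illustrative only.
DEFAULT_LAYOUT = "e-i64:64-i128:128-v16:16-v32:32-n16:32:64"
SHORT_PTR_LAYOUT = (
    "e-p3:32:32-p4:32:32-p5:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64"
)

# NVPTX address space numbering.
ADDRESS_SPACES = {0: "generic", 1: "global", 3: "shared", 4: "const", 5: "local"}


def pointer_bits(layout: str, addrspace: int) -> int:
    """Return the pointer width in bits for one address space (hypothetical helper)."""
    sizes = {}
    for spec in layout.split("-"):
        # Pointer specs look like p[n]:<size>:<abi>[:<pref>[:<idx>]];
        # an omitted n means address space 0.
        match = re.fullmatch(r"p(\d*):(\d+)(?::\d+)*", spec)
        if match:
            sizes[int(match.group(1) or 0)] = int(match.group(2))
    # Unlisted address spaces fall back to address space 0, which LLVM
    # defaults to 64 bits when the string gives no pointer spec at all.
    return sizes.get(addrspace, sizes.get(0, 64))


if __name__ == "__main__":
    for number, name in ADDRESS_SPACES.items():
        print(
            f"{name:8s} default={pointer_bits(DEFAULT_LAYOUT, number)} "
            f"short-ptr={pointer_bits(SHORT_PTR_LAYOUT, number)}"
        )
```

Under these assumed strings, only the shared, const, and local address spaces 
shrink to 32 bits, while generic and global pointers stay 64-bit.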

https://github.com/llvm/llvm-project/pull/111682
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
