[clang] [Cuda] Handle -fcuda-short-ptr even with -nocudalib (PR #111682)

Fraser Cormack via cfe-commits Wed, 09 Oct 2024 07:22:48 -0700

frasercrmck wrote:

> > > Seems reasonable, which architectures require this? I know that NVIDIA 
> > > deprecated the 32-bit `nvptx` target in CUDA 12 or something.
> > 
> > 
> > I'm not an expert on CUDA but, AFAICT, even in 64-bit CUDA, certain 
> > pointers such as those pointing to shared memory are 32-bit, because the 
> > size of shared memory is somewhere in the kB range. This generates better 
> > code, fewer registers, etc. I'm not sure why the option isn't enabled by 
> > default, personally - it seems like `nvcc` is doing this by default.
> > I was just playing with the option downstream and noticed this issue.
> 
> I figured it was something like that, since it saves a register per address. 
> I don't know the history for why this isn't the default; it's pretty much 
> just a data layout modifier to state that certain address spaces are 32-bit. 
> Maybe @Artem-B or @jlebar can comment.


Just threw together a nonsensical example for godbolt: 
https://godbolt.org/z/bhdEhrxd7. Notice the `mov.u32 %r7, As`, etc.
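
For anyone who can't open the link, here is a minimal sketch of the kind of kernel that shows the effect. It is not the code behind the godbolt link: the kernel name, `TILE`, and the build line in the comment are made up for illustration, and only the array name `As` is borrowed from the PTX quoted above. The assumption is that, with `-fcuda-short-ptr`, the address of a `__shared__` array is materialized in a 32-bit register.

```cuda
// Hypothetical example; kernel name, TILE and the build line are assumptions.
// One possible device-only build that prints PTX (needs a CUDA install;
// the GPU arch is arbitrary):
//   clang++ -x cuda --cuda-device-only --cuda-gpu-arch=sm_70
//           -fcuda-short-ptr -S -o - short_ptr.cu
#define TILE 64

// Expects a single block of TILE threads; `in` and `out` hold TILE floats.
__global__ void reverse_tile(const float *in, float *out) {
    // Shared memory is only kilobytes in size, so its addresses fit in 32 bits.
    __shared__ float As[TILE];

    unsigned t = threadIdx.x;
    As[t] = in[t];              // stage the input in shared memory
    __syncthreads();            // make every store visible to the whole block
    out[t] = As[TILE - 1 - t];  // read it back in reverse order
}
```

Building the same file with and without `-fcuda-short-ptr` and diffing the PTX should show the address of `As` moving from a 64-bit register to a 32-bit one, as in the `mov.u32` above.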

https://github.com/llvm/llvm-project/pull/111682
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
