jhuber6 wrote:

> Just a FYI, that recent NVIDIA GPUs have introduced a concept of
> [thread block cluster](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#thread-block-clusters).
> We may need another level of granularity between the block and device.

Should be easy enough, though the numbers would no longer be incremental if we
put it in between. It's somewhat difficult to decide what these things
should be called. I was also somewhat tempted to keep the names all the same
length, like the `__ATOMIC` ones are, but that might not be worth the effort.
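
To make the numbering concern concrete, here is a minimal sketch in C. The
`SCOPE_*` names, their values, and both helper functions are hypothetical and
are not the identifiers or values used in this patch; the sketch only assumes
scopes are numbered from widest to narrowest. A cluster scope added later sits
logically between device and block but has to take the next free value, so the
raw numbers can no longer be compared directly and an explicit rank is needed.

```c
#include <assert.h>

/* Hypothetical scope values, numbered from widest to narrowest. */
enum scope {
  SCOPE_SYSTEM = 0,
  SCOPE_DEVICE = 1,
  SCOPE_BLOCK = 2,
  SCOPE_WARP = 3,
  SCOPE_THREAD = 4,
  /* Added later: logically between DEVICE and BLOCK, but it takes the
     next free value so that the existing values stay stable. */
  SCOPE_CLUSTER = 5
};

/* Position of a scope in the widest-to-narrowest ordering. */
static int scope_rank(enum scope s) {
  switch (s) {
  case SCOPE_SYSTEM:  return 0;
  case SCOPE_DEVICE:  return 1;
  case SCOPE_CLUSTER: return 2;
  case SCOPE_BLOCK:   return 3;
  case SCOPE_WARP:    return 4;
  case SCOPE_THREAD:  return 5;
  }
  assert(0 && "unknown scope");
  return -1;
}

/* Returns 1 if scope `outer` is at least as wide as scope `inner`. */
static int scope_includes(enum scope outer, enum scope inner) {
  return scope_rank(outer) <= scope_rank(inner);
}
```

The alternative is to renumber so that the values stay ordered, which would
change the value of every scope narrower than the new one.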

That being said, as far as I'm aware the NVIDIA backend doesn't handle scoped
atomics at all yet; we simply emit `volatile` versions even though scopes exist
in PTX.

https://github.com/llvm/llvm-project/pull/72280
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
