================
@@ -1562,6 +1562,23 @@ def HIPManaged : InheritableAttr {
   let Documentation = [HIPManagedAttrDocs];
 }
 
+def CUDAClusterDims : InheritableAttr {
+  let Spellings = [GNU<"cluster_dims">, Declspec<"__cluster_dims__">];
+  let Args = [ExprArgument<"X">, ExprArgument<"Y", 1>, ExprArgument<"Z", 1>];
+  let Subjects = SubjectList<[Function], ErrorDiag, "kernel functions">;
+  let LangOpts = [CUDA];
+  let Documentation = [CUDAClusterDimsAttrDoc];
+}
+
+def CUDANoCluster : InheritableAttr {
+  let Spellings = [GNU<"no_cluster">, Declspec<"__no_cluster__">];
----------------
shiltian wrote:

If a kernel doesn't have `__cluster_dims__`, the user can still enable the cluster feature at runtime during kernel launch. That means the compiler has to be conservative about cluster-related handling in the backend and assume the feature could be used. On the other hand, `__no_cluster__` tells the compiler that the cluster feature will not be enabled at runtime, which lets the backend optimize where it can. For AMDGPU, it helps the compiler avoid querying certain registers when lowering some workgroup-related intrinsics.

https://github.com/llvm/llvm-project/pull/156686
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
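
The three cases described in the comment above can be modeled with a small C++ sketch. This is only an illustration of the reasoning, not code from the patch: the `ClusterInfo` enum and `needsClusterAwareLowering` are hypothetical names, and the attribute spellings in the comments are the GNU spellings taken from the diff.

```cpp
// Hypothetical model of what the compiler knows about a kernel's cluster use.
// None of these names come from the patch; they only illustrate the comment.
enum class ClusterInfo {
  Unspecified, // no attribute: clusters may still be enabled at launch time
  FixedDims,   // __attribute__((cluster_dims(X, Y, Z))): clusters are in use
  NoCluster    // __attribute__((no_cluster)): clusters are never enabled
};

// Returns true when the backend has to keep cluster-aware lowering.
// Only an explicit no_cluster promise lets it take the simpler path.
constexpr bool needsClusterAwareLowering(ClusterInfo info) {
  return info != ClusterInfo::NoCluster;
}

// Compile-time checks of the three cases.
static_assert(needsClusterAwareLowering(ClusterInfo::Unspecified),
              "without an attribute the compiler must stay conservative");
static_assert(needsClusterAwareLowering(ClusterInfo::FixedDims),
              "cluster_dims means clusters are used");
static_assert(!needsClusterAwareLowering(ClusterInfo::NoCluster),
              "no_cluster permits the non-cluster lowering");
```

The point of the model is that the absence of `__cluster_dims__` is not equivalent to `__no_cluster__`: only the explicit attribute gives the backend permission to drop the conservative path.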