================
@@ -1562,6 +1562,23 @@ def HIPManaged : InheritableAttr {
   let Documentation = [HIPManagedAttrDocs];
 }
 
+def CUDAClusterDims : InheritableAttr {
+  let Spellings = [GNU<"cluster_dims">, Declspec<"__cluster_dims__">];
+  let Args = [ExprArgument<"X">, ExprArgument<"Y", 1>, ExprArgument<"Z", 1>];
+  let Subjects = SubjectList<[Function], ErrorDiag, "kernel functions">;
+  let LangOpts = [CUDA];
+  let Documentation = [CUDAClusterDimsAttrDoc];
+}
+
+def CUDANoCluster : InheritableAttr {
+  let Spellings = [GNU<"no_cluster">, Declspec<"__no_cluster__">];
----------------
shiltian wrote:

If a kernel doesn't have `__cluster_dims__`, the user can still enable the
cluster feature at runtime when launching the kernel. That means the compiler
has to be conservative about cluster-related handling in the backend and
assume the feature could be in use.

On the other hand, `__no_cluster__` tells the compiler that the cluster feature
will not be enabled at runtime, which lets the backend optimize where it can.
For AMDGPU, it helps the compiler avoid querying certain registers when
lowering some workgroup-related intrinsics.
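
To make the three cases concrete, here is a minimal sketch of the reasoning
above. It is a hypothetical model, not code from the patch or from LLVM: the
names `ClusterInfo` and `needsClusterAwareLowering` are made up for
illustration, and treating `__cluster_dims__` as "clusters are in use" is my
reading of the attribute rather than something the patch states here.

```cpp
// Hypothetical model (not LLVM code): what the compiler can know about
// cluster usage for a single kernel, based on its attributes.
enum class ClusterInfo {
  Unknown,   // no attribute: clusters may still be enabled at launch time
  FixedDims, // __cluster_dims__(x, y, z): assumed to mean clusters are in use
  NoCluster  // __no_cluster__: clusters will not be enabled at runtime
};

// Returns true when the backend has to keep the conservative,
// cluster-aware lowering. Only an explicit __no_cluster__ rules it out.
constexpr bool needsClusterAwareLowering(ClusterInfo info) {
  return info != ClusterInfo::NoCluster;
}

// The unannotated kernel is handled like the cluster case, which is
// exactly why a separate "no cluster" attribute is useful.
static_assert(needsClusterAwareLowering(ClusterInfo::Unknown),
              "unannotated kernels must be handled conservatively");
static_assert(!needsClusterAwareLowering(ClusterInfo::NoCluster),
              "__no_cluster__ allows the simpler lowering");
```

The point of the sketch is that "no attribute" and "`__no_cluster__`" are
different states, so dropping the second attribute would lose information
the backend can use.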

https://github.com/llvm/llvm-project/pull/156686
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits