https://bugs.llvm.org/show_bug.cgi?id=42989
Bug ID: 42989
Summary: amdgpu_flat_work_group_size is not a hint
Product: clang
Version: unspecified
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Documentation
Assignee: unassignedclangb...@nondot.org
Reporter: ekuznet...@live.com
CC: llvm-bugs@lists.llvm.org, richard-l...@metafoo.co.uk
The documentation
https://clang.llvm.org/docs/AttributeReference.html#amdgpu-flat-work-group-size
specifies that
"Clang supports the __attribute__((amdgpu_flat_work_group_size(<min>, <max>)))
attribute for the AMDGPU target. This attribute may be attached to a kernel
function definition and is an optimization hint."
An attribute is an optimization hint if its presence or absence does not change
the correctness of the program. In this case, that's not true. When the
attribute is absent, the compiler assumes the default maximum workgroup size of
256. I believe this was correct several years ago, but nowadays the workgroup
size can legally go up to 1024. In some situations, launching a kernel without
the attribute with 1024 work-items per workgroup results in baffling and
hard-to-track bugs (the compiler uses group memory to offload some local
variables, but does not allocate enough of it for this to actually work).
Either the default maximum must be raised to 1024, or the documentation must be
amended to say that the attribute is mandatory in some situations.
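
For illustration, here is a minimal sketch of what "mandatory" would mean in
practice for a HIP-style kernel that is meant to run with 1024 work-items per
workgroup. Only the attribute spelling comes from the Clang documentation
quoted above; the kernel `scale`, the `KERNEL_ATTRS` macro, and the helpers
`launch_size_ok` and `checked_block_size` are hypothetical names made up for
this example. The sketch assumes a HIP toolchain defines `__HIP__` or
`__HIPCC__`; on any other compiler the attribute expands to nothing and the
kernel body runs as a plain host loop.

```cpp
#include <cassert>

#if defined(__HIP__) || defined(__HIPCC__)
#include <hip/hip_runtime.h>
// Declare the range of flat workgroup sizes the kernel may be launched with.
#define KERNEL_ATTRS \
    __global__ __attribute__((amdgpu_flat_work_group_size(1, 1024)))
#else
// Host-only build: the attribute is AMDGPU-specific, so expand to nothing.
#define KERNEL_ATTRS
#endif

// Hypothetical kernel: multiplies the first n elements of data by factor.
KERNEL_ATTRS void scale(float *data, float factor, unsigned n) {
#if defined(__HIP__) || defined(__HIPCC__)
    unsigned i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
#else
    for (unsigned i = 0; i < n; ++i) data[i] *= factor;
#endif
}

// Hypothetical host-side check mirroring the range given to the attribute.
constexpr bool launch_size_ok(unsigned block_size) {
    return block_size >= 1 && block_size <= 1024;
}

// Call before launching; aborts in debug builds if the size is out of range.
inline unsigned checked_block_size(unsigned block_size) {
    assert(launch_size_ok(block_size) && "block size outside declared range");
    return block_size;
}
```

Keeping the host-side check next to the attribute is a design choice of this
sketch, not something the toolchain requires; it only makes a mismatch between
the declared range and the launch size fail loudly instead of silently.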
See also https://github.com/ROCm-Developer-Tools/HIP/issues/1310 .