[clang] [llvm] [SPIRV][RFC] Rework / extend support for memory scopes (PR #106429)

Vyacheslav Levytskyy via cfe-commits Fri, 06 Sep 2024 01:47:28 -0700

VyacheslavLevytskyy wrote:

Thank you for the PR! I'd like to better understand motivation and 
justification of SPIR-V BE-related changes though. The goal would be to 
understand whether AllSvmDevices is indeed a better choice (for whom?) than 
Device as a default mem scope value in SPIR-V BE.
 
1) Questions to the description of the PR.
 
> "These were previously unconditionally lowered to Device scope, which is can 
> be too conservative and possibly incorrect."
The claim is not justified by any docs/specs. Why Device scope is incorrect as 
a default? In my opinion, it's AllSvmDevices that looks like a conservative 
choice that may lead to performance degradation in general case when we change 
the default without notifying customers. Or, we may say that potential 
performance changes may depend on a vendor-specific behavior in this case.
 
> "Furthermore, the default / implicit scope is changed from Device (an OpenCL 
> assumption) to AllSvmDevices (aka System), since the SPIR-V BE is not OpenCL 
> specific / can ingest IR coming from other language front-ends."
What I know without additional references to other docs/specs is that Device is 
default by OpenCL spec 
(https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_C.html#atomic-functions).
 It would help if you can provide references where AllSvmDevices is a 
preferable choice, so that we are able to compare and figure out the best 
default for the Computational flavor of SPIR-V. For sure, SPIR-V BE is not 
OpenCL (=Device) specific, and it's also not specific to any particular vendor 
or computational framework. I've seen usages of AllSvmDevices as default in the 
code base (for example, in 
https://github.com/llvm/llvm-project/blob/319e8cd201e6744199da377fba237dd276063e49/clang/lib/CodeGen/Targets/AMDGPU.cpp#L537),
 but it seems not enough to flip the default over.
 
> "OpenCL defaulting to Device scope is now reflected in the front-end handling 
> of atomic ops, which seems preferable."
Changes in clang part looks really good to me. However, when we add to it 
changes in SPIR-V part of the code base, things look less optimistic, because 
what this PR means by "the front-end handling of atomic ops" is the upstream 
clang only, whereas actual choices of a front-end are more versatile, and users 
coming to SPIR-V by other paths would get a sudden change of behavior in the 
worst case (e.g., MLIR input for the GenAI domain).


===
 
2) If it's acceptable to split this PR into two separate PR's (clang and 
SPIR-V), I'd gladly support changes in clang part, it makes sense for me. At 
the moment, however, I have objections against SPIR-V Backend changes as they 
are represented in the PR:
 
* This PR looks like a breaking change that would flip over the default value 
of mem scope for all environments except for OpenCL and may have a potentially 
negative impact on an unknown number of projects/customers. I'd guess that 
OpenCL would not notice the difference, because path that goes via upstream 
clang front-end redefines default mem scope as Device. All other toolchains 
just get a breaking change in the form of the AllSvmDevices default. 
clang-related changes do not help to smooth this, because SPIRV BE should 
remain agnostic towards front-ends, frameworks, etc.
 
* A technical comment is that the proposed implementation in SPIR-V part is 
less efficient that existing. It compares strings rather than integers and 
fills in scope names on each call to the getMemScope() function, whereas 
existing implementation does it just once per a machine function.

* A terminology (the choice of syncscope names) is debatable. The closest thing 
in specs that I see is 
https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_scope_id. I 
don't see any references to "singlethread" in the specs. Name "workitem" 
(spelling precisely as "work-item") is used at least in the official Khronos 
documents (see for example 
https://registry.khronos.org/SPIR-V/specs/1.0/SPIR-V-execution-and-memory-model.pdf).
 "all_svm_devices" is not mentioned in the specs at all (there is only the 
"CrossDevice" term).
 
===

For now, I'd rather see an eventual solution in the form of further 
classification of the computational flavor of SPIR-V (not just Compute vs. 
Vulkan but breaking Compute part further where this is required) -- comparing 
to this sudden change of the default in favor of any incarnation of Compute 
targets. As the first approach, all SPIR-V-related changes may require just a 
short snippet of the kind "if TheTriple is XXX-specific then use CrossDevice 
instead of Device" and minor rename of syncscope names ("subgroup", for 
example, indeed makes more sense than "sub_group"). This would probably require 
a description in the SPIRVUsage doc as well to avoid confusion among customers. 
Anyway, I'd be glad to talk out a reasonable way forward to get a better 
solution than we have now, if needed.

https://github.com/llvm/llvm-project/pull/106429
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [SPIRV][RFC] Rework / extend support for memory scopes (PR #106429)

Reply via email to