================
@@ -2326,6 +2326,20 @@ bool 
SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction &MF,
     }
 #endif
 
+    if (ST->isPreciseMemoryEnabled()) {
+      AMDGPU::Waitcnt Wait;
+      if (WCG == &WCGPreGFX12)
+        Wait = AMDGPU::Waitcnt(0, 0, 0, 0);
----------------
Pierre-vh wrote:
I was looking at https://github.com/ROCm/ROCm-CompilerSupport/issues/66 and it 
made me wonder, why do we have to emit all zeroes instead of just emitting 
what's in `ScoreBrackets`? Is there an advantage?

I'm wondering if this should just emit `ScoreBrackets`, so that `+precise-memory` 
and `-amdgpu-waitcnt-forcezero` would need to be used together to achieve the 
behavior we have here?
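
To make the two options concrete, here is a toy sketch of the difference. It is not the pass's actual code: `ToyWait`, `ToyBrackets`, `forceAllZero`, `waitOnPending` and `countWaits` are hypothetical names invented for illustration, and the model assumes that "emitting what's in `ScoreBrackets`" amounts to waiting only on counters that still have outstanding operations.

```cpp
#include <array>

// Toy model only. These types are hypothetical stand-ins and are NOT
// LLVM's AMDGPU::Waitcnt or WaitcntBrackets.
enum Counter { VM = 0, EXP = 1, LGKM = 2, VS = 3, NumCounters = 4 };

// Sentinel meaning "no wait required on this counter".
constexpr unsigned NoWait = ~0u;

struct ToyWait {
  std::array<unsigned, NumCounters> Cnt{NoWait, NoWait, NoWait, NoWait};
};

struct ToyBrackets {
  // Outstanding operations tracked per counter (assumed bookkeeping).
  std::array<unsigned, NumCounters> Pending{0, 0, 0, 0};
};

// Option A: wait for zero on every counter, regardless of what is pending.
ToyWait forceAllZero() {
  ToyWait W;
  W.Cnt.fill(0);
  return W;
}

// Option B: wait for zero only on counters that have outstanding operations.
ToyWait waitOnPending(const ToyBrackets &B) {
  ToyWait W;
  for (unsigned I = 0; I < NumCounters; ++I)
    if (B.Pending[I] != 0)
      W.Cnt[I] = 0;
  return W;
}

// Number of counters the wait actually constrains.
unsigned countWaits(const ToyWait &W) {
  unsigned N = 0;
  for (unsigned C : W.Cnt)
    if (C != NoWait)
      ++N;
  return N;
}
```

In this model, with only a `VM` operation outstanding, option A constrains all four counters while option B constrains just `VM`. Whether that saving matters in practice is exactly the question above.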


https://github.com/llvm/llvm-project/pull/79236
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
