================
@@ -2326,6 +2326,20 @@ bool SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction &MF,
     }
 #endif
 
+    if (ST->isPreciseMemoryEnabled()) {
+      AMDGPU::Waitcnt Wait;
+      if (WCG == &WCGPreGFX12)
+        Wait = AMDGPU::Waitcnt(0, 0, 0, 0);
----------------
Pierre-vh wrote:

I was looking at https://github.com/ROCm/ROCm-CompilerSupport/issues/66 and it made me wonder: why do we have to emit all zeroes instead of just emitting what's in `ScoreBrackets`? Is there an advantage?

I'm wondering if this should just emit `ScoreBrackets`, so that `+precise-memory` and `-amdgpu-waitcnt-forcezero` would need to be used together to achieve the behavior we have here?

https://github.com/llvm/llvm-project/pull/79236
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits