On 24/07/2025 16:49, Tobias Burnus wrote:
Andrew Stubbs wrote:
On 24/07/2025 14:25, Tobias Burnus wrote:
+/* Device requires CDNA1-style manually inserted wait states for
AVGPRs. */
+#define TARGET_AVGPR_CDNA3_NOPS TARGET_CDNA3
This is not for CDNA1, and not for AVGPRS.
I have deleted it and use now TARGET_CDNA3 directly.
I've been trying *not* to do that. We ended up with lots of unexplained
TARGET_* instances where each one had to be reviewed whenever a new
device was added. When I did the .def file conversion I went through
all those instances and classified them all, and yet some remained
unexplained (c.f. "what does this mean?" comment).
The only thing wrong with what you had before was the name (and the
comment was an unmodified cut-and-paste).
In theory, when we add a new arch we need only review the feature list,
not every use-case.
+ else if ((prev_insn->age + nops_rqd) < 1
+ && prev_insn->unit == UNIT_VECTOR
+ && iunit == UNIT_VECTOR
+ && hard_reg_set_intersect_p
+ (depregs, reg_class_contents[(int) SGPR_REGS]))
+ nops_rqd = 1 - prev_insn->age;
Should this conditional not depend on CDNA3?
Yes.
+ /* VALU writes EXEC followed by VALU DPP op requires 5 nop. */
+ if ((prev_insn->age + nops_rqd) < 5
+ && itype == TYPE_VOP_DPP
+ && prev_insn->unit == UNIT_VECTOR
+ && TEST_HARD_REG_BIT (prev_insn->writes, EXECZ_REG))
+ nops_rqd = 5 - prev_insn->age;
Likewise?
I think we want to have this for MI200 as well; its ISA manual has in
"4.5. Manually Inserted Wait States (NOPs)":
"First Instruction" - "VALU writes EXEC" / "Second Instruction" - "VALU
DPP op"
"Wait" - "5" / "Notes" - "ALU does not forward EXEC to
DPP.".
https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/
instruction-set-architectures/instinct-mi200-cdna2-instruction-set-
architecture.pdf
The same wording also appears forGFX906/Vega:
https://gpuopen.com/download/Vega_7nm_Shader_ISA_26November2019.pdf
For older Vega, see the archived version, e.g.
https://web.archive.org/web/20230309032715/http://developer.amd.com/
wordpress/media/2013/12/Vega_Shader_ISA_28July2017.pdf
which also has this wording in "4.5. Manually Inserted Wait States (NOPs)"
And forGFX908/MI100, https://developer.amd.com/wp-content/resources/
CDNA1_Shader_ISA_14December2020.pdf * * * While RDNA3 states: "5.6. Data
Dependency Resolution" "Inserting S_NOP is not required to achieve
correct operation.", https://www.amd.com/content/dam/amd/en/documents/
radeon-tech-docs/instruction-set-architectures/rdna3-shader-instruction-
set-architecture-feb-2023_0.pdf
Likewise RNDA2: https://www.amd.com/content/dam/amd/en/documents/radeon-
tech-docs/instruction-set-architectures/rdna2-shader-instruction-set-
architecture.pdf
Likewise RNDA3.5: https://www.amd.com/content/dam/amd/en/documents/
radeon-tech-docs/instruction-set-architectures/
rdna35_instruction_set_architecture.pdf
However, RNDA4 has: "5.7. Data Dependency Resolution" "The programmer is
rarely required to insert waitstates (S_NOP or unrelated instructions)."
See 5.7 and especially "5.7.2. Specific Dependency Cases", https://
www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-
set-architectures/rdna4-instruction-set-architecture.pdf
* * *
Updated patch attached. Changes: - Use TARGET_CDNA3 directly (and remove
bogusly named #define)
- Add one missing 'if (TARGET_CDNA3'
- Keep using the added 'exec + DPP' code for all GPUs (reaching that spot)
- Skip the s_nop-adding code for RNDA2 and RNDA3
- Add an assert for RNDA4 as reminder to change this, once we add RNDA4
support.
Does this look sensible?
Tobias
+ /* RDNA4 (not yet implemented) differs from RNDA 2/3/3.5 and requires some
+ s_nop, see 5.7 and esp. 5.7.2. in its ISA manual.
+ The assert here is a reminder to add those. */
+ STATIC_ASSERT (ISA_CDNA1 - ISA_RDNA3 == 1);
This is fine because it's well documented and I get that it's self
triggering.
+ /* RDNA 2, 3 and 3.5 require no manually inserted wait states. */
+ if (TARGET_RDNA2_PLUS)
+ return;
But, this is another "magic" arch check. Please use a named feature
flag; perhaps "TARGET_MANUALLY_INSERTED_WAIT_STATES".
Andrew