The number of AMD GPUs is huge and, unfortunately, every GPU device is
potentially slightly different, requiring different code generation
either in some dusty corner case or for standard code.

As identical code can run on several GPUs (either all of it or after
disabling some features), AMD introduced some gfx*-generic targets with
LLVM 19.

GCC added support for gfx10-3-generic and gfx11-generic with commit
r15-4550-g1bdeebe69b71bf in October 2024 (undocumented). GCC itself
always supports all -march= targets, but an assembler (and linker)
supporting the architecture is required, both when users compile their
code and when building a multilib. [GCC uses LLVM's assembler (llvm-mc)
and linker (lld), i.e. LLVM 19+ is required for gfx*-generic.]

However, the required runtime code landed in ROCm much later; namely,
commit 0c18ff22 "rocr: Generic ISA targets support" (Oct 28, 2024) in
https://github.com/ROCm/ROCR-Runtime
It is believed that the next ROCm release contained this feature, which
is ROCm 6.3, released on Dec 3, 2024. The latest ROCm is 6.3.2 of Jan 28,
2025.

Still, adding gfx*-generic increases the number of required multilibs as
it does not seem to be possible to link mixed code of generic and
specific GPU code. See
https://llvm.org/docs/AMDGPUUsage.html#amdgpu-generic-processor-table
for a list of the gfx*-generic targets and the gfx* devices they support,
and for some restrictions of the generic targets.

While gfx11-generic and gfx10-3-generic include all GPUs of that
generation, with no or few restrictions, GFX9 devices are rather
different and, hence, gfx9-generic only covers a subset of the devices.

* * *

This patch now enables
support for gfx10-3-generic, gfx11-generic and (new!) gfx9-generic in
libgomp, making it actually usable.

In libgomp, GCC prints its own diagnostic if there is an ISA mismatch
between the actual GPU and the compiled-for GPU. Hence, not only ROCm
but also GCC needs to know which GPUs are compatible - in order to
propose the -foffload-options=-march=gfx... to compile for. That
diagnostic now also proposes compiling for the matching gfx*-generic
besides compiling for the specific GPU.

Reasoning: As the number of multilibs is limited, having only a
gfx11-generic multilib, it makes sense to propose -march=gfx11-generic
besides, e.g., -march=gfx1103, especially when the gfx1103 multilib is
unavailable - and vice versa.

In case GCC thinks that the ISA is supported but (a too old) ROCm does
not recognize it, the resulting error is now less helpful; however, some
wording has been added to the generic error message, which might still
help.

As there are a couple of GPUs, previously unsupported, that are supported
by ROCm with the same gfx*-generic as GPUs we support, it makes sense to
add those GPUs as well - both to handle them in libgomp's generic
diagnostic and to support them in general. Therefore, the following GPUs
are now supported in addition: gfx902, gfx904, gfx909, gfx1031, gfx1032,
gfx1033, gfx1034, gfx1035, gfx1101, gfx1102, gfx1150, gfx1151, gfx1152,
and gfx1153.

However, the multilib config has not been touched; hence, those 14
device types and gfx{9,10-3,11}-generic are not enabled by default.
Currently, the following 9 GPUs are enabled by default: gfx900, gfx906,
gfx908, gfx90a, gfx90c, gfx1030, gfx1036, gfx1100, and gfx1103.
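
As a rough illustration of the matching rule and the extended hint (not
the actual plugin code): the sketch below uses a hand-written table with
only a few devices, and the names device_row, rows, generic_for,
image_matches and format_hint are hypothetical. In the patch itself the
mapping comes from gcn-devices.def and works on ELF machine codes rather
than on strings.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical miniature of the device table: a device name and the
   generic ISA that also runs on it (NULL if there is none).  Only a few
   rows are shown; the pairs follow the description above.  */
struct device_row
{
  const char *name;
  const char *generic;
};

static const struct device_row rows[] = {
  { "gfx900", "gfx9-generic" },
  { "gfx908", NULL },		/* Not covered by gfx9-generic.  */
  { "gfx90a", NULL },		/* Not covered by gfx9-generic.  */
  { "gfx1030", "gfx10-3-generic" },
  { "gfx1103", "gfx11-generic" },
};

/* Return the generic ISA name for DEVICE, or NULL if there is none.  */
static const char *
generic_for (const char *device)
{
  for (size_t i = 0; i < sizeof rows / sizeof rows[0]; i++)
    if (strcmp (rows[i].name, device) == 0)
      return rows[i].generic;
  return NULL;
}

/* An image is accepted if it was compiled either for exactly DEVICE or
   for the generic ISA associated with DEVICE.  */
static int
image_matches (const char *image_isa, const char *device)
{
  const char *generic = generic_for (device);
  return (strcmp (image_isa, device) == 0
	  || (generic != NULL && strcmp (image_isa, generic) == 0));
}

/* Write the recompile hint for DEVICE to BUF; the generic alternative is
   only mentioned when one exists.  Returns what snprintf returns.  */
static int
format_hint (char *buf, size_t len, const char *device)
{
  const char *generic = generic_for (device);
  if (generic != NULL)
    return snprintf (buf, len,
		     "Try '-foffload-options=-march=%s' or "
		     "'-foffload-options=-march=%s'", device, generic);
  return snprintf (buf, len, "Try '-foffload-options=-march=%s'", device);
}
```

With this table, image_matches ("gfx11-generic", "gfx1103") accepts the
image, while image_matches ("gfx1100", "gfx1103") still rejects it, since
two different specific devices never match each other.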
* * *
For distros building with LLVM 19, I could imagine that adding the
gfx10-3-generic and gfx11-generic (and possibly gfx9-generic) multilibs
could make sense; whether gfx1030, gfx1036, gfx1100, and gfx1103 could already be
dropped - or only later (once ROCm 6.3 is more widely deployed) - is a good
question. [My gut feeling is that a distro should wait until next year, given
that December 2024 is still very recent.]
* * *
Thus back to the attached patch, which does:
* Add gfx9-generic - and enable libgomp support for gfx9-generic,
  gfx10-3-generic, and gfx11-generic
* Add gfx902, gfx904, gfx909, gfx1031, gfx1032, gfx1033, gfx1034, gfx1035,
  gfx1101, gfx1102, gfx1150, gfx1151, gfx1152, and gfx1153.
* Update the install + invoke (-march=) documentation for it
The patch has been loosely tested - but I currently do not have ROCm 6.3
available with a device supported by gfx*-generic; hence, I don't know whether
it really works.
Thus, I would be happy if someone with a device supported by
gfx{9,10-3,11}-generic - or with one of the newly added non-generic gfx*
devices - could test whether it actually works!
[I am about to get a ROCm 6.3.2 with a gfx906 device, possibly later also
for gfx900 and even later for gfx1100.]
Any comment, remark, suggestion?
OK for mainline, once someone has shown that any gfx*-generic actually works?
Tobias
PS: I tried hard to make no copy'n'paste error and get all 0x53 etc. of the
GPUs correct, but I wouldn't mind if someone could proofread it against the
AMD GPU documentation linked above.
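
To make such proofreading a bit more mechanical, a self-check along the
following lines might help. It is only a sketch under assumptions:
DEVICE_LIST is a hypothetical inline stand-in for gcn-devices.def with
just four columns and six rows (the ELF codes are copied from the patch),
and MACH_NONE, family_of, generic_of and count_mismatches are invented
names. It also shows why a trailing "..." in the macro definition lets a
user of the list ignore columns it does not need.

```c
/* Hypothetical stand-in for gcn-devices.def with four columns:
   NAME, ELF machine code, architecture family, generic NAME
   (NONE = the device has no generic ISA or is itself generic).  */
#define DEVICE_LIST(X)                            \
  X (GFX900,          0x2c,  9, GFX9_GENERIC)     \
  X (GFX9_GENERIC,    0x51,  9, NONE)             \
  X (GFX1036,         0x45, 10, GFX10_3_GENERIC)  \
  X (GFX10_3_GENERIC, 0x53, 10, NONE)             \
  X (GFX1103,         0x44, 11, GFX11_GENERIC)    \
  X (GFX11_GENERIC,   0x54, 11, NONE)

/* Only the first two columns are used here; the trailing '...' swallows
   the remaining ones.  */
enum mach
{
  MACH_NONE = 0,
#define AS_ENUM(NAME, ELF, ...) MACH_##NAME = ELF,
  DEVICE_LIST (AS_ENUM)
#undef AS_ENUM
};

/* Return the architecture family of M, or 0 if M is unknown.  */
static int
family_of (enum mach m)
{
  switch (m)
    {
#define AS_CASE(NAME, ELF, FAMILY, ...) case MACH_##NAME: return FAMILY;
      DEVICE_LIST (AS_CASE)
#undef AS_CASE
    default:
      return 0;
    }
}

/* Return the generic ISA associated with M, or MACH_NONE.  */
static enum mach
generic_of (enum mach m)
{
  switch (m)
    {
#define AS_CASE(NAME, ELF, FAMILY, GENERIC) \
    case MACH_##NAME: return MACH_##GENERIC;
      DEVICE_LIST (AS_CASE)
#undef AS_CASE
    default:
      return MACH_NONE;
    }
}

/* Count rows whose generic entry belongs to another family or is not
   itself a generic target (i.e. has a generic entry of its own).  */
static int
count_mismatches (void)
{
  int bad = 0;
#define CHECK(NAME, ELF, FAMILY, GENERIC)                    \
  if (MACH_##GENERIC != MACH_NONE                            \
      && (family_of (MACH_##GENERIC) != FAMILY               \
	  || generic_of (MACH_##GENERIC) != MACH_NONE))       \
    bad++;
  DEVICE_LIST (CHECK)
#undef CHECK
  return bad;
}
```

A row that pairs a device with a generic target of a different family
makes count_mismatches return a nonzero value. It cannot tell whether the
ELF codes themselves are right; those still have to be compared against
the AMD GPU documentation.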
[GCN] Handle generic ISA names in libgomp's plugin-gcn.c
For code that has been compiled, e.g., with -march=gfx11-generic, accept
it in plugin-gcn.c's isa_matches_agent if the agent's ISA has the
compiled-for ISA as generic ISA. This requires ROCm 6.3 on the runtime
side and LLVM 19 or later on the assembler side (llvm-mc).
Additionally, gfx9-generic is now supported. As libgomp checks for
GPU compatibility itself, it also should know about all supported GPUs -
and as the codegen is the same as for existing GPUs, the following GPUs
have been added in addition: gfx902, gfx904, gfx909, gfx1031, gfx1032,
gfx1033, gfx1034, gfx1035, gfx1101, gfx1102, gfx1150, gfx1151, gfx1152,
and gfx1153.
Note that the by-default enabled multilibs have not changed, i.e. only
gfx900, gfx906, gfx908, gfx90a, gfx90c, gfx1030, gfx1036, gfx1100, and
gfx1103 are enabled by default.
gcc/ChangeLog:
* config/gcn/gcn-devices.def: Add field for the generic ISA
of the device.
* config/gcn/gcn.cc (GCN_DEVICE): Add missing ... to macro define.
* doc/install.texi (AMD GCN): Mention gfx*-generic requirements
and note that not all possible multilibs are enabled.
* doc/invoke.texi (AMD GCN): Update -march= for gfx*-generic and
add all missing gfx* compatible with a supported generic architecture.
libgomp/ChangeLog:
* plugin/plugin-gcn.c (get_generic_isa_code): New.
(isa_matches_agent): Accept also generic ISA.
(create_and_finalize_hsa_program): Add diagnostic about the ISA
for HSA_STATUS_ERROR_INVALID_CODE_OBJECT, useful e.g. when ROCm
is too old.
gcc/config/gcn/gcn-devices.def | 202 ++++++++++++++++++++++++++++++++++++++---
gcc/config/gcn/gcn.cc | 3 +-
gcc/doc/install.texi | 8 +-
gcc/doc/invoke.texi | 53 +++++++++++
libgomp/plugin/plugin-gcn.c | 51 +++++++++--
5 files changed, 295 insertions(+), 22 deletions(-)
diff --git a/gcc/config/gcn/gcn-devices.def b/gcc/config/gcn/gcn-devices.def
index 7d47a7b495d..d90f086bc81 100644
--- a/gcc/config/gcn/gcn-devices.def
+++ b/gcc/config/gcn/gcn-devices.def
@@ -71,6 +71,10 @@
generated by the used llvm-mc assembler.
10 "Architecture Family Name" (string, external)
Used to #define '__GFX<...>__'.
+ 11 "EF_AMDGPU_MACH_AMDGCN_" ## "GENERIC NAME" (text, external)
+ The name of the generic ISA this device is compatible with; either
+ EF_AMDGPU_MACH_UNSUPPORTED or EF_AMDGPU_MACH_AMDGCN_ ## NAME, where
+ NAME is field (1) of the associated generic device.
Fields marked "external", above, have values defined elsewhere (HSA, ROCM,
LLVM, ELF, etc.) and must have matching definitions here. Fields marked
@@ -86,7 +90,30 @@ GCN_DEVICE(gfx900, GFX900, 0x2c, ISA_GCN5,
/* CU mode */ HSACO_ATTR_UNSUPPORTED,
/* Max ISA VGPRs */ 256,
/* Generic code obj version */ 0, /* non-generic */
- /* Architecture Family */ GFX9
+ /* Architecture Family */ GFX9,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX9_GENERIC
+ )
+
+GCN_DEVICE(gfx902, GFX902, 0x2d, ISA_GCN5,
+ /* XNACK default */ HSACO_ATTR_OFF,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_UNSUPPORTED,
+ /* CU mode */ HSACO_ATTR_UNSUPPORTED,
+ /* Max ISA VGPRs */ 256,
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX9,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX9_GENERIC
+ )
+
+GCN_DEVICE(gfx904, GFX904, 0x2e, ISA_GCN5,
+ /* XNACK default */ HSACO_ATTR_OFF,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_UNSUPPORTED,
+ /* CU mode */ HSACO_ATTR_UNSUPPORTED,
+ /* Max ISA VGPRs */ 256,
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX9,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX9_GENERIC
)
GCN_DEVICE(gfx906, GFX906, 0x2f, ISA_GCN5,
@@ -96,7 +123,8 @@ GCN_DEVICE(gfx906, GFX906, 0x2f, ISA_GCN5,
/* CU mode */ HSACO_ATTR_UNSUPPORTED,
/* Max ISA VGPRs */ 256,
/* Generic code obj version */ 0, /* non-generic */
- /* Architecture Family */ GFX9
+ /* Architecture Family */ GFX9,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX9_GENERIC
)
GCN_DEVICE(gfx908, GFX908, 0x30, ISA_CDNA1,
@@ -106,7 +134,19 @@ GCN_DEVICE(gfx908, GFX908, 0x30, ISA_CDNA1,
/* CU mode */ HSACO_ATTR_UNSUPPORTED,
/* Max ISA VGPRs */ 256,
/* Generic code obj version */ 0, /* non-generic */
- /* Architecture Family */ GFX9
+ /* Architecture Family */ GFX9,
+ /* Generic Name */ EF_AMDGPU_MACH_UNSUPPORTED
+ )
+
+GCN_DEVICE(gfx909, GFX909, 0x31, ISA_GCN5,
+ /* XNACK default */ HSACO_ATTR_ANY,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_UNSUPPORTED,
+ /* CU mode */ HSACO_ATTR_UNSUPPORTED,
+ /* Max ISA VGPRs */ 256,
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX9,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX9_GENERIC
)
GCN_DEVICE(gfx90a, GFX90A, 0x3f, ISA_CDNA2,
@@ -116,7 +156,8 @@ GCN_DEVICE(gfx90a, GFX90A, 0x3f, ISA_CDNA2,
/* CU mode */ HSACO_ATTR_UNSUPPORTED,
/* Max ISA VGPRs */ 512,
/* Generic code obj version */ 0, /* non-generic */
- /* Architecture Family */ GFX9
+ /* Architecture Family */ GFX9,
+ /* Generic Name */ EF_AMDGPU_MACH_UNSUPPORTED
)
GCN_DEVICE(gfx90c, GFX90C, 0x32, ISA_GCN5,
@@ -126,7 +167,19 @@ GCN_DEVICE(gfx90c, GFX90C, 0x32, ISA_GCN5,
/* CU mode */ HSACO_ATTR_UNSUPPORTED,
/* Max ISA VGPRs */ 256,
/* Generic code obj version */ 0, /* non-generic */
- /* Architecture Family */ GFX9
+ /* Architecture Family */ GFX9,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX9_GENERIC
+ )
+
+GCN_DEVICE(gfx9-generic, GFX9_GENERIC, 0x051, ISA_GCN5,
+ /* XNACK default */ HSACO_ATTR_ANY,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_UNSUPPORTED,
+ /* CU mode */ HSACO_ATTR_UNSUPPORTED,
+ /* Max ISA VGPRs */ 256,
+ /* Generic code obj version */ 1,
+ /* Architecture Family */ GFX9,
+ /* Generic Name */ EF_AMDGPU_MACH_UNSUPPORTED
)
/* GCN GFX10.3 (RDNA 2) */
@@ -138,7 +191,63 @@ GCN_DEVICE(gfx1030, GFX1030, 0x36, ISA_RDNA2,
/* CU mode */ HSACO_ATTR_ON,
/* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
/* Generic code obj version */ 0, /* non-generic */
- /* Architecture Family */ GFX10
+ /* Architecture Family */ GFX10,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX10_3_GENERIC
+ )
+
+GCN_DEVICE(gfx1031, GFX1031, 0x37, ISA_RDNA2,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX10,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX10_3_GENERIC
+ )
+
+GCN_DEVICE(gfx1032, GFX1032, 0x38, ISA_RDNA2,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX10,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX10_3_GENERIC
+ )
+
+GCN_DEVICE(gfx1033, GFX1033, 0x39, ISA_RDNA2,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX10,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX10_3_GENERIC
+ )
+
+GCN_DEVICE(gfx1034, GFX1034, 0x3e, ISA_RDNA2,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX10,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX10_3_GENERIC
+ )
+
+GCN_DEVICE(gfx1035, GFX1035, 0x3d, ISA_RDNA2,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX10,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX10_3_GENERIC
)
GCN_DEVICE(gfx1036, GFX1036, 0x45, ISA_RDNA2,
@@ -148,7 +257,8 @@ GCN_DEVICE(gfx1036, GFX1036, 0x45, ISA_RDNA2,
/* CU mode */ HSACO_ATTR_ON,
/* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
/* Generic code obj version */ 0, /* non-generic */
- /* Architecture Family */ GFX10
+ /* Architecture Family */ GFX10,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX10_3_GENERIC
)
GCN_DEVICE(gfx10-3-generic, GFX10_3_GENERIC, 0x053, ISA_RDNA2,
@@ -158,7 +268,8 @@ GCN_DEVICE(gfx10-3-generic, GFX10_3_GENERIC, 0x053, ISA_RDNA2,
/* CU mode */ HSACO_ATTR_ON,
/* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
/* Generic code obj version */ 1,
- /* Architecture Family */ GFX10
+ /* Architecture Family */ GFX10,
+ /* Generic Name */ EF_AMDGPU_MACH_UNSUPPORTED
)
/* GCN GFX11 (RDNA 3) */
@@ -170,7 +281,30 @@ GCN_DEVICE(gfx1100, GFX1100, 0x41, ISA_RDNA3,
/* CU mode */ HSACO_ATTR_ON,
/* Max ISA VGPRs */ 1536, /* 1536 SIMD32 = 768 wavefrontsize64. */
/* Generic code obj version */ 0, /* non-generic */
- /* Architecture Family */ GFX11
+ /* Architecture Family */ GFX11,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX11_GENERIC
+ )
+
+GCN_DEVICE(gfx1101, GFX1101, 0x46, ISA_RDNA3,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 1536,
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX11,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX11_GENERIC
+ )
+
+GCN_DEVICE(gfx1102, GFX1102, 0x47, ISA_RDNA3,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 1536,
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX11,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX11_GENERIC
)
GCN_DEVICE(gfx1103, GFX1103, 0x44, ISA_RDNA3,
@@ -180,7 +314,52 @@ GCN_DEVICE(gfx1103, GFX1103, 0x44, ISA_RDNA3,
/* CU mode */ HSACO_ATTR_ON,
/* Max ISA VGPRs */ 1536,
/* Generic code obj version */ 0, /* non-generic */
- /* Architecture Family */ GFX11
+ /* Architecture Family */ GFX11,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX11_GENERIC
+ )
+
+GCN_DEVICE(gfx1150, GFX1150, 0x43, ISA_RDNA3,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 1536,
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX11,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX11_GENERIC
+ )
+
+GCN_DEVICE(gfx1151, GFX1151, 0x4a, ISA_RDNA3,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 1536,
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX11,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX11_GENERIC
+ )
+
+GCN_DEVICE(gfx1152, GFX1152, 0x55, ISA_RDNA3,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 1536,
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX11,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX11_GENERIC
+ )
+
+GCN_DEVICE(gfx1153, GFX1153, 0x58, ISA_RDNA3,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 1536,
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX11,
+ /* Generic Name */ EF_AMDGPU_MACH_AMDGCN_GFX11_GENERIC
)
GCN_DEVICE(gfx11-generic, GFX11_GENERIC, 0x054, ISA_RDNA3,
@@ -190,7 +369,8 @@ GCN_DEVICE(gfx11-generic, GFX11_GENERIC, 0x054, ISA_RDNA3,
/* CU mode */ HSACO_ATTR_ON,
/* Max ISA VGPRs */ 1536,
/* Generic code obj version */ 1,
- /* Architecture Family */ GFX11
+ /* Architecture Family */ GFX11,
+ /* Generic Name */ EF_AMDGPU_MACH_UNSUPPORTED
)
#undef GCN_DEVICE
diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
index 4200cfaf006..82fc6ff1e41 100644
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
@@ -101,7 +101,8 @@ static hash_map<tree, int> lds_allocs;
/* Import all the data from gcn-devices.def.
The PROCESSOR_GFXnnn should be indices for this table. */
const struct gcn_device_def gcn_devices[] = {
-#define GCN_DEVICE(name, NAME, ELF, ISA, XNACK, SRAMECC, WAVE64, CU, VGPRS, GEN_VER,ARCH_FAM) \
+#define GCN_DEVICE(name, NAME, ELF, ISA, XNACK, SRAMECC, WAVE64, CU, VGPRS, \
+ GEN_VER, ARCH_FAM, ...) \
{PROCESSOR_ ## NAME, #name, #NAME, ISA, XNACK, SRAMECC, WAVE64, CU, VGPRS, \
GEN_VER, #ARCH_FAM},
#include "gcn-devices.def"
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 3b9f56b0529..4a6bba016a3 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -3998,7 +3998,13 @@ Instead of GNU Binutils, you will need to install LLVM 15, or later, and copy
@file{bin/llvm-ar} to both @file{bin/amdgcn-amdhsa-ar} and
@file{bin/amdgcn-amdhsa-ranlib}. Note that LLVM 13.0.1 or LLVM 14 can be used
by specifying a @code{--with-multilib-list=} that does not list @code{gfx1100}
-and @code{gfx1103}.
+and @code{gfx1103}. LLVM 19, or later, is required for the generic
+@code{gfx9-generic}, @code{gfx10-3-generic}, and @code{gfx11-generic} targets;
+note that linking of generic and non-generic code is not supported and that
+at least ROCm 6.3 is required for generic architectures. As the list of ISA
+is long and linking requires an exact match, GCC by default only builds a
+subset of the supported ISA as multilib; use @code{--with-multilib-list=} to
+tailor the built multilibs.
Use Newlib (4.3.0 or newer; 4.4.0 contains some improvements and 4.5.0 fixes
the device console output for GFX10 and GFX11 devices).
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9050ffa59dd..26664fb978b 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -22289,30 +22289,83 @@ are
@item gfx900
Compile for GCN5 Vega 10 devices (gfx900).
+@item gfx902
+Compile for GCN5 Vega gfx902 devices.
+
+@item gfx904
+Compile for GCN5 Vega gfx904 devices.
+
@item gfx906
Compile for GCN5 Vega 20 devices (gfx906).
@item gfx908
Compile for CDNA1 Instinct MI100 series devices (gfx908).
+@item gfx909
+Compile for GCN5 Vega gfx909 devices.
+
@item gfx90a
Compile for CDNA2 Instinct MI200 series devices (gfx90a).
@item gfx90c
Compile for GCN5 Vega 7 devices (gfx90c).
+@item gfx9-generic
+Compile generic code for Vega devices, executable on the following subset of
+GFX9 devices: gfx900, gfx902, gfx904, gfx906, gfx909 and gfx90c.
+
@item gfx1030
Compile for RDNA2 gfx1030 devices (GFX10 series).
+@item gfx1031
+Compile for RDNA2 gfx1031 devices (GFX10 series).
+
+@item gfx1032
+Compile for RDNA2 gfx1032 devices (GFX10 series).
+
+@item gfx1033
+Compile for RDNA2 gfx1033 devices (GFX10 series).
+
+@item gfx1034
+Compile for RDNA2 gfx1034 devices (GFX10 series).
+
+@item gfx1035
+Compile for RDNA2 gfx1035 devices (GFX10 series).
+
@item gfx1036
Compile for RDNA2 gfx1036 devices (GFX10 series).
+@item gfx10-3-generic
+Compile generic code for GFX10-3 devices, executable on gfx1030,
+gfx1031, gfx1032, gfx1033, gfx1034, gfx1035, and gfx1036.
+
@item gfx1100
Compile for RDNA3 gfx1100 devices (GFX11 series).
+@item gfx1101
+Compile for RDNA3 gfx1101 devices (GFX11 series).
+
+@item gfx1102
+Compile for RDNA3 gfx1102 devices (GFX11 series).
+
@item gfx1103
Compile for RDNA3 gfx1103 devices (GFX11 series).
+@item gfx1150
+Compile for RDNA3 gfx1150 devices (GFX11 series).
+
+@item gfx1151
+Compile for RDNA3 gfx1151 devices (GFX11 series).
+
+@item gfx1152
+Compile for RDNA3 gfx1152 devices (GFX11 series).
+
+@item gfx1153
+Compile for RDNA3 gfx1153 devices (GFX11 series).
+
+@item gfx11-generic
+Compile generic code for GFX11 devices, executable on gfx1100, gfx1101,
+gfx1102, gfx1103, gfx1150, gfx1151, gfx1152, and gfx1153.
@end table
@opindex msram-ecc
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index 8015a6f80f3..3d04eef2dd0 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -1694,6 +1694,22 @@ isa_code(const char *isa) {
return EF_AMDGPU_MACH_UNSUPPORTED;
}
+/* For any ISA code, return the associated generic ISA code, if any. Code
+ compiled for that generic ISA can also run on devices with ISA. */
+
+static gcn_isa
+get_generic_isa_code (const gcn_isa isa)
+{
+ switch (isa)
+ {
+#define GCN_DEVICE(name, NAME, ELF, ISA, XNACK, SRAM, WAVE64, CU, \
+ MAX_ISA_VGPRS, GEN_VER, ARCH_FAM, GEN_MACH, ...) \
+ case EF_AMDGPU_MACH_AMDGCN_ ## NAME: return GEN_MACH;
+#include "../../gcc/config/gcn/gcn-devices.def"
+ default: return EF_AMDGPU_MACH_UNSUPPORTED;
+ }
+}
+
/* CDNA2 devices have twice as many VGPRs compared to older devices. */
static int
@@ -2412,19 +2428,31 @@ isa_matches_agent (struct agent_info *agent, Elf64_Ehdr *image)
return false;
}
- if (isa_field != agent->device_isa)
+ gcn_isa generic_isa = get_generic_isa_code (agent->device_isa);
+ if (isa_field != agent->device_isa && isa_field != generic_isa)
{
- char msg[204];
+ char msg[265];
const char *agent_isa_s = isa_name (agent->device_isa);
assert (agent_isa_s);
- snprintf (msg, sizeof msg,
- "GCN code object ISA '%s' does not match GPU ISA '%s' "
- "(device %d).\n"
- "Try to recompile with '-foffload-options=-march=%s',\n"
- "or use ROCR_VISIBLE_DEVICES to disable incompatible "
- "devices.\n",
- isa_s, agent_isa_s, agent->device_id, agent_isa_s);
+ if (generic_isa != EF_AMDGPU_MACH_UNSUPPORTED)
+ snprintf (msg, sizeof msg,
+ "GCN code object ISA '%s' does not match GPU ISA '%s' "
+ "(device %d).\n"
+ "Try to recompile with '-foffload-options=-march=%s' or "
+ "-foffload-options=-march=%s\n"
+ "or use ROCR_VISIBLE_DEVICES to disable incompatible "
+ "devices.\n",
+ isa_s, agent_isa_s, agent->device_id, agent_isa_s,
+ isa_name (generic_isa));
+ else
+ snprintf (msg, sizeof msg,
+ "GCN code object ISA '%s' does not match GPU ISA '%s' "
+ "(device %d).\n"
+ "Try to recompile with '-foffload-options=-march=%s',\n"
+ "or use ROCR_VISIBLE_DEVICES to disable incompatible "
+ "devices.\n",
+ isa_s, agent_isa_s, agent->device_id, agent_isa_s);
hsa_error (msg, HSA_STATUS_ERROR);
return false;
@@ -2482,6 +2510,11 @@ create_and_finalize_hsa_program (struct agent_info *agent)
(agent->executable, agent->id, co, "");
if (status != HSA_STATUS_SUCCESS)
{
+ if (status == HSA_STATUS_ERROR_INVALID_CODE_OBJECT)
+ GOMP_PLUGIN_error ("GCN code object ISA '%s' running on GPU ISA "
+ "'%s' (device %d)",
+ isa_name (elf_gcn_isa_field (image)),
+ isa_name (agent->device_isa), agent->device_id);
hsa_error ("Could not load GCN code object", status);
goto fail;
}