Updated version; changes:
* Fixed the SRAMECC setting for gfx906: it should be ANY (as in GCC 14),
not UNSUPPORTED. (This shows up as a link error when compiling with '-g'.)
* Fixed the generic handling for gfx9*: a copy'n'paste bug, as
gfx11-generic is clearly wrong there.
* Implemented automatic conversion (with a warning) from
-march=gfx<specific> to -march=gfx*-generic if the latter but not the
former has configured run-time libraries.
* Largely improved the runtime diagnostic.
* The plugin now also uses the non-deprecated loader functions. I had
hoped that this would help, but it makes no difference. As the hsa.h
header shipping with GCC is quite old, I think the change should be fine:
I doubt that there is any libhsa*.so in use that doesn't support them.
On the other hand, as it doesn't solve a known issue, we don't have to
update it.
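The automatic conversion can be pictured with a small stand-alone sketch.
It is not the mkoffload.cc code from the patch: the helper names
multilib_has_arch and pick_arch are made up for illustration, and the
special handling of the configured default arch is left out. Only the
"march=gfx900/march=gfx906" list format is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return true if LIST (for example
   "march=gfx900/march=gfx906") contains "march=NAME" as a complete
   '/'-separated entry.  */
static bool
multilib_has_arch (const char *list, const char *name)
{
  assert (list != NULL && name != NULL);
  const size_t prefix_len = strlen ("march=");
  const size_t name_len = strlen (name);

  for (const char *p = list; (p = strstr (p, "march=")) != NULL; p++)
    {
      /* The entry must start the list or directly follow a '/'.  */
      bool at_start = (p == list || p[-1] == '/');
      const char *arch = p + prefix_len;
      /* Require a full-name match: "gfx90" must not match "gfx900".  */
      if (at_start
          && strncmp (arch, name, name_len) == 0
          && (arch[name_len] == '\0' || arch[name_len] == '/'))
        return true;
    }
  return false;
}

/* Hypothetical helper: keep SPECIFIC if a multilib exists for it;
   otherwise fall back to GENERIC if that one has a multilib.  GENERIC
   may be NULL when the device has no associated generic ISA.  */
static const char *
pick_arch (const char *list, const char *specific, const char *generic)
{
  if (multilib_has_arch (list, specific))
    return specific;
  if (generic != NULL && multilib_has_arch (list, generic))
    return generic;
  return specific;
}
```

With a compiler built only for "march=gfx9-generic/march=gfx1100",
pick_arch (list, "gfx906", "gfx9-generic") yields "gfx9-generic", which is
the case where the real code emits the warning.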
* * *
Thus, it should work in principle. BUT:
I now have ROCm 6.3.2 available, and it DOES NOT work.
ROCm 6.3.2 supports gfx*generic, as a grep shows.
Trying a hello-world program compiled with -march=gfx9-generic and running
it with LOADER_ENABLE_LOGGING=1 set, it fails with:
LoaderError: code object's ISA (amdgcn-amd-amdhsa--gfx9-generic) is invalid
and returns HSA_STATUS_ERROR_INVALID_ISA_NAME.
Looking at https://github.com/ROCm/ROCR-Runtime, I have *no* idea why it
fails: the name "amdgcn-amd-amdhsa--gfx9-generic" should be in the map as a
generic name, and there is nothing suspicious about the lookup. The failure
also occurs before any compatibility check against the actual hardware,
which is the next check.
'gfx11-generic' has been supported a bit longer and fails identically.
(The hardware is gfx906, but that shouldn't matter.)
As I assume that someone has tested it, some variant should work.
The question is only which one. The library has too many indirections,
but all the code is rather simple and I fail to see why it wouldn't
work. (Some code is not obvious, but as gfx906 works, those
bits are probably fine.)
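When debugging such a load failure, one sanity check is whether the code
object is really marked as generic. A minimal sketch, assuming the e_flags
layout used in this patch (generic version in the top byte: mask
0xff000000, offset 24) and the 0xff machine mask of the AMDGPU ELF header.
The function names and the sample e_flags values are made up for
illustration; a real value would come from the ELF header of the offload
image.

```c
#include <assert.h>
#include <stdint.h>

#define EF_AMDGPU_MACH_MASK 0x000000ffu
#define EF_AMDGPU_GENERIC_VERSION_V 0xff000000u /* Mask.  */
#define EF_AMDGPU_GENERIC_VERSION_OFFSET 24

/* Machine code, e.g. 0x2f for gfx906 or 0x51 for gfx9-generic.  */
static unsigned
eflags_mach (uint32_t e_flags)
{
  return e_flags & EF_AMDGPU_MACH_MASK;
}

/* Generic code object version; 0 means "not generic".  */
static unsigned
eflags_generic_version (uint32_t e_flags)
{
  return (e_flags & EF_AMDGPU_GENERIC_VERSION_V)
         >> EF_AMDGPU_GENERIC_VERSION_OFFSET;
}

/* Exercise the helpers with two hypothetical e_flags values.  */
static void
self_check (void)
{
  /* Generic version 1 with the gfx9-generic machine code.  */
  assert (eflags_mach (0x01000051u) == 0x51);
  assert (eflags_generic_version (0x01000051u) == 1);
  /* Plain gfx906: machine code only, no generic version.  */
  assert (eflags_mach (0x0000002fu) == 0x2f);
  assert (eflags_generic_version (0x0000002fu) == 0);
}
```

If the generic version decodes to 0 for an image built with
-march=gfx9-generic, the problem would be on the compiler/assembler side
rather than in the loader.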
* * *
Thus, if anyone has an idea how to make progress from here ...
BTW: The gfx906 SRAMECC fix should be committed separately and soon, before
we forget it and before GCC 15 ships.
Tobias
[GCN] Handle generic ISA names in libgomp's plugin-gcn.c
There are plenty of AMD GPUs, each having its own ISA; to reduce the number
of ISAs to compile for, some of them were bundled under a generic name,
supporting either all features or a large subset of them. Initial support
for this was added in r15-4550-g1bdeebe69b71bf. On the assembler/linker
side, it requires LLVM 19; on the runtime side, ROCm 6.3.x (released in
December 2024).
This commit adds gfx9-generic alongside gfx10-3-generic and gfx11-generic,
plus all previously unsupported specific devices covered by any of
them, i.e. gfx902, gfx904, gfx909, gfx1031, gfx1032, gfx1033, gfx1034,
gfx1035, gfx1101, gfx1102, gfx1150, gfx1151, gfx1152, and gfx1153.
Those are marked as 'experimental' as they have not been tested and
also because not all features of, e.g., gfx115x are supported, only
those of the common subset as provided by gfx11-generic.
While gfx10-3-generic and gfx11-generic cover all GPUs of their respective
series, gfx9-generic only covers a small subset; in particular, the compute
devices MI100 and MI200 are not included.
On the runtime side, the assumption is that all generic code will work;
only when ROCm (actually: ROCr) fails to load it is a specific error
message printed, providing some suggestions for how to fix the
failure.
On the compiler side: In order to make it easier to ship only the
generic libraries, the compiler will fall back to the generic version
if a (multi)lib exists for it but not for the specific version.
gcc/ChangeLog:
* config/gcn/gcn-devices.def (GCN_DEVICE): Add field for
the generic ISA of the device.
Fix sramecc setting of gfx906 - should be ANY not UNSUPPORTED.
* config/gcn/gcn-tables.opt: Regenerate.
* config/gcn/gcn.cc (gcn_devices): Add missing trailing ... to
GCN_DEVICE macro #define.
* config/gcn/mkoffload.cc (enum elf_arch_code): Add
EF_AMDGPU_MACH_AMDGCN_NONE.
(elf_arch): Use enum elf_arch_code as type.
(tool_cleanup): Silence warning by removing trailing '.' from error.
(get_arch): Return enum elf_arch_code.
(get_arch_name): New.
(elf_arch_generic_update): New; replace specific device by
generic device if only the latter has a multilib.
(main): Call it; replace -march= as needed.
* doc/install.texi (AMD GCN): Document gfx*-generic handling
and dependencies.
* doc/invoke.texi (AMD GCN): Update -march= for
gfx{9,10-3,11}-generic and added compatible gfx* devices.
libgomp/ChangeLog:
* plugin/plugin-gcn.c (EF_AMDGPU_MACH): Add EF_AMDGPU_MACH_AMDGCN_NONE.
(EF_AMDGPU_GENERIC_VERSION_V, EF_AMDGPU_GENERIC_VERSION_OFFSET,
GET_GENERIC_VERSION): Define.
(elf_gcn_isa_is_generic, get_generic_isa_code): New.
(isa_matches_agent): Add 'status' argument; if it is HSA_STATUS_SUCCESS,
accept all generic code; otherwise, print extended diagnostic.
(create_and_finalize_hsa_program): Call it also for
ERROR_INVALID_CODE_OBJECT and output more diagnostic if the error is
different.
(struct hsa_runtime_fn_info, init_hsa_runtime_functions,
create_and_finalize_hsa_program): Replace deprecated
hsa_executable_load_code_object by
hsa_executable_load_agent_code_object and
hsa_code_object_deserialize by
hsa_code_object_reader_create_from_memory and
hsa_code_object_reader_destroy.
gcc/config/gcn/gcn-devices.def | 204 ++++++++++++++++++++++++++++++++++++++---
gcc/config/gcn/gcn-tables.opt | 45 +++++++++
gcc/config/gcn/gcn.cc | 3 +-
gcc/config/gcn/mkoffload.cc | 115 +++++++++++++++++++++--
gcc/doc/install.texi | 10 +-
gcc/doc/invoke.texi | 53 +++++++++++
libgomp/plugin/plugin-gcn.c | 166 ++++++++++++++++++++++++---------
7 files changed, 528 insertions(+), 68 deletions(-)
diff --git a/gcc/config/gcn/gcn-devices.def b/gcc/config/gcn/gcn-devices.def
index 7d47a7b495d..af1420382e2 100644
--- a/gcc/config/gcn/gcn-devices.def
+++ b/gcc/config/gcn/gcn-devices.def
@@ -71,6 +71,10 @@
generated by the used llvm-mc assembler.
10 "Architecture Family Name" (string, external)
Used to #define '__GFX<...>__'.
+ 11 "GENERIC NAME" (text, external)
+ The name of the generic ISA this device is compatible with or "NONE",
+ where the generic name is the NAME (field 2) of the associated
+ generic device.
Fields marked "external", above, have values defined elsewhere (HSA, ROCM,
LLVM, ELF, etc.) and must have matching definitions here. Fields marked
@@ -86,17 +90,41 @@ GCN_DEVICE(gfx900, GFX900, 0x2c, ISA_GCN5,
/* CU mode */ HSACO_ATTR_UNSUPPORTED,
/* Max ISA VGPRs */ 256,
/* Generic code obj version */ 0, /* non-generic */
- /* Architecture Family */ GFX9
+ /* Architecture Family */ GFX9,
+ /* Generic Name */ GFX9_GENERIC
)
-GCN_DEVICE(gfx906, GFX906, 0x2f, ISA_GCN5,
+GCN_DEVICE(gfx902, GFX902, 0x2d, ISA_GCN5,
+ /* XNACK default */ HSACO_ATTR_OFF,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_UNSUPPORTED,
+ /* CU mode */ HSACO_ATTR_UNSUPPORTED,
+ /* Max ISA VGPRs */ 256,
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX9,
+ /* Generic Name */ GFX9_GENERIC
+ )
+
+GCN_DEVICE(gfx904, GFX904, 0x2e, ISA_GCN5,
/* XNACK default */ HSACO_ATTR_OFF,
/* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
/* WAVE64 mode */ HSACO_ATTR_UNSUPPORTED,
/* CU mode */ HSACO_ATTR_UNSUPPORTED,
/* Max ISA VGPRs */ 256,
/* Generic code obj version */ 0, /* non-generic */
- /* Architecture Family */ GFX9
+ /* Architecture Family */ GFX9,
+ /* Generic Name */ GFX9_GENERIC
+ )
+
+GCN_DEVICE(gfx906, GFX906, 0x2f, ISA_GCN5,
+ /* XNACK default */ HSACO_ATTR_OFF,
+ /* SRAM_ECC default */ HSACO_ATTR_ANY,
+ /* WAVE64 mode */ HSACO_ATTR_UNSUPPORTED,
+ /* CU mode */ HSACO_ATTR_UNSUPPORTED,
+ /* Max ISA VGPRs */ 256,
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX9,
+ /* Generic Name */ GFX9_GENERIC
)
GCN_DEVICE(gfx908, GFX908, 0x30, ISA_CDNA1,
@@ -106,7 +134,19 @@ GCN_DEVICE(gfx908, GFX908, 0x30, ISA_CDNA1,
/* CU mode */ HSACO_ATTR_UNSUPPORTED,
/* Max ISA VGPRs */ 256,
/* Generic code obj version */ 0, /* non-generic */
- /* Architecture Family */ GFX9
+ /* Architecture Family */ GFX9,
+ /* Generic Name */ NONE
+ )
+
+GCN_DEVICE(gfx909, GFX909, 0x31, ISA_GCN5,
+ /* XNACK default */ HSACO_ATTR_ANY,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_UNSUPPORTED,
+ /* CU mode */ HSACO_ATTR_UNSUPPORTED,
+ /* Max ISA VGPRs */ 256,
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX9,
+ /* Generic Name */ GFX9_GENERIC
)
GCN_DEVICE(gfx90a, GFX90A, 0x3f, ISA_CDNA2,
@@ -116,7 +156,8 @@ GCN_DEVICE(gfx90a, GFX90A, 0x3f, ISA_CDNA2,
/* CU mode */ HSACO_ATTR_UNSUPPORTED,
/* Max ISA VGPRs */ 512,
/* Generic code obj version */ 0, /* non-generic */
- /* Architecture Family */ GFX9
+ /* Architecture Family */ GFX9,
+ /* Generic Name */ NONE
)
GCN_DEVICE(gfx90c, GFX90C, 0x32, ISA_GCN5,
@@ -126,7 +167,19 @@ GCN_DEVICE(gfx90c, GFX90C, 0x32, ISA_GCN5,
/* CU mode */ HSACO_ATTR_UNSUPPORTED,
/* Max ISA VGPRs */ 256,
/* Generic code obj version */ 0, /* non-generic */
- /* Architecture Family */ GFX9
+ /* Architecture Family */ GFX9,
+ /* Generic Name */ GFX9_GENERIC
+ )
+
+GCN_DEVICE(gfx9-generic, GFX9_GENERIC, 0x051, ISA_GCN5,
+ /* XNACK default */ HSACO_ATTR_ANY,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_UNSUPPORTED,
+ /* CU mode */ HSACO_ATTR_UNSUPPORTED,
+ /* Max ISA VGPRs */ 256,
+ /* Generic code obj version */ 1,
+ /* Architecture Family */ GFX9,
+ /* Generic Name */ NONE
)
/* GCN GFX10.3 (RDNA 2) */
@@ -138,7 +191,63 @@ GCN_DEVICE(gfx1030, GFX1030, 0x36, ISA_RDNA2,
/* CU mode */ HSACO_ATTR_ON,
/* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
/* Generic code obj version */ 0, /* non-generic */
- /* Architecture Family */ GFX10
+ /* Architecture Family */ GFX10,
+ /* Generic Name */ GFX10_3_GENERIC
+ )
+
+GCN_DEVICE(gfx1031, GFX1031, 0x37, ISA_RDNA2,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX10,
+ /* Generic Name */ GFX10_3_GENERIC
+ )
+
+GCN_DEVICE(gfx1032, GFX1032, 0x38, ISA_RDNA2,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX10,
+ /* Generic Name */ GFX10_3_GENERIC
+ )
+
+GCN_DEVICE(gfx1033, GFX1033, 0x39, ISA_RDNA2,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX10,
+ /* Generic Name */ GFX10_3_GENERIC
+ )
+
+GCN_DEVICE(gfx1034, GFX1034, 0x3e, ISA_RDNA2,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX10,
+ /* Generic Name */ GFX10_3_GENERIC
+ )
+
+GCN_DEVICE(gfx1035, GFX1035, 0x3d, ISA_RDNA2,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX10,
+ /* Generic Name */ GFX10_3_GENERIC
)
GCN_DEVICE(gfx1036, GFX1036, 0x45, ISA_RDNA2,
@@ -148,7 +257,8 @@ GCN_DEVICE(gfx1036, GFX1036, 0x45, ISA_RDNA2,
/* CU mode */ HSACO_ATTR_ON,
/* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
/* Generic code obj version */ 0, /* non-generic */
- /* Architecture Family */ GFX10
+ /* Architecture Family */ GFX10,
+ /* Generic Name */ GFX10_3_GENERIC
)
GCN_DEVICE(gfx10-3-generic, GFX10_3_GENERIC, 0x053, ISA_RDNA2,
@@ -158,7 +268,8 @@ GCN_DEVICE(gfx10-3-generic, GFX10_3_GENERIC, 0x053, ISA_RDNA2,
/* CU mode */ HSACO_ATTR_ON,
/* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
/* Generic code obj version */ 1,
- /* Architecture Family */ GFX10
+ /* Architecture Family */ GFX10,
+ /* Generic Name */ NONE
)
/* GCN GFX11 (RDNA 3) */
@@ -170,7 +281,30 @@ GCN_DEVICE(gfx1100, GFX1100, 0x41, ISA_RDNA3,
/* CU mode */ HSACO_ATTR_ON,
/* Max ISA VGPRs */ 1536, /* 1536 SIMD32 = 768 wavefrontsize64. */
/* Generic code obj version */ 0, /* non-generic */
- /* Architecture Family */ GFX11
+ /* Architecture Family */ GFX11,
+ /* Generic Name */ GFX11_GENERIC
+ )
+
+GCN_DEVICE(gfx1101, GFX1101, 0x46, ISA_RDNA3,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 1536,
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX11,
+ /* Generic Name */ GFX11_GENERIC
+ )
+
+GCN_DEVICE(gfx1102, GFX1102, 0x47, ISA_RDNA3,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 1536,
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX11,
+ /* Generic Name */ GFX11_GENERIC
)
GCN_DEVICE(gfx1103, GFX1103, 0x44, ISA_RDNA3,
@@ -180,7 +314,52 @@ GCN_DEVICE(gfx1103, GFX1103, 0x44, ISA_RDNA3,
/* CU mode */ HSACO_ATTR_ON,
/* Max ISA VGPRs */ 1536,
/* Generic code obj version */ 0, /* non-generic */
- /* Architecture Family */ GFX11
+ /* Architecture Family */ GFX11,
+ /* Generic Name */ GFX11_GENERIC
+ )
+
+GCN_DEVICE(gfx1150, GFX1150, 0x43, ISA_RDNA3,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 1536,
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX11,
+ /* Generic Name */ GFX11_GENERIC
+ )
+
+GCN_DEVICE(gfx1151, GFX1151, 0x4a, ISA_RDNA3,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 1536,
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX11,
+ /* Generic Name */ GFX11_GENERIC
+ )
+
+GCN_DEVICE(gfx1152, GFX1152, 0x55, ISA_RDNA3,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 1536,
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX11,
+ /* Generic Name */ GFX11_GENERIC
+ )
+
+GCN_DEVICE(gfx1153, GFX1153, 0x58, ISA_RDNA3,
+ /* XNACK default */ HSACO_ATTR_UNSUPPORTED,
+ /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
+ /* WAVE64 mode */ HSACO_ATTR_ON,
+ /* CU mode */ HSACO_ATTR_ON,
+ /* Max ISA VGPRs */ 1536,
+ /* Generic code obj version */ 0, /* non-generic */
+ /* Architecture Family */ GFX11,
+ /* Generic Name */ GFX11_GENERIC
)
GCN_DEVICE(gfx11-generic, GFX11_GENERIC, 0x054, ISA_RDNA3,
@@ -190,7 +369,8 @@ GCN_DEVICE(gfx11-generic, GFX11_GENERIC, 0x054, ISA_RDNA3,
/* CU mode */ HSACO_ATTR_ON,
/* Max ISA VGPRs */ 1536,
/* Generic code obj version */ 1,
- /* Architecture Family */ GFX11
+ /* Architecture Family */ GFX11,
+ /* Generic Name */ NONE
)
#undef GCN_DEVICE
diff --git a/gcc/config/gcn/gcn-tables.opt b/gcc/config/gcn/gcn-tables.opt
index be21af425e7..96ce9bd2df3 100644
--- a/gcc/config/gcn/gcn-tables.opt
+++ b/gcc/config/gcn/gcn-tables.opt
@@ -27,21 +27,48 @@ GCN GPU type to use:
EnumValue
Enum(gpu_type) String(gfx900) Value(PROCESSOR_GFX900)
+EnumValue
+Enum(gpu_type) String(gfx902) Value(PROCESSOR_GFX902)
+
+EnumValue
+Enum(gpu_type) String(gfx904) Value(PROCESSOR_GFX904)
+
EnumValue
Enum(gpu_type) String(gfx906) Value(PROCESSOR_GFX906)
EnumValue
Enum(gpu_type) String(gfx908) Value(PROCESSOR_GFX908)
+EnumValue
+Enum(gpu_type) String(gfx909) Value(PROCESSOR_GFX909)
+
EnumValue
Enum(gpu_type) String(gfx90a) Value(PROCESSOR_GFX90A)
EnumValue
Enum(gpu_type) String(gfx90c) Value(PROCESSOR_GFX90C)
+EnumValue
+Enum(gpu_type) String(gfx9-generic) Value(PROCESSOR_GFX9_GENERIC)
+
EnumValue
Enum(gpu_type) String(gfx1030) Value(PROCESSOR_GFX1030)
+EnumValue
+Enum(gpu_type) String(gfx1031) Value(PROCESSOR_GFX1031)
+
+EnumValue
+Enum(gpu_type) String(gfx1032) Value(PROCESSOR_GFX1032)
+
+EnumValue
+Enum(gpu_type) String(gfx1033) Value(PROCESSOR_GFX1033)
+
+EnumValue
+Enum(gpu_type) String(gfx1034) Value(PROCESSOR_GFX1034)
+
+EnumValue
+Enum(gpu_type) String(gfx1035) Value(PROCESSOR_GFX1035)
+
EnumValue
Enum(gpu_type) String(gfx1036) Value(PROCESSOR_GFX1036)
@@ -51,8 +78,26 @@ Enum(gpu_type) String(gfx10-3-generic) Value(PROCESSOR_GFX10_3_GENERIC)
EnumValue
Enum(gpu_type) String(gfx1100) Value(PROCESSOR_GFX1100)
+EnumValue
+Enum(gpu_type) String(gfx1101) Value(PROCESSOR_GFX1101)
+
+EnumValue
+Enum(gpu_type) String(gfx1102) Value(PROCESSOR_GFX1102)
+
EnumValue
Enum(gpu_type) String(gfx1103) Value(PROCESSOR_GFX1103)
+EnumValue
+Enum(gpu_type) String(gfx1150) Value(PROCESSOR_GFX1150)
+
+EnumValue
+Enum(gpu_type) String(gfx1151) Value(PROCESSOR_GFX1151)
+
+EnumValue
+Enum(gpu_type) String(gfx1152) Value(PROCESSOR_GFX1152)
+
+EnumValue
+Enum(gpu_type) String(gfx1153) Value(PROCESSOR_GFX1153)
+
EnumValue
Enum(gpu_type) String(gfx11-generic) Value(PROCESSOR_GFX11_GENERIC)
diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
index 4200cfaf006..82fc6ff1e41 100644
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
@@ -101,7 +101,8 @@ static hash_map<tree, int> lds_allocs;
/* Import all the data from gcn-devices.def.
The PROCESSOR_GFXnnn should be indices for this table. */
const struct gcn_device_def gcn_devices[] = {
-#define GCN_DEVICE(name, NAME, ELF, ISA, XNACK, SRAMECC, WAVE64, CU, VGPRS, GEN_VER,ARCH_FAM) \
+#define GCN_DEVICE(name, NAME, ELF, ISA, XNACK, SRAMECC, WAVE64, CU, VGPRS, \
+ GEN_VER, ARCH_FAM, ...) \
{PROCESSOR_ ## NAME, #name, #NAME, ISA, XNACK, SRAMECC, WAVE64, CU, VGPRS, \
GEN_VER, #ARCH_FAM},
#include "gcn-devices.def"
diff --git a/gcc/config/gcn/mkoffload.cc b/gcc/config/gcn/mkoffload.cc
index 92e8fe70c12..39be6630fd0 100644
--- a/gcc/config/gcn/mkoffload.cc
+++ b/gcc/config/gcn/mkoffload.cc
@@ -53,6 +53,7 @@
/* Extract the EF_AMDGPU_MACH_AMDGCN_GFXnnn from the def file. */
enum elf_arch_code {
+ EF_AMDGPU_MACH_AMDGCN_NONE = -1, /* For generic handling. */
#define GCN_DEVICE(name, NAME, ELF_ARCH, ...) \
EF_AMDGPU_MACH_AMDGCN_ ## NAME = ELF_ARCH,
#include "gcn-devices.def"
@@ -135,9 +136,8 @@ static struct obstack files_to_cleanup;
enum offload_abi offload_abi = OFFLOAD_ABI_UNSET;
const char *offload_abi_host_opts = NULL;
-uint32_t elf_arch = EF_AMDGPU_MACH_AMDGCN_GFX900; // Default GPU architecture.
+enum elf_arch_code elf_arch = EF_AMDGPU_MACH_AMDGCN_GFX900; // Default GPU architecture.
uint32_t elf_flags = EF_AMDGPU_FEATURE_SRAMECC_UNSUPPORTED_V4;
-
static int gcn_stack_size = 0; /* Zero means use default. */
/* Delete tempfiles. */
@@ -782,7 +782,7 @@ compile_native (const char *infile, const char *outfile, const char *compiler,
obstack_ptr_grow (&argv_obstack, ".c");
if (!offload_abi_host_opts)
fatal_error (input_location,
- "%<-foffload-abi-host-opts%> not specified.");
+ "%<-foffload-abi-host-opts%> not specified");
obstack_ptr_grow (&argv_obstack, offload_abi_host_opts);
obstack_ptr_grow (&argv_obstack, infile);
obstack_ptr_grow (&argv_obstack, "-c");
@@ -796,16 +796,15 @@ compile_native (const char *infile, const char *outfile, const char *compiler,
obstack_free (&argv_obstack, NULL);
}
-static int
+static enum elf_arch_code
get_arch (const char *str, const char *with_arch_str)
{
/* Use the def file to map the name to the elf_arch_code. */
if (!str) ;
#define GCN_DEVICE(name, NAME, ELF, ...) \
else if (strcmp (str, #name) == 0) \
- return ELF;
+ return (enum elf_arch_code) ELF;
#include "gcn-devices.def"
-#undef GCN_DEVICE
/* else */
error ("unrecognized argument in option %<-march=%s%>", str);
@@ -839,7 +838,92 @@ get_arch (const char *str, const char *with_arch_str)
exit (FATAL_EXIT_CODE);
- return 0;
+ return EF_AMDGPU_MACH_AMDGCN_NONE;
+}
+
+static const char*
+get_arch_name (enum elf_arch_code arch_code)
+{
+ switch (arch_code)
+ {
+#define GCN_DEVICE(name, NAME, ELF, ...) \
+ case EF_AMDGPU_MACH_AMDGCN_ ## NAME: \
+ return #name;
+#include "../../gcc/config/gcn/gcn-devices.def"
+ default: return NULL;
+ }
+}
+
+/* If a generic arch exists and no (multi)lib is available for the chosen
+ arch, default to the generic version, provided that a (multi)lib is
+ configured for it. */
+
+static enum elf_arch_code
+elf_arch_generic_update (enum elf_arch_code elf_arch,
+ enum elf_arch_code default_arch)
+{
+ enum elf_arch_code generic_arch;
+ switch (elf_arch)
+ {
+#define GCN_DEVICE(name, NAME, ELF, ISA, XNACK, SRAM, WAVE64, CU, \
+ MAX_ISA_VGPRS, GEN_VER, ARCH_FAM, GEN_MACH, ...) \
+ case EF_AMDGPU_MACH_AMDGCN_ ## NAME: \
+ generic_arch = EF_AMDGPU_MACH_AMDGCN_ ## GEN_MACH; break;
+#include "../../gcc/config/gcn/gcn-devices.def"
+ default: generic_arch = EF_AMDGPU_MACH_AMDGCN_NONE;
+ }
+
+ /* If not generic or the default arch, the library version exists. */
+ if (generic_arch == EF_AMDGPU_MACH_AMDGCN_NONE || elf_arch == default_arch)
+ return elf_arch;
+
+ /* Search gcn_arch in the multilib config, which might look like
+ "march=gfx900/march=gfx906". */
+ const char *p = multilib_options;
+ const char *q = NULL;
+ const char *isa_name = get_arch_name (elf_arch);
+ while ((q = strstr (p, isa_name)) != NULL)
+ {
+ if (multilib_options + strlen ("march=") <= q
+ && startswith (&q[-strlen ("march=")], "march="))
+ {
+ const char r = q[strlen (isa_name)];
+ /* Accept only a full-name match; otherwise, keep searching. */
+ if (r == '\0' || r == '/')
+ break;
+ }
+ p = q + 1;
+ }
+
+ /* Specified -march= exists in the multilib. */
+ if (q != NULL)
+ return elf_arch;
+
+ /* If no lib, try to find one for the generic arch. */
+ const char *gen_name = get_arch_name (generic_arch);
+ if (generic_arch != default_arch)
+ {
+ p = multilib_options;
+ while ((q = strstr (p, gen_name)) != NULL)
+ {
+ if (multilib_options + strlen ("march=") <= q
+ && startswith (&q[-strlen ("march=")], "march="))
+ {
+ const char r = q[strlen (gen_name)];
+ /* Accept only a full-name match; otherwise, keep searching. */
+ if (r == '\0' || r == '/')
+ break;
+ }
+ p = q + 1;
+ }
+ if (q == NULL)
+ return elf_arch;
+ }
+ warning (0,
+ "using associated generic architecture %<-march=%s%> instead of "
+ "%<-march=%s%> as GCC was only built with a multilib for the former",
+ gen_name, isa_name);
+ return generic_arch;
}
int
@@ -864,6 +948,7 @@ main (int argc, char **argv)
elf_arch = get_arch (configure_default_options[0].value, NULL);
break;
}
+ enum elf_arch_code default_arch = elf_arch;
obstack_init (&files_to_cleanup);
if (atexit (mkoffload_cleanup) != 0)
@@ -998,6 +1083,9 @@ main (int argc, char **argv)
}
}
+ enum elf_arch_code orig_arch = elf_arch;
+ elf_arch = elf_arch_generic_update (elf_arch, default_arch);
+
if (!(fopenacc ^ fopenmp))
fatal_error (input_location,
"either %<-fopenacc%> or %<-fopenmp%> must be set");
@@ -1056,6 +1144,7 @@ main (int argc, char **argv)
case ELF: if (GEN_VER) SET_GENERIC_VERSION (elf_flags, GEN_VER); break;
#include "gcn-devices.def"
#undef GCN_DEVICE
+ case EF_AMDGPU_MACH_AMDGCN_NONE: gcc_unreachable ();
}
/* Build arguments for compiler pass. */
@@ -1072,12 +1161,15 @@ main (int argc, char **argv)
obstack_ptr_grow (&cc_argv_obstack, "-xlto");
if (fopenmp)
obstack_ptr_grow (&cc_argv_obstack, "-mgomp");
-
+ if (orig_arch != elf_arch)
+ obstack_ptr_grow (&cc_argv_obstack,
+ concat ("-march=", get_arch_name (elf_arch), NULL));
for (int ix = 1; ix != argc; ix++)
{
if (!strcmp (argv[ix], "-o") && ix + 1 != argc)
outname = argv[++ix];
- else
+ else if (orig_arch == elf_arch
+ || !startswith (argv[ix], "-march"))
obstack_ptr_grow (&cc_argv_obstack, argv[ix]);
}
@@ -1195,9 +1287,12 @@ main (int argc, char **argv)
for (int i = 1; i < argc; i++)
if (startswith (argv[i], "-l")
|| startswith (argv[i], "-Wl")
- || startswith (argv[i], "-march"))
+ || (orig_arch == elf_arch && startswith (argv[i], "-march")))
obstack_ptr_grow (&ld_argv_obstack, argv[i]);
+ if (orig_arch != elf_arch)
+ obstack_ptr_grow (&ld_argv_obstack,
+ concat ("-march=", get_arch_name (elf_arch), NULL));
obstack_ptr_grow (&cc_argv_obstack, "-dumpdir");
obstack_ptr_grow (&cc_argv_obstack, "");
obstack_ptr_grow (&cc_argv_obstack, "-dumpbase");
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 3b9f56b0529..8e1b6f445ff 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -3998,7 +3998,15 @@ Instead of GNU Binutils, you will need to install LLVM 15, or later, and copy
@file{bin/llvm-ar} to both @file{bin/amdgcn-amdhsa-ar} and
@file{bin/amdgcn-amdhsa-ranlib}. Note that LLVM 13.0.1 or LLVM 14 can be used
by specifying a @code{--with-multilib-list=} that does not list @code{gfx1100}
-and @code{gfx1103}.
+and @code{gfx1103}. LLVM 19, or later, is required for the generic
+@code{gfx9-generic}, @code{gfx10-3-generic}, and @code{gfx11-generic} targets;
+note that linking of generic and non-generic code is not supported and that
+at least ROCm 6.3 is required for generic architectures. As the list of ISAs
+is long and linking requires an exact match, GCC by default only builds a
+subset of the supported ISAs as multilibs; use @code{--with-multilib-list=} to
+tailor the built multilibs. In the installed compiler, if a multilib is only
+available for a generic architecture and not for the specific one, the
+compiler will automatically compile for the generic one instead.
Use Newlib (4.3.0 or newer; 4.4.0 contains some improvements and 4.5.0 fixes
the device console output for GFX10 and GFX11 devices).
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9050ffa59dd..2c133e2a947 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -22289,30 +22289,83 @@ are
@item gfx900
Compile for GCN5 Vega 10 devices (gfx900).
+@item gfx902
+Compile for GCN5 Vega gfx902 devices. (Experimental)
+
+@item gfx904
+Compile for GCN5 Vega gfx904 devices. (Experimental)
+
@item gfx906
Compile for GCN5 Vega 20 devices (gfx906).
@item gfx908
Compile for CDNA1 Instinct MI100 series devices (gfx908).
+@item gfx909
+Compile for GCN5 Vega gfx909 devices. (Experimental)
+
@item gfx90a
Compile for CDNA2 Instinct MI200 series devices (gfx90a).
@item gfx90c
Compile for GCN5 Vega 7 devices (gfx90c).
+@item gfx9-generic
+Compile generic code for Vega devices, executable on the following subset of
+GFX9 devices: gfx900, gfx902, gfx904, gfx906, gfx909 and gfx90c.
+
@item gfx1030
Compile for RDNA2 gfx1030 devices (GFX10 series).
+@item gfx1031
+Compile for RDNA2 gfx1031 devices (GFX10 series). (Experimental)
+
+@item gfx1032
+Compile for RDNA2 gfx1032 devices (GFX10 series). (Experimental)
+
+@item gfx1033
+Compile for RDNA2 gfx1033 devices (GFX10 series). (Experimental)
+
+@item gfx1034
+Compile for RDNA2 gfx1034 devices (GFX10 series). (Experimental)
+
+@item gfx1035
+Compile for RDNA2 gfx1035 devices (GFX10 series). (Experimental)
+
@item gfx1036
Compile for RDNA2 gfx1036 devices (GFX10 series).
+@item gfx10-3-generic
+Compile generic code for GFX10-3 devices, executable on gfx1030,
+gfx1031, gfx1032, gfx1033, gfx1034, gfx1035, and gfx1036.
+
@item gfx1100
Compile for RDNA3 gfx1100 devices (GFX11 series).
+@item gfx1101
+Compile for RDNA3 gfx1101 devices (GFX11 series). (Experimental)
+
+@item gfx1102
+Compile for RDNA3 gfx1102 devices (GFX11 series). (Experimental)
+
@item gfx1103
Compile for RDNA3 gfx1103 devices (GFX11 series).
+@item gfx1150
+Compile for RDNA3 gfx1150 devices (GFX11 series). (Experimental)
+
+@item gfx1151
+Compile for RDNA3 gfx1151 devices (GFX11 series). (Experimental)
+
+@item gfx1152
+Compile for RDNA3 gfx1152 devices (GFX11 series). (Experimental)
+
+@item gfx1153
+Compile for RDNA3 gfx1153 devices (GFX11 series). (Experimental)
+
+@item gfx11-generic
+Compile generic code for GFX11 devices, executable on gfx1100, gfx1101,
+gfx1102, gfx1103, gfx1150, gfx1151, gfx1152, and gfx1153.
@end table
@opindex msram-ecc
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index 8015a6f80f3..07ade60d171 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -151,9 +151,10 @@ struct hsa_runtime_fn_info
const char *options, hsa_executable_t *executable);
hsa_status_t (*hsa_executable_global_variable_define_fn)
(hsa_executable_t executable, const char *variable_name, void *address);
- hsa_status_t (*hsa_executable_load_code_object_fn)
+ hsa_status_t (*hsa_executable_load_agent_code_object_fn)
(hsa_executable_t executable, hsa_agent_t agent,
- hsa_code_object_t code_object, const char *options);
+ hsa_code_object_reader_t reader, const char *options,
+ hsa_loaded_code_object_t *code_object);
hsa_status_t (*hsa_executable_freeze_fn)(hsa_executable_t executable,
const char *options);
hsa_status_t (*hsa_signal_create_fn) (hsa_signal_value_t initial_value,
@@ -193,9 +194,10 @@ struct hsa_runtime_fn_info
hsa_signal_value_t (*hsa_signal_load_acquire_fn) (hsa_signal_t signal);
hsa_status_t (*hsa_queue_destroy_fn) (hsa_queue_t *queue);
- hsa_status_t (*hsa_code_object_deserialize_fn)
- (void *serialized_code_object, size_t serialized_code_object_size,
- const char *options, hsa_code_object_t *code_object);
+ hsa_status_t (*hsa_code_object_reader_create_from_memory_fn)
+ (const void *code_object, size_t size, hsa_code_object_reader_t *reader);
+ hsa_status_t (*hsa_code_object_reader_destroy_fn)
+ (hsa_code_object_reader_t reader);
hsa_status_t (*hsa_amd_memory_lock_fn)
(void *host_ptr, size_t size, hsa_agent_t *agents, int num_agent,
void **agent_ptr);
@@ -385,6 +387,7 @@ struct gcn_image_desc
typedef enum {
EF_AMDGPU_MACH_UNSUPPORTED = -1,
+ EF_AMDGPU_MACH_AMDGCN_NONE = -1, /* Used for generic arch. */
#define GCN_DEVICE(name, NAME, ELF, ...) \
EF_AMDGPU_MACH_AMDGCN_ ## NAME = ELF,
#include "../../gcc/config/gcn/gcn-devices.def"
@@ -1397,7 +1400,7 @@ init_hsa_runtime_functions (void)
DLSYM_FN (hsa_executable_destroy)
DLSYM_FN (hsa_executable_create)
DLSYM_FN (hsa_executable_global_variable_define)
- DLSYM_FN (hsa_executable_load_code_object)
+ DLSYM_FN (hsa_executable_load_agent_code_object)
DLSYM_FN (hsa_executable_freeze)
DLSYM_FN (hsa_signal_create)
DLSYM_FN (hsa_memory_allocate)
@@ -1415,7 +1418,8 @@ init_hsa_runtime_functions (void)
DLSYM_FN (hsa_signal_store_release)
DLSYM_FN (hsa_signal_load_acquire)
DLSYM_FN (hsa_queue_destroy)
- DLSYM_FN (hsa_code_object_deserialize)
+ DLSYM_FN (hsa_code_object_reader_create_from_memory)
+ DLSYM_FN (hsa_code_object_reader_destroy)
DLSYM_OPT_FN (hsa_amd_memory_lock)
DLSYM_OPT_FN (hsa_amd_memory_unlock)
DLSYM_OPT_FN (hsa_amd_memory_async_copy_rect)
@@ -1668,6 +1672,17 @@ elf_gcn_isa_field (Elf64_Ehdr *image)
return image->e_flags & EF_AMDGPU_MACH_MASK;
}
+#define EF_AMDGPU_GENERIC_VERSION_V 0xff000000 /* Mask. */
+#define EF_AMDGPU_GENERIC_VERSION_OFFSET 24
+
+#define GET_GENERIC_VERSION(VAR) ((VAR & EF_AMDGPU_GENERIC_VERSION_V) \
+ >> EF_AMDGPU_GENERIC_VERSION_OFFSET)
+static int
+elf_gcn_isa_is_generic (Elf64_Ehdr *image)
+{
+ return GET_GENERIC_VERSION (image->e_flags);
+}
+
/* Returns the name that the HSA runtime uses for the ISA or NULL if we do not
support the ISA. */
@@ -1694,6 +1709,23 @@ isa_code(const char *isa) {
return EF_AMDGPU_MACH_UNSUPPORTED;
}
+/* For any ISA code, return the associated generic code, if any. Code compiled
+ with the GENERIC ISA code is compatible with ISA. */
+
+static gcn_isa
+get_generic_isa_code (const gcn_isa isa)
+{
+ switch (isa)
+ {
+#define GCN_DEVICE(name, NAME, ELF, ISA, XNACK, SRAM, WAVE64, CU, \
+ MAX_ISA_VGPRS, GEN_VER, ARCH_FAM, GEN_MACH, ...) \
+ case EF_AMDGPU_MACH_AMDGCN_ ## NAME: \
+ return EF_AMDGPU_MACH_AMDGCN_ ## GEN_MACH;
+#include "../../gcc/config/gcn/gcn-devices.def"
+ default: return EF_AMDGPU_MACH_UNSUPPORTED;
+ }
+}
+
/* CDNA2 devices have twice as many VGPRs compared to older devices. */
static int
@@ -2399,38 +2431,65 @@ init_basic_kernel_info (struct kernel_info *kernel,
return true;
}
-/* Check that the GCN ISA of the given image matches the ISA of the agent. */
+/* If STATUS is HSA_STATUS_SUCCESS, return true if the code object is generic
+ or if its ISA is the same as the ISA of the agent.
+ In all other cases, including a load that failed with STATUS, emit a
+ diagnostic that is as useful as possible and return false. */
static bool
-isa_matches_agent (struct agent_info *agent, Elf64_Ehdr *image)
+isa_matches_agent (struct agent_info *agent, Elf64_Ehdr *image,
+ hsa_status_t status)
{
- int isa_field = elf_gcn_isa_field (image);
- const char* isa_s = isa_name (isa_field);
- if (!isa_s)
- {
- hsa_error ("Unsupported ISA in GCN code object.", HSA_STATUS_ERROR);
- return false;
- }
-
- if (isa_field != agent->device_isa)
- {
- char msg[204];
- const char *agent_isa_s = isa_name (agent->device_isa);
- assert (agent_isa_s);
+ /* Generic image - assume that it works; this function is only called
+ again if loading fails, i.e. with status != HSA_STATUS_SUCCESS. */
+ if (status == HSA_STATUS_SUCCESS && elf_gcn_isa_is_generic (image))
+ return true;
- snprintf (msg, sizeof msg,
- "GCN code object ISA '%s' does not match GPU ISA '%s' "
- "(device %d).\n"
- "Try to recompile with '-foffload-options=-march=%s',\n"
- "or use ROCR_VISIBLE_DEVICES to disable incompatible "
- "devices.\n",
- isa_s, agent_isa_s, agent->device_id, agent_isa_s);
-
- hsa_error (msg, HSA_STATUS_ERROR);
- return false;
- }
+ int isa_field = elf_gcn_isa_field (image);
+ if (status == HSA_STATUS_SUCCESS && isa_field == agent->device_isa)
+ return true;
- return true;
+ /* Either nongeneric with a mismatch of the ISA - or generic but
+ not handled by the installed ROCm (e.g. because it is too old). */
+
+ char msg[265];
+ const char *agent_isa_s = isa_name (agent->device_isa);
+ gcn_isa generic_isa = get_generic_isa_code (agent->device_isa);
+ if (agent_isa_s == NULL)
+ snprintf (msg, sizeof msg,
+ "Unsupported ISA %x (%s) for GPU ISA %x (device %d).\n"
+ "Consider using ROCR_VISIBLE_DEVICES to disable incompatible "
+ "devices.", isa_field, isa_name (isa_field), agent->device_isa,
+ agent->device_id);
+ else if (generic_isa != EF_AMDGPU_MACH_AMDGCN_NONE && generic_isa == isa_field)
+ snprintf (msg, sizeof msg,
+ "Generic ISA %s in GCN code object not supported for "
+ "GPU ISA '%s' (device %d).\n"
+ "Consider using ROCR_VISIBLE_DEVICES to disable incompatible "
+ "devices or update ROCm.", isa_name (generic_isa),
+ agent_isa_s, agent->device_id);
+ else if (generic_isa != EF_AMDGPU_MACH_UNSUPPORTED)
+ snprintf (msg, sizeof msg,
+ "GCN code object ISA '%s' does not match GPU ISA '%s' "
+ "(device %d).\n"
+ "Try to recompile with '-foffload-options=-march=%s' or "
+ "'-foffload-options=-march=%s',\n"
+ "or use ROCR_VISIBLE_DEVICES to disable incompatible "
+ "devices.\n",
+ isa_name (isa_field), agent_isa_s, agent->device_id,
+ agent_isa_s, isa_name (generic_isa));
+ else
+ snprintf (msg, sizeof msg,
+ "GCN code object ISA '%s' does not match GPU ISA '%s' "
+ "(device %d).\n"
+ "Try to recompile with '-foffload-options=-march=%s',\n"
+ "or use ROCR_VISIBLE_DEVICES to disable incompatible "
+ "devices.\n",
+ isa_name (isa_field), agent_isa_s, agent->device_id,
+ agent_isa_s);
+ hsa_error (msg, status != HSA_STATUS_SUCCESS ? status
+ : HSA_STATUS_ERROR);
+ return false;
}
/* Create and finalize the program consisting of all loaded modules. */
@@ -2460,29 +2519,38 @@ create_and_finalize_hsa_program (struct agent_info *agent)
/* Load any GCN modules. */
struct module_info *module = agent->module;
+ hsa_code_object_reader_t reader = { 0 };
if (module)
{
Elf64_Ehdr *image = (Elf64_Ehdr *)module->image_desc->gcn_image->image;
- if (!isa_matches_agent (agent, image))
+ if (!isa_matches_agent (agent, image, HSA_STATUS_SUCCESS))
goto fail;
- hsa_code_object_t co = { 0 };
- status = hsa_fns.hsa_code_object_deserialize_fn
+ status = hsa_fns.hsa_code_object_reader_create_from_memory_fn
(module->image_desc->gcn_image->image,
- module->image_desc->gcn_image->size,
- NULL, &co);
+ module->image_desc->gcn_image->size, &reader);
if (status != HSA_STATUS_SUCCESS)
{
- hsa_error ("Could not deserialize GCN code object", status);
+ hsa_error ("Could not create GCN code object reader", status);
goto fail;
}
-
- status = hsa_fns.hsa_executable_load_code_object_fn
- (agent->executable, agent->id, co, "");
+ status = hsa_fns.hsa_executable_load_agent_code_object_fn
+ (agent->executable, agent->id, reader, "", NULL);
if (status != HSA_STATUS_SUCCESS)
{
- hsa_error ("Could not load GCN code object", status);
+ if (status == HSA_STATUS_ERROR_INVALID_CODE_OBJECT
+ || status == HSA_STATUS_ERROR_INVALID_ISA_NAME)
+ isa_matches_agent (agent, image, status);
+ else
+ {
+ GOMP_PLUGIN_error ("GCN code object ISA '%s' running on GPU ISA "
+ "'%s' (device %d)",
+ isa_name (elf_gcn_isa_field (image)),
+ isa_name (agent->device_isa),
+ agent->device_id);
+ hsa_error ("Could not load GCN code object", status);
+ }
goto fail;
}
@@ -2522,6 +2590,16 @@ create_and_finalize_hsa_program (struct agent_info *agent)
goto fail;
}
+ if (module)
+ {
+ status = hsa_fns.hsa_code_object_reader_destroy_fn (reader);
+ if (status != HSA_STATUS_SUCCESS)
+ {
+ hsa_error ("Could not destroy the GCN code object reader", status);
+ goto fail;
+ }
+ }
+
final:
agent->prog_finalized = true;