Hi Andrew,
Andrew Stubbs wrote:
I think the correct place for this whole concept might be in the
MULTILIB_MATCHES configuration option, not in mkoffload.
In any case, mkoffload needs to know about this; if only the driver
('gcc') knows about it, it comes too late for the early debug file
writing. — And if only the compiler itself (lto1, cc1, f951) knows about
it, it comes too late for mkoffload, 'as' (llvm-mc) and the
collect/(l)ld run.
The mkoffload.cc ELF-writing issue is actually the reason we already
check whether the default version (gfx900) has been overridden at
compile time - and, hence, already include the required config/multilib
include files.
(Maybe not a compelling reason, but when invoking
amdgcn-amdhsa-{gcc,gfortran}, there is no need to disallow any version
as no library is linked.)
What's the motivation for adding the warning?
I don't like silently changing the specified -march=. But I also want to
make it easy for a user to find the option when compiling with, e.g.,
-march=gfx1100 while only -march=gfx11-generic is available for
lib{c,m,gomp,gfortran}.
Thus, I thought a warning would be useful: by default it does the right
thing, but it tells the user why the compiler did something different.
I don't think any of the restrictions are so interesting for library code.
In theory there are some restricted instructions that might be used in
libm, perhaps, at some future time, but that's all. The register count
restrictions are not interesting at all, since that restricts
occupancy, not usage (which is already limited by the ABI).
For now, there is also the issue that only ROCm > 6.3.2 supports it, i.e.
recompiling with a new distro version would fail with an odd error while
it worked in the version before. Using a warning prevents all this.
And otherwise, it is not only library code – it is also hot offloading
code. Mixing a generic-code library with a specific-code runtime is not
permitted (rejected by lld). And in some cases, I could imagine that
some operations could matter.
But admittedly, the restrictions aren't that hard. For gfx115x, the
restrictions that scalar ALU floating-point instructions and SGPRs for
src1 in data parallel processing (dpp) instructions are not supported
could matter in theory, but I don't think that we would exploit this, and
there are other things to optimize for first.
For AI-style applications, the FP8/BF8/XF32 restrictions could matter
with gfx9-4-generic, but we don't support gfx94x yet and, again, we
should start with other types of optimizations first.
* * *
This business of changing the -march flag from what the user specified
is also questionable.
I concur, but it is the simplest way to let a user link the code, point
them to the existence of the new -march= flag, and avoid gotchas, while
also making clear why the flag was changed.
That's based on Richard's comment ...
For distributors it might be good to just ship -generic multilibs and
have all specific -march=gfxXYZ to map to their respective -generic
variant. That is, consider the configured multilibs when interpreting
-march=gfxXYZ which probably means always configuring the -generic
multilibs (and back to dependence on llvm19 and recent ROCm for the
runtime ...).
That said, I'm happy about -generic, and I hope it ends up in GCC 15
in some way.
... and trying to come up with something that solves this issue but
avoids surprises.
I use locally
--with-multilib-list=gfx906,gfx908,gfx90a,gfx90c,gfx1030,gfx1036,gfx1100,gfx1103,gfx9-generic,gfx11-generic
(i.e. no gfx900 and no gfx10-3-generic + none of the newly added ones.)
And when playing around for testing all the patches, it works rather
smoothly.
Tobias