Re: [Patch] [gcn] mkoffload.cc: Automatically use gfx*-generic if -march has no multilib but generic has

Andrew Stubbs Fri, 07 Feb 2025 05:09:25 -0800

On 07/02/2025 12:53, Tobias Burnus wrote:

Hi Andrew,
Andrew Stubbs wrote:
I think the correct place for this whole concept might be in the MULTILIB_MATCHES configuration option, not in mkoffload.
In any case, mkoffload needs to know about this; if only the driver ('gcc') knows about it, it comes too late for the early debug file writing. And if only the compiler itself (lto1, cc1, f951) knows about it, it comes too late for mkoffload, 'as' (llvm-mc) and the collect/(l)ld run.

That's not how MULTILIB_MATCHES works. The debug .o file would use the same arch as the user specified.

I just realized that I'm assuming that -march=gfx1100 object files will link with -march=gfx11-generic libraries, and produce gfx1100 binaries. Is this not the case?

The mkoffload.cc ELF-writing issue is actually the reason we already check whether the default version (gfx900) has been overridden at compile time - and, hence, already include the required config/multilib include files.
(Maybe not a compelling reason, but when invoking amdgcn-amdhsa-{gcc,gfortran}, there is no need to disallow any version as no library is linked.)
What's the motivation for adding the warning?
I don't like silently changing the specified -march=. But I also want to make it easy for a user to find the option when compiling with, e.g., -march=gfx1100 when only -march=gfx11-generic is available as lib{c,m,gomp,gfortran}.
Thus, I thought having a warning would be useful: it does the right thing by default but informs the user why the compiler did something different.

So, the recommended way to silence the warning would be to use the generic arch explicitly?
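
To make the behaviour under discussion concrete, here is a minimal sketch of the fallback: use the exact multilib if it is installed, otherwise fall back to the matching generic multilib and warn. It is not the code from the patch. The function name `select_multilib`, the warning text, and the `GENERIC_FOR` table (an illustrative subset of the specific-to-generic mappings) are assumptions made for this example.

```python
import warnings

# Illustrative subset of specific-to-generic mappings (assumed for this
# sketch; not the full table and not taken from the patch).
GENERIC_FOR = {
    "gfx906": "gfx9-generic",
    "gfx90c": "gfx9-generic",
    "gfx1030": "gfx10-3-generic",
    "gfx1036": "gfx10-3-generic",
    "gfx1100": "gfx11-generic",
    "gfx1103": "gfx11-generic",
}


def select_multilib(march, multilibs):
    """Return the arch whose multilib should be used for `march`.

    `multilibs` is the set of arch names that have installed libraries.
    """
    # An exact match (specific or generic) is used as is, without a warning.
    if march in multilibs:
        return march
    # Otherwise try the generic variant, and tell the user about the change.
    generic = GENERIC_FOR.get(march)
    if generic in multilibs:
        # Hypothetical wording; the real diagnostic may differ.
        warnings.warn(
            f"no multilib for -march={march}; using -march={generic} instead"
        )
        return generic
    raise ValueError(f"no multilib available for -march={march}")
```

With `{"gfx908", "gfx9-generic", "gfx11-generic"}` installed, `select_multilib("gfx1100", ...)` returns `"gfx11-generic"` and emits the warning, whereas passing `"gfx11-generic"` explicitly returns it silently, which is the way of silencing the warning asked about above.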

I don't think any of the restrictions are so interesting for library code. In theory there are some restricted instructions that might be used in libm, perhaps, at some future time, but that's all. The register count restrictions are not interesting at all, since that restricts occupancy, not usage (which is already limited by the ABI).
For now, there is also the issue that only ROCm > 6.3.2 supports it, i.e. recompiling with a new distro version would fail with an odd error while it worked in the version before. Using a warning prevents all this.
And otherwise, it is not only library code – it is also hot offloading code. Mixing a generic-code library with a specific-code runtime is not permitted (rejected by lld). And in some cases, I could imagine that some operations could matter.

Library code does not have metadata to conflict (that only comes from entry points - although I just realized that init/fini might break that assumption), so as long as the generic ISA is a strict subset of the specific GPU ISA, it ought to work. But if it doesn't then I guess we're out of luck.

But admittedly, the restrictions aren't that hard. For gfx115x, the missing scalar ALU floating-point instructions, and the fact that SGPRs are not supported for src1 in data parallel processing (DPP) instructions, could matter in theory, but I don't think that we would exploit this – and there are other things to optimize for first.
For AI-style applications, the FP8/BF8/XF32 restrictions could matter with gfx9-4-generic, but we don't support gfx94x yet and, again, we should start with other types of optimizations first.
* * *
This business of changing the -march flag from what the user specified is also questionable.
I concur – but it is the simplest way to permit a user to link the code, point him to the existence of the new -march= flag and avoid gotchas, while also making clear why the flag was changed.
That's based on Richard's comment ...
For distributors it might be good to just ship -generic multilibs and
have all specific -march=gfxXYZ to map to their respective -generic
variant.  That is, consider the configured multilibs when interpreting
-march=gfxXYZ which probably means always configuring the -generic
multilibs (and back to dependence on llvm19 and recent ROCm for the
runtime ...).

That said, I'm happy about -generic, and I hope it ends up in GCC 15
in some way.
... and trying to come up with something that solves this issue but avoids surprises.


I think Richard was assuming that MULTILIB_MATCHES linking would work.

I use locally --with-multilib-list=gfx906,gfx908,gfx90a,gfx90c,gfx1030,gfx1036,gfx1100,gfx1103,gfx9-generic,gfx11-generic (i.e. no gfx900 and no gfx10-3-generic + none of the newly added ones.)
And when playing around for testing all the patches, it works rather smoothly.


Except you have warnings...

Andrew
