Re: [Patch] [gcn] mkoffload.cc: Automatically use gfx*-generic if -march has no multilib but generic has

Tobias Burnus Fri, 07 Feb 2025 04:53:43 -0800

Hi Andrew,

Andrew Stubbs wrote:

I think the correct place for this whole concept might be in the MULTILIB_MATCHES configuration option, not in mkoffload.

In any case, mkoffload needs to know about this; if only the driver ('gcc') knows about it, it comes too late for the early debug file writing. And if only the compiler itself (lto1, cc1, f951) knows about it, it comes too late for mkoffload, 'as' (llvm-mc) and the collect/(l)ld run.

The mkoffload.cc ELF-writing issue is actually the reason we already check whether the default version (gfx900) has been overridden at compile time and, hence, already include the required config/multilib include files.

(Maybe not a compelling reason, but when invoking amdgcn-amdhsa-{gcc,gfortran}, there is no need to disallow any version as no library is linked.)

What's the motivation for adding the warning?

I don't like silently changing the specified -march=. But I also want to make it easy for a user to find the option when compiling with, e.g., -march=gfx1100 and only -march=gfx11-generic is available as lib{c,m,gomp,gfortran}.

Thus, I thought having a warning would be useful: by default it does the right thing, but it informs the user why the compiler did something different.
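As a rough illustration of the behaviour described above (this is not the actual mkoffload.cc code; the names generic_for and resolve_march, the warning text, and the deliberately small mapping table are all made up for this sketch), the fallback could look roughly like this:

```cpp
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical helper: maps a specific arch to its generic counterpart.
// Only a small illustrative subset is listed; an empty string means
// "no generic variant known to this sketch".
static std::string generic_for(const std::string &arch)
{
  if (arch == "gfx900" || arch == "gfx906")
    return "gfx9-generic";
  if (arch == "gfx1030")
    return "gfx10-3-generic";
  if (arch == "gfx1100")
    return "gfx11-generic";
  return "";
}

// Hypothetical helper: picks the -march value to use for offloading,
// given the requested arch and the list of configured multilibs.
// If WARNED is non-null, it is set to true when a fallback happened.
static std::string resolve_march(const std::string &requested,
                                 const std::vector<std::string> &multilibs,
                                 bool *warned = nullptr)
{
  auto has = [&multilibs](const std::string &a) {
    return std::find(multilibs.begin(), multilibs.end(), a)
           != multilibs.end();
  };

  if (warned)
    *warned = false;

  // A multilib for the requested arch exists: keep the user's choice.
  if (has(requested))
    return requested;

  // Otherwise fall back to the generic variant, but say so.
  const std::string generic = generic_for(requested);
  if (!generic.empty() && has(generic))
    {
      std::cerr << "warning: no multilib for '-march=" << requested
                << "'; using '-march=" << generic << "' instead\n";
      if (warned)
        *warned = true;
      return generic;
    }

  // No suitable fallback: leave the value unchanged so that later
  // stages can diagnose the missing multilib.
  return requested;
}
```

With the multilibs gfx906, gfx9-generic and gfx11-generic configured, resolve_march("gfx1100", ...) would return "gfx11-generic" and print the warning, while resolve_march("gfx906", ...) would return "gfx906" silently.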

I don't think any of the restrictions are so interesting for library code.
In theory there are some restricted instructions that might be used in libm, perhaps, at some future time, but that's all. The register count restrictions are not interesting at all, since that restricts occupancy, not usage (which is already limited by the ABI).

For now, there is also the issue that only ROCm > 6.3.2 supports it, i.e. recompiling with a new distro version would fail with an odd error while it worked in the version before. Using a warning prevents all this.

And otherwise, it is not only library code; it is also hot offloading code. Mixing a generic-code library with a specific-code runtime is not permitted (rejected by lld). And in some cases, I could imagine that some operations could matter.

But admittedly, the restrictions aren't that hard. For gfx115x, the fact that the scalar ALU floating point instructions and SGPRs for src1 in data parallel processing (dpp) instructions are not supported could matter in theory, but I don't think that we would exploit this, and there are other things to optimize for first.

For AI-style applications, the FP8/BF8/XF32 restrictions could matter with gfx9-4-generic, but we don't support gfx94x yet and, again, we should start with other types of optimizations first.



* * *

This business of changing the -march flag from what the user specified is also questionable.

I concur, but it is the simplest way to permit a user to link the code, point them to the existence of the new -march= flag, and avoid gotchas, while also making clear why the flag was changed.


That's based on Richard's comment ...

For distributors it might be good to just ship -generic multilibs and
have all specific -march=gfxXYZ to map to their respective -generic
variant.  That is, consider the configured multilibs when interpreting
-march=gfxXYZ which probably means always configuring the -generic
multilibs (and back to dependence on llvm19 and recent ROCm for the
runtime ...).

That said, I'm happy about -generic, and I hope it ends up in GCC 15
in some way.

... and trying to come up with something that solves this issue but avoids surprises.

Locally, I use --with-multilib-list=gfx906,gfx908,gfx90a,gfx90c,gfx1030,gfx1036,gfx1100,gfx1103,gfx9-generic,gfx11-generic (i.e. no gfx900 and no gfx10-3-generic, and none of the newly added ones).

And when playing around to test all the patches, it works rather smoothly.


Tobias
