On 07/02/2025 12:53, Tobias Burnus wrote:
Hi Andrew,

Andrew Stubbs wrote:
I think the correct place for this whole concept might be in the MULTILIB_MATCHES configuration option, not in mkoffload.

In any case, mkoffload needs to know about this; if only the driver ('gcc') knows about it, it comes too late for the early debug file writing. And if only the compiler itself (lto1, cc1, f951) knows about it, it comes too late for mkoffload, 'as' (llvm-mc) and the collect2/(l)ld run.

That's not how MULTILIB_MATCHES works. The debug .o file would use the same arch as the user specified.

I just realized that I'm assuming that -march=gfx1100 object files will link with -march=gfx11-generic libraries, and produce gfx1100 binaries. Is this not the case?

The mkoffload.cc ELF-writing issue is actually the reason we already check whether the default version (gfx900) has been overridden at compile time and, hence, already include the required config/multilib include files.

(Maybe not a compelling reason, but when invoking amdgcn-amdhsa-{gcc,gfortran}, there is no need to disallow any version as no library is linked.)

What's the motivation for adding the warning?

I don't like silently changing the specified -march=. But I also want to make it easy for a user to find the option when compiling with, e.g., -march=gfx1100 and only -march=gfx11-generic is available as lib{c,m,gomp,gfortran}.

Thus, I thought having a warning would be useful: by default it does the right thing, but it informs the user why the compiler did something different.
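
The fallback described above can be sketched roughly as follows. This is only an illustration in Python, not the code from the patch: the function name resolve_march, the small mapping table, the warning text and the error raised when nothing matches are all assumptions made for the example.

```python
import warnings

# Illustrative subset of specific -> generic arch names; not the table
# used by GCC or mkoffload.
SPECIFIC_TO_GENERIC = {
    "gfx906": "gfx9-generic",
    "gfx90c": "gfx9-generic",
    "gfx1030": "gfx10-3-generic",
    "gfx1036": "gfx10-3-generic",
    "gfx1100": "gfx11-generic",
    "gfx1103": "gfx11-generic",
}


def resolve_march(requested, configured_multilibs):
    """Return the arch to build for, given the configured multilibs.

    Hypothetical helper: keep the requested arch if a multilib exists for
    it, otherwise fall back to the matching generic arch and warn.
    """
    if requested in configured_multilibs:
        return requested  # exact multilib available, nothing to change

    generic = SPECIFIC_TO_GENERIC.get(requested)
    if generic is not None and generic in configured_multilibs:
        # Do the right thing by default, but say why -march= changed.
        warnings.warn(
            f"no multilib for -march={requested}; "
            f"using -march={generic} instead",
            stacklevel=2,
        )
        return generic

    # Assumption for this sketch: no usable multilib is a hard error.
    raise ValueError(f"no multilib available for -march={requested}")
```

With only gfx11-generic configured, resolve_march("gfx1100", {"gfx11-generic"}) returns "gfx11-generic" and emits one warning; passing -march=gfx11-generic explicitly takes the first branch and stays silent.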

So, the recommended way to silence the warning would be to use the generic arch explicitly?

I don't think any of the restrictions are so interesting for library code. In theory there are some restricted instructions that might be used in libm, perhaps, at some future time, but that's all. The register count restrictions are not interesting at all, since that restricts occupancy, not usage (which is already limited by the ABI).

For now, there is also the issue that only ROCm > 6.3.2 supports it, i.e. recompiling with a new distro version would fail with an odd error while it worked in the version before. Using a warning prevents all this.

And otherwise, it is not only library code – it is also hot offloading code. Mixing a generic-code library with a specific-code runtime is not permitted (rejected by lld). And in some cases, I could imagine that some operations could matter.

Library code does not have metadata to conflict (that only comes from entry points - although I just realized that init/fini might break that assumption), so as long as the generic ISA is a strict subset of the specific GPU ISA, it ought to work. But if it doesn't then I guess we're out of luck.
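
The two positions in this exchange (the linker rejecting any mix of generic and specific code, versus the assumption that a generic library should be usable wherever its ISA is a subset of the target's) can be contrasted in a small sketch. The feature sets below are made-up placeholders, not real ISA tables, and both function names are hypothetical.

```python
# Placeholder feature sets for illustration only; they do not describe
# the real gfx11 instruction sets.
ISA_FEATURES = {
    "gfx11-generic": frozenset({"base"}),
    "gfx1100": frozenset({"base", "extra"}),
}


def strict_match(obj_arch, lib_arch):
    """Linking is allowed only when both sides name the same arch."""
    return obj_arch == lib_arch


def subset_compatible(obj_arch, lib_arch, features=ISA_FEATURES):
    """Linking is allowed when the library uses no feature the object's
    target lacks, i.e. the library ISA is a subset of the object ISA."""
    return features[lib_arch] <= features[obj_arch]
```

Under the strict rule, strict_match("gfx1100", "gfx11-generic") is False, which corresponds to the rejection reported above; under the subset rule the same pair is accepted, while the reverse direction (a generic object against a gfx1100 library) is not.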

But admittedly, the restrictions aren't that hard. For gfx115x, the fact that scalar ALU floating-point instructions, and SGPRs for src1 in data parallel processing (DPP) instructions, are not supported could matter in theory, but I don't think that we would exploit this, and there are other things to optimize for first.

For AI-style applications, the FP8/BF8/XF32 restrictions could matter with gfx9-4-generic, but we don't support gfx94x yet and, again, we should start with other types of optimizations first.


* * *


This business of changing the -march flag from what the user specified is also questionable.


I concur, but it is the simplest way to let a user link the code, point them to the existence of the new -march= flag, and avoid gotchas, while also making clear why the flag was changed.

That's based on Richard's comment ...

For distributors it might be good to just ship -generic multilibs and
have all specific -march=gfxXYZ to map to their respective -generic
variant.  That is, consider the configured multilibs when interpreting
-march=gfxXYZ which probably means always configuring the -generic
multilibs (and back to dependence on llvm19 and recent ROCm for the
runtime ...).

That said, I'm happy about -generic, and I hope it ends up in GCC 15
in some way.

... and trying to come up with something that solves this issue but avoids surprises.

I think Richard was assuming that MULTILIB_MATCHES linking would work.


I use locally --with-multilib-list=gfx906,gfx908,gfx90a,gfx90c,gfx1030,gfx1036,gfx1100,gfx1103,gfx9-generic,gfx11-generic (i.e. no gfx900 and no gfx10-3-generic, and none of the newly added ones).

And when playing around for testing all the patches, it works rather smoothly.

Except you have warnings...

Andrew
