On 15/03/2024 13:56, Tobias Burnus wrote:
Hi Andrew,

Andrew Stubbs wrote:
This is more-or-less what I was planning to do myself, but as I want to include all the other features that get parametrized in gcn.cc, gcn.h, gcn-hsa.h, gcn-opts.h, I hadn't got around to it yet. Unfortunately, I think the gcn.opt and config.gcc will always need manually updating, but if that's all it'll be an improvement.

Well, for .opt see how nvptx does it – it actually generates an .opt file.

I don't like the idea of including AMDGPU_ISA_UNSUPPORTED;

I concur – I was initially thinking of reporting the device name ("Unsupported %s") but I then realized that the agent returns a string while only for GCC generated files (→ eflag) the hexcode is used. Thus, I ended up not using it.

Ultimately, I want to remove many of the conditionals like "TARGET_CDNA2_PLUS" from the code and replace them with feature flags derived from a def file, or at least a header file. We've acquired too many places where there are unsearchable conditionals that need finding and fixing every time a new device comes along.
I was thinking of having more flags, but those were the only ones required for the two files.
I had imagined that this .def file would exist in gcc/config/gcn, but you've placed it in libgomp... Maybe it makes sense to have multiple such files if they contain very different data, but I had imagined one file and I'm not sure that the compiler definitions should live in libgomp.

There is already:

gcc/config/darwin-c.cc:#include "../../libcpp/internal.h"

gcc/config/gcn/gcn-run.cc:#include "../../../libgomp/config/gcn/libgomp-gcn.h"

gcc/fortran/cpp.cc:#include "../../libcpp/internal.h"

gcc/fortran/trigd_fe.inc:#include "../../libgfortran/intrinsics/trigd.inc"

But there is also the reverse:

libcpp/lex.cc:#include "../gcc/config/i386/cpuid.h"

libgfortran/libgfortran.h:#include "../gcc/fortran/libgfortran.h"

lto-plugin/lto-plugin.c:#include "../gcc/lto/common.h"

If you add more items, it is probably better to have it under gcc/config/gcn/ - and I really prefer a single file for all.

* * *

Talking about feature sets: This would be a bit like LLVM (see below), but I think they have a bit too much indirection. But I do concur that we need to consolidate the current support – and hopefully make it easier to keep adding more GPU support; we seem to have already covered a larger chunk :-)

I also did wonder whether we should support, e.g., running a gfx1100 code (or a gfx11-generic one) on, e.g., a gfx1103 device. Alternatively, we could keep the current check which requires an exact match.

We didn't invent that restriction; the runtime won't let you do it. We only have the check because the HSA/ROCr error message is not very user-friendly.
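As an illustration of the kind of exact-match check discussed here, the following is a minimal sketch, not the actual libgomp plugin code. The function name `isa_matches` and the message wording are hypothetical, and it assumes both ISAs are available as name strings such as "gfx1100".

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: compare the ISA name the code was built for with
   the name the device reports.  On a mismatch, write a readable
   diagnostic into MSG.  Returns 1 on an exact match, 0 otherwise.  */
static int
isa_matches (const char *built_for, const char *device_isa,
             char *msg, size_t msg_size)
{
  if (strcmp (built_for, device_isa) == 0)
    return 1;

  snprintf (msg, msg_size,
            "code was built for %s but the device is %s",
            built_for, device_isa);
  return 0;
}
```

The point of doing this before handing the code object to the runtime is only to produce a clearer message; the runtime would reject the mismatch anyway.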

BTW: Looking at the feature sets in LLVM, I note that all GFX110x GPUs seem to have common silicon bugs: FeatureMSAALoadDstSelBug and FeatureMADIntraFwdBug, while 1100 and 1102 additionally have the FeatureUserSGPRInit16Bug but 1101 and 1103 don't. For some reason, FeatureISAVersion11_Generic only consists of two of those bugs (it doesn't have FeatureMADIntraFwdBug), which doesn't seem that consistent. Maybe the workaround has issues elsewhere? If so, a generic -march=gfx11 might not be as useful as one might hope.

* * *

If I look at LLVM's https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/AMDGPU.td ,

they first define several features – like 'FeatureUnalignedScratchAccess'.

Then they combine them like in:

def FeatureISAVersion11_Common ... [FeatureGFX11, ... FeatureAtomicFaddRtnInsts ...

And then they use those to map them to feature sets like:

def FeatureISAVersion11_0_Common ... listconcat(FeatureISAVersion11_Common.Features,
     [FeatureMSAALoadDstSelBug ...

And for gfx1103:

def FeatureISAVersion11_0_3 : FeatureSet<
   !listconcat(FeatureISAVersion11_0_Common.Features,
     [])>;

The mapping to gfx... names then happens in https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/GCNProcessors.td such as:

def : ProcessorModel<"gfx1103", GFX11SpeedModel,
   FeatureISAVersion11_0_3.Features
 >;

Or for the generic one, i.e.:

// [gfx1100, gfx1101, gfx1102, gfx1103, gfx1150, gfx1151]
def : ProcessorModel<"gfx11-generic", GFX11SpeedModel,
   FeatureISAVersion11_Generic.Features
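
The layering above (individual features, a common set, then per-device sets) can be mimicked in plain C with bit flags. This is only a sketch: the bit positions and macro names are assumptions made up for illustration, and only the three silicon-bug features named earlier in this message are modelled, not the full LLVM feature lists.

```c
/* Bit positions are illustrative assumptions; only the feature names
   are taken from LLVM's AMDGPU.td as quoted in this thread.  */
enum
{
  FEATURE_MSAA_LOAD_DST_SEL_BUG = 1u << 0,
  FEATURE_MAD_INTRA_FWD_BUG     = 1u << 1,
  FEATURE_USER_SGPR_INIT16_BUG  = 1u << 2
};

/* Set shared by all GFX110x devices (hypothetical macro name).  */
#define ISA_11_0_COMMON \
  (FEATURE_MSAA_LOAD_DST_SEL_BUG | FEATURE_MAD_INTRA_FWD_BUG)

/* Per-device sets: the common set plus device-specific additions,
   playing the role of !listconcat in the TableGen definitions.  */
#define ISA_11_0_0 (ISA_11_0_COMMON | FEATURE_USER_SGPR_INIT16_BUG)
#define ISA_11_0_3 (ISA_11_0_COMMON)

/* Return nonzero if FEATURE is present in SET.  */
static int
has_feature (unsigned set, unsigned feature)
{
  return (set & feature) != 0;
}
```

With this layout, a test such as has_feature (ISA_11_0_0, FEATURE_USER_SGPR_INIT16_BUG) replaces a device-name conditional, at the cost of one level of indirection instead of LLVM's several.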

LLVM also has some generic flags like the following in https://github.com/llvm/llvm-project/blob/main/llvm/lib/TargetParser/TargetParser.cpp

    {{"gfx1013"},   {"gfx1013"}, GK_GFX1013, FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32|FEATURE_WAVE32|FEATURE_XNACK|FEATURE_WGP},

I hope that this will give some inspiration – but I assume that at least the initial implementation will be much shorter.

Yeah, we can have one macro for each arch, or multiple macros for building different tables. First one seems easier but less readable, second one will need some thinking about. Probably best to keep it simple though.
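
To make the "one macro for each arch" option concrete, here is a rough X-macro sketch. Everything in it is an assumption for illustration: the list macro stands in for a .def file that would normally be pulled in with #include, and the column layout (enum suffix, -march name, RDNA flag) and the names `GCN_DEVICE_LIST`, `gcn_devices` and `gcn_lookup_device` are hypothetical, not what the patch uses.

```c
#include <string.h>

/* Hypothetical stand-in for a device .def file: one X() entry per
   device.  Columns: enum suffix, -march name, 1 if RDNA else 0.  */
#define GCN_DEVICE_LIST(X)        \
  X (GFX908,  "gfx908",  0)       \
  X (GFX90A,  "gfx90a",  0)       \
  X (GFX1100, "gfx1100", 1)

/* Expansion 1: an enum of device identifiers.  */
#define X(ID, NAME, RDNA) DEV_##ID,
enum gcn_device { GCN_DEVICE_LIST (X) DEV_COUNT };
#undef X

/* Expansion 2: a table of names and flags, indexed by the enum.  */
struct gcn_device_info { const char *name; int rdna; };
#define X(ID, NAME, RDNA) { NAME, RDNA },
static const struct gcn_device_info gcn_devices[] = { GCN_DEVICE_LIST (X) };
#undef X

/* Return the device index for NAME, or -1 if it is not in the table.  */
static int
gcn_lookup_device (const char *name)
{
  for (int i = 0; i < DEV_COUNT; i++)
    if (strcmp (gcn_devices[i].name, name) == 0)
      return i;
  return -1;
}
```

Adding a device then means adding one X() line; every table built from the list picks it up. The trade-off is the one noted above: each entry carries all columns, so wide entries get hard to read, whereas separate macros per table would be tidier but need more design work.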

Andrew
