On 20/01/2020 11:07, Jakub Jelinek wrote:
On Mon, Jan 20, 2020 at 11:00:58AM +0000, Andrew Stubbs wrote:
Indeed, fat binaries would be a good solution.

Presumably it's possible, but I'm not sure how we'd go about getting the
offload mechanism to launch the backend multiple times? Having got that far,
the libgomp and mkoffload changes to select the right variant would probably
be fairly straightforward.

I'd say easiest would be to do that in the gcn specific mkoffload.
But there needs to be a way for the user to specify that he wants only a
particular variant and not all of them (perhaps look for -march= in the
offload options?)?

Yeah, maybe "-foffload=-march=gfx900+", or "-foffload=-march=fiji,gfx900,gfx906"?
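To make the comma-separated form concrete, here is a minimal sketch of how a tool such as mkoffload might split such an option into variant names. Nothing here is existing GCC code: `parse_march_list`, `MAX_VARIANTS` and `MAX_NAME_LEN` are hypothetical names chosen for illustration, and the "gfx900+" range form is deliberately not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_VARIANTS 8   /* Hypothetical limit on listed variants.  */
#define MAX_NAME_LEN 16  /* Hypothetical limit on one name, incl. NUL.  */

/* Hypothetical helper: split an offload option of the form
   "-march=fiji,gfx900,gfx906" into individual variant names.
   Returns the number of names stored, or -1 if the option does not
   start with "-march=", contains an empty or over-long name, or lists
   more than MAX_NAMES variants.  */
int
parse_march_list (const char *arg, char names[][MAX_NAME_LEN], int max_names)
{
  static const char prefix[] = "-march=";
  assert (arg != NULL && names != NULL);

  if (strncmp (arg, prefix, sizeof prefix - 1) != 0)
    return -1;
  const char *p = arg + sizeof prefix - 1;

  int count = 0;
  for (;;)
    {
      /* Length of the next name, up to a comma or the end.  */
      size_t len = strcspn (p, ",");
      if (len == 0 || len >= MAX_NAME_LEN || count >= max_names)
        return -1;
      memcpy (names[count], p, len);
      names[count][len] = '\0';
      count++;
      p += len;
      if (*p == '\0')
        break;
      p++;  /* Skip the comma.  */
    }
  return count;
}
```

With this sketch, "-march=fiji,gfx900,gfx906" yields three names, and each one could then drive a separate backend invocation.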

Or, for the 8 vs. 16 regs, have -march=generic or whatever that would try to
generate something that will work everywhere or on as many chips as
possible, e.g. by using mostly fiji, but try to use 16 adjacent regs instead
of 8?  I admit I don't know anything about the hw, just worried because if
we have already 4 variants now when the port is almost new, won't we have 30
later on, which could be prohibitive for the fat binaries?

That might work. It'd be far from optimal, but hopefully still faster than CPU.
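For illustration only, the run-time side of the fat-binary idea, combined with the -march=generic fallback suggested above, could look roughly like the following. This is a sketch under assumptions, not libgomp code: `struct offload_variant` and `select_variant` are hypothetical, and it assumes the device can report its architecture as a string matching the -march names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor for one device image inside a fat binary.  */
struct offload_variant
{
  const char *march;  /* E.g. "fiji", "gfx900", or "generic".  */
  const void *image;  /* Start of the code object for that variant.  */
};

/* Pick the image built for DEVICE_MARCH.  If there is no exact match,
   fall back to a "generic" image when one is present; otherwise return
   NULL so that the caller can fall back to host execution.  */
const struct offload_variant *
select_variant (const struct offload_variant *variants, size_t n,
                const char *device_march)
{
  assert (variants != NULL || n == 0);
  assert (device_march != NULL);

  const struct offload_variant *generic = NULL;
  for (size_t i = 0; i < n; i++)
    {
      if (strcmp (variants[i].march, device_march) == 0)
        return &variants[i];
      if (generic == NULL && strcmp (variants[i].march, "generic") == 0)
        generic = &variants[i];
    }
  return generic;
}
```

An exact match always wins over the generic image, so a tuned variant is used whenever one was built for the device.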

AMD don't have any real interest in maintaining compatibility though, so this may get increasingly difficult. For example, between Fiji and Vega (gfx8xx to gfx9xx), they removed the v_movrel instructions, and removed a number of bit-fields from the memory descriptors. As it happens, GCC does not (currently) use any of those features, so compatibility was unaffected.

For another example, AMD changed the name of the v_add instructions to v_add_co, and added a new set of instructions named v_add (that don't have carry-out). The machine encodings for the old instructions remain the same, so again, binary compatibility was not affected, but it serves to demonstrate that they don't expect software to be written for a generic device.

Also, APUs will probably never be binary compatible with DGPUs (not that libgomp supports APUs properly, at present, as we have none to test).

Andrew
