On 09/06/2023 10:02, Richard Sandiford wrote:
Andrew Stubbs <a...@codesourcery.com> writes:
On 07/06/2023 20:42, Richard Sandiford wrote:
I don't know if this helps (probably not), but we have a similar
situation on AArch64: a 64-bit mode like V8QI can be doubled to a
128-bit vector or to a pair of 64-bit vectors.  We used V16QI for
the former and "V2x8QI" for the latter.  V2x8QI is forced to come
after V16QI in the mode list, and so it is only ever used through
explicit choice.  But both modes are functionally vectors of 16 QIs.

OK, that's interesting, but how do you map "complex int" vectors to that
mode? I tried to figure it out, but there's no DIVMOD support so I
couldn't just do a straight comparison.

Yeah, we don't do that currently.  Instead we make TARGET_ARRAY_MODE
return V2x8QI for an array of 2 V8QIs (which is OK, since V2x8QI has
64-bit rather than 128-bit alignment).  So we should use it for a
complex-y type like:

   struct { res_type res[2]; };

In principle we should be able to do the same for:

   struct { res_type a, b; };

but that isn't supported yet.  I think it would need a new target hook
along the lines of TARGET_ARRAY_MODE, but for structs rather than arrays.

The advantage of this from AArch64's PoV is that it extends to 3x and 4x
tuples as well, whereas complex is obviously for pairs only.

I don't know if it would be acceptable to use that kind of struct wrapper
for the divmod code though (for the vector case only).
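
For illustration only, the two wrapper forms discussed above can be sketched in C. This is a minimal sketch, assuming the GCC/Clang vector_size extension as a stand-in for V8QI; the names res_type, pair_array and pair_members are hypothetical and not taken from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a 64-bit vector of 8 QIs (V8QI), built with
   the GCC/Clang vector_size extension.  */
typedef signed char res_type __attribute__ ((vector_size (8)));

/* Array form: the case described above as handled via TARGET_ARRAY_MODE.  */
struct pair_array { res_type res[2]; };

/* Two-member form: described above as not supported yet.  */
struct pair_members { res_type a, b; };

/* Both wrappers hold two 8-byte vectors, i.e. 16 QIs in total.  */
size_t pair_array_size (void) { return sizeof (struct pair_array); }
size_t pair_members_size (void) { return sizeof (struct pair_members); }
```

Both structs carry the same data; the difference discussed above is only in which machine mode the compiler can assign to each.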

Looking again, I don't think this will help: GCN has no instruction that loads back-to-back vectors, so there's little benefit in adding the tuple mode.

However, GCN does have instructions that effectively load 2, 3, or 4 vectors that are *interleaved*, which would be the likely case for complex numbers (or pixel colour data!).
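
To make the two layouts concrete, here is a rough scalar sketch of the difference in memory order. It is an illustration only, not GCN code: LANES, load_back_to_back and load_interleaved are hypothetical names, and the lane count is arbitrary.

```c
#include <stddef.h>

enum { LANES = 4 };  /* illustrative lane count, not a GCN value */

/* Back-to-back: vector k occupies LANES consecutive elements.  */
void load_back_to_back (const int *mem, size_t nvec, int out[][LANES])
{
  for (size_t k = 0; k < nvec; k++)
    for (size_t i = 0; i < LANES; i++)
      out[k][i] = mem[k * LANES + i];
}

/* Interleaved: lane i of all nvec vectors is stored together, as with
   {re, im} pairs or per-pixel colour channels.  */
void load_interleaved (const int *mem, size_t nvec, int out[][LANES])
{
  for (size_t k = 0; k < nvec; k++)
    for (size_t i = 0; i < LANES; i++)
      out[k][i] = mem[i * nvec + k];
}
```

With nvec = 2, the interleaved form reads an array of {re, im} pairs into one vector of real parts and one vector of imaginary parts.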

I need to figure out how to move forward with this patch. If the new complex modes are not acceptable then I think I need to reimplement DIVMOD (maybe the scalars can remain as-is), but it's not clear to me what that would look like.

Andrew
