On 7/15/20 9:13 AM, Roger Sayle wrote:
> 
> This patch provides standard vec_extract and vec_set patterns to the
> nvptx backend, to extract an element from a PTX vector and set an
> element of a PTX vector respectively.  PTX vectors (I hesitate to
> call them SIMD vectors) may contain up to four elements, so vector
> modes up to size four are supported by this patch even though the
> nvptx backend currently only allows V2SI and V2DI, i.e. two out
> of the ten possible vector modes.
> 
> As an example of the improvement, the following C function:
> 
> typedef int __v2si __attribute__((__vector_size__(8)));
> int foo (__v2si arg) { return arg[0]+arg[1]; }
> 
> previously generated this code using a shift:
> 
>                 mov.u64 %r25, %ar0;
>                 ld.v2.u32       %r26, [%r25];
>                 mov.b64 %r28, %r26;
>                 shr.s64 %r30, %r28, 32;
>                 cvt.u32.u32     %r31, %r26.x;
>                 cvt.u32.u64     %r32, %r30;
>                 add.u32 %value, %r31, %r32;
> 
> but with this patch now generates:
> 
>                 mov.u64 %r25, %ar0;
>                 ld.v2.u32       %r26, [%r25];
>                 mov.u32 %r28, %r26.x;
>                 mov.u32 %r29, %r26.y;
>                 add.u32 %value, %r28, %r29;
> 
> I've implemented these getters and setters as their own instructions
> instead of attempting the much more intrusive patch of changing the
> backend's definition of register_operand.  Given the limited utility
> of PTX vectors, I'm not convinced that attempting to support them as
> operands in every instruction would be worth the effort involved.
> 
> This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu
> with "make" and "make check" with no new regressions.
> Ok for mainline?
> 
> 
> 2020-07-15  Roger Sayle  <ro...@nextmovesoftware.com>
> 
> gcc/ChangeLog:
>       * config/nvptx/nvptx.md (nvptx_vector_index_operand): New predicate.
>       (VECELEM): New mode attribute for a vector's uppercase element mode.
>       (Vecelem): New mode attribute for a vector's lowercase element mode.
>       (*vec_set<mode>_0, *vec_set<mode>_1, *vec_set<mode>_2,
>       *vec_set<mode>_3): New instructions.
>       (vec_set<mode>): New expander to generate one of the above insns.
>       (vec_extract<mode><Vecelem>): New instruction.

Added test-case, fixed some nits, pushed (not reposting).
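
For illustration only, here is a minimal sketch of the kind of C code
these patterns target. It is not the test-case that was committed.
It reuses the __v2si typedef from the example above; the function
names sum_v2si and set_hi are hypothetical. Reading arg[0] and arg[1]
is the sort of access vec_extract is meant to handle, and storing to
arg[1] is the sort of access vec_set is meant to handle. Which insns
are actually emitted depends on the optimization level and the
compiler revision.

```c
/* Two-element integer vector, as in the example from the patch mail.  */
typedef int __v2si __attribute__((__vector_size__(8)));

/* Reads both elements; a candidate for the vec_extract pattern.
   (Hypothetical function name.)  */
int sum_v2si (__v2si arg)
{
  return arg[0] + arg[1];
}

/* Overwrites element 1 and returns the updated vector; a candidate
   for the vec_set pattern.  (Hypothetical function name.)  */
__v2si set_hi (__v2si arg, int x)
{
  arg[1] = x;
  return arg;
}
```

Compiling this for nvptx-none with optimization enabled and inspecting
the generated PTX for %rN.x / %rN.y accesses is one way to check
whether the new patterns are being used.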

Thanks,
- Tom
