On 7/15/20 9:13 AM, Roger Sayle wrote:
>
> This patch provides standard vec_extract and vec_set patterns to the
> nvptx backend, to extract an element from a PTX vector and set an
> element of a PTX vector respectively.  PTX vectors (I hesitate to
> call them SIMD vectors) may contain up to four elements, so vector
> modes up to size four are supported by this patch even though the
> nvptx backend currently only allows V2SI and V2DI, i.e. two out
> of the ten possible vector modes.
>
> As an example of the improvement, the following C function:
>
> typedef int __v2si __attribute__((__vector_size__(8)));
> int foo (__v2si arg) { return arg[0]+arg[1]; }
>
> previously generated this code using a shift:
>
>         mov.u64 %r25, %ar0;
>         ld.v2.u32 %r26, [%r25];
>         mov.b64 %r28, %r26;
>         shr.s64 %r30, %r28, 32;
>         cvt.u32.u32 %r31, %r26.x;
>         cvt.u32.u64 %r32, %r30;
>         add.u32 %value, %r31, %r32;
>
> but with this patch now generates:
>
>         mov.u64 %r25, %ar0;
>         ld.v2.u32 %r26, [%r25];
>         mov.u32 %r28, %r26.x;
>         mov.u32 %r29, %r26.y;
>         add.u32 %value, %r28, %r29;
>
> I've implemented these getters and setters as their own instructions
> instead of attempting the much more intrusive patch of changing the
> backend's definition of register_operand.  Given the limited utility
> of PTX vectors, I'm not convinced that attempting to support them as
> operands in every instruction would be worth the effort involved.
>
> This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu
> with "make" and "make check" with no new regressions.
> Ok for mainline?
>
>
> 2020-07-15  Roger Sayle  <ro...@nextmovesoftware.com>
>
> gcc/ChangeLog:
>         * config/nvptx/nvptx.md (nvptx_vector_index_operand): New predicate.
>         (VECELEM): New mode attribute for a vector's uppercase element mode.
>         (Vecelem): New mode attribute for a vector's lowercase element mode.
>         (*vec_set<mode>_0, *vec_set<mode>_1, *vec_set<mode>_2,
>         *vec_set<mode>_3): New instructions.
>         (vec_set<mode>): New expander to generate one of the above insns.
>         (vec_extract<mode><Vecelem>): New instruction.

Added test-case, fixed some nits, pushed (not reposting).

Thanks,
- Tom
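
For readers following along, here is a minimal sketch of the two kinds of element access the quoted patch is about, written with GCC's generic vector extension. It is not the test-case that was pushed; the function names sum_lanes and set_lane1 are made up for illustration, and the assumption is that a lane read is the sort of source the vec_extract pattern handles while a lane write is the sort the vec_set patterns handle. No claim is made here about the exact PTX emitted.

```c
/* Two-element int vector, same typedef as in the quoted example.  */
typedef int __v2si __attribute__((__vector_size__(8)));

/* Hypothetical helper: reads both lanes and adds them.
   Lane reads like arg[0] are candidates for vec_extract.  */
int sum_lanes (__v2si arg)
{
  return arg[0] + arg[1];
}

/* Hypothetical helper: overwrites lane 1 and returns the vector.
   Lane writes like arg[1] = x are candidates for vec_set.  */
__v2si set_lane1 (__v2si arg, int x)
{
  arg[1] = x;
  return arg;
}
```

Compiling this for nvptx-none at -O2 and comparing the output with and without the patch should show whether the .x/.y register selectors are used in place of the shift sequence.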