https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99293

            Bug ID: 99293
           Summary: Built-in vec_splat generates sub-optimal code for
                    -mcpu=power10
           Product: gcc
           Version: 10.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: munroesj at gcc dot gnu.org
  Target Milestone: ---

Created attachment 50263
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50263&action=edit
Simplified test case

While adding code to the Power Vector Library (PVECLIB) for the POWER10 target,
I see strange code generation for the Altivec built-in vec_splat with the vector
long long type. I would expect an xxpermdi (xxspltd), based on the "Power Vector
Intrinsic Programming Reference".

But I see the following generated:

0000000000000300 <test_vec_rlq_PWR10>:
     300:   67 02 69 7c     mfvsrld r9,vs35
     304:   67 4b 09 7c     mtvsrdd vs32,r9,r9
     308:   05 00 42 10     vrlq    v2,v2,v0
     30c:   20 00 80 4e     blr

While this seems to be functionally correct, the trip through the GPR seems
unnecessary. It requires two serially dependent instructions where a single
xxspltd would do. I expected:

0000000000000300 <test_vec_rlq_PWR10>:
 300:   57 1b 63 f0     xxspltd vs35,vs35,1
 304:   05 18 42 10     vrlq    v2,v2,v3
 308:   20 00 80 4e     blr
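
To make the equivalence of the two sequences concrete, here is a minimal
portable C model of the three instructions involved. It is only a sketch and
not the attached test case: the type vsr_t, the model_* helpers and
splat_sequences_agree are hypothetical names invented for illustration, and the
model assumes ISA doubleword numbering (doubleword 0 is the high half of the
register, doubleword 1 the low half).

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit VSR as two doublewords.
   dw[0] is the high doubleword (ISA element 0), dw[1] the low one. */
typedef struct { uint64_t dw[2]; } vsr_t;

/* Models mfvsrld RA,XS: copy the low doubleword of a VSR to a GPR. */
static uint64_t model_mfvsrld(vsr_t xs)
{
    return xs.dw[1];
}

/* Models mtvsrdd XT,RA,RB: RA goes to doubleword 0, RB to doubleword 1. */
static vsr_t model_mtvsrdd(uint64_t ra, uint64_t rb)
{
    vsr_t xt = { { ra, rb } };
    return xt;
}

/* Models xxspltd XT,XA,UIM: replicate doubleword UIM into both halves. */
static vsr_t model_xxspltd(vsr_t xa, unsigned uim)
{
    vsr_t xt = { { xa.dw[uim & 1], xa.dw[uim & 1] } };
    return xt;
}

/* Returns 1 when the generated two-instruction GPR round trip and the
   expected single xxspltd produce the same splatted value. */
static int splat_sequences_agree(vsr_t v)
{
    uint64_t r9 = model_mfvsrld(v);           /* mfvsrld r9,vs35     */
    vsr_t via_gpr = model_mtvsrdd(r9, r9);    /* mtvsrdd vs32,r9,r9  */
    vsr_t direct = model_xxspltd(v, 1);       /* xxspltd vs35,vs35,1 */
    return via_gpr.dw[0] == direct.dw[0] && via_gpr.dw[1] == direct.dw[1];
}
```

Calling splat_sequences_agree on any input should return 1 under this model,
so the two sequences differ only in instruction count and in the serial
dependency through r9, not in the value handed to vrlq.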


The compiler was:

Compiler: gcc version 10.2.1 20210104 (Advance-Toolchain 14.0-2) [2093e873bb6c]
(GCC)
