Re: [Patch][GCC][middle-end] - Lower store and load neon builtins to gimple

Christophe LYON via Gcc-patches Wed, 20 Oct 2021 23:10:37 -0700


On 20/10/2021 12:16, Richard Biener via Gcc-patches wrote:

On Wed, 20 Oct 2021, Andre Vieira (lists) wrote:

On 27/09/2021 12:54, Richard Biener via Gcc-patches wrote:

On Mon, 27 Sep 2021, Jirui Wu wrote:

Hi all,

I now use the type based on the specification of the intrinsic
instead of type based on formal argument.

I use signed Int vector types because the outputs of the neon builtins
that I am lowering is always signed. In addition, fcode and stmt
does not have information on whether the result is signed.

Because I am replacing the stmt with new_stmt,
a VIEW_CONVERT_EXPR cast is already in the code if needed.
As a result, the result assembly code is correct.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master? If OK can it be committed for me, I have no commit rights.

+           tree temp_lhs = gimple_call_lhs (stmt);
+           aarch64_simd_type_info simd_type
+             = aarch64_simd_types[mem_type];
+           tree elt_ptr_type = build_pointer_type (simd_type.eltype);
+           tree zero = build_zero_cst (elt_ptr_type);
+           gimple_seq stmts = NULL;
+           tree base = gimple_convert (&stmts, elt_ptr_type,
+                                       args[0]);
+           new_stmt = gimple_build_assign (temp_lhs,
+                                    fold_build2 (MEM_REF,
+                                    TREE_TYPE (temp_lhs),
+                                    base,
+                                    zero));

this now uses the alignment info as on the LHS of the call by using
TREE_TYPE (temp_lhs) as type of the MEM_REF.  So for example

   typedef int foo __attribute__((vector_size(N),aligned(256)));

   foo tem = ld1 (ptr);

will now access *ptr as if it were aligned to 256 bytes.  But I'm sure
the ld1 intrinsic documents the required alignment (either it's the
natural alignment of the vector type loaded or element alignment?).

For element alignment you'd do sth like

    tree access_type = build_aligned_type (vector_type, TYPE_ALIGN
(TREE_TYPE (vector_type)));

for example.

Richard.

Hi,

I'm taking over this patch from Jirui.

I've decided to use the vector type stored in aarch64_simd_type_info, since
that should always have the correct alignment.

To be fair though, I do wonder whether this is actually needed as is right
now, since the way we cast the inputs and outputs of these __builtins in
arm_neon.h prevents these issues I think, but it is more future proof. Also
you could argue people could use the __builtins directly, though I'd think
that would be at their own risk.

Is this OK?

Yes, this variant looks OK.



Hi Andre,

These new tests fail on aarch64_be:

gcc.target/aarch64/fmla_intrinsic_1.c scan-assembler-times fmadd\\td[0-9]+, 
d[0-9]+, d[0-9]+, d[0-9]+ 2
gcc.target/aarch64/fmla_intrinsic_1.c scan-assembler-times fmla\\tv[0-9]+.2d, 
v[0-9]+.2d, v[0-9]+.d\\[[0-9]+\\] 2
gcc.target/aarch64/fmls_intrinsic_1.c scan-assembler-times fmls\\tv[0-9]+.2d, 
v[0-9]+.2d, v[0-9]+.d\\[[0-9]+\\] 2
gcc.target/aarch64/fmls_intrinsic_1.c scan-assembler-times fmsub\\td[0-9]+, 
d[0-9]+, d[0-9]+, d[0-9]+ 2
gcc.target/aarch64/fmul_intrinsic_1.c scan-assembler-times fmul\\td[0-9]+, 
d[0-9]+, d[0-9]+ 2
gcc.target/aarch64/fmul_intrinsic_1.c scan-assembler-times fmul\\tv[0-9]+.2d, 
v[0-9]+.2d, v[0-9]+.d\\[[0-9]+\\] 2


I've also noticed that:

FAIL: gcc.target/aarch64/vect-vca.c execution test
on aarch64 with -mabi=ilp32

Christophe

Kind regards,
Andre

Re: [Patch][GCC][middle-end] - Lower store and load neon builtins to gimple

Reply via email to