On 27/06/17 23:29, Jeff Law wrote:
On 06/06/2017 02:25 AM, Kyrill Tkachov wrote:
Hi all,
I'm trying to improve some of the RTL-level handling of vector lane
operations on aarch64, and that involves dealing with a lot of vec_merge
operations. One simplification that I noticed missing from simplify-rtx
is the combination of vec_merge with vec_duplicate.
In this particular case:
(vec_merge (vec_duplicate (X)) (const_vector [A, B]) (const_int N))
which can be replaced with
(vec_concat (X) (B)) if N == 1 (0b01) or
(vec_concat (A) (X)) if N == 2 (0b10).
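
As a sanity check of the rewrite, here is a small scalar model in C. It is
only a sketch: the struct v2 and the helper functions are hypothetical
stand-ins for the RTL codes, not GCC internals. It assumes the documented
vec_merge semantics, where bit i of the mask selects element i from the
first operand when set and from the second operand when clear.

```c
#include <stdint.h>

/* Hypothetical scalar model of a two-element vector. */
typedef struct { int64_t lane[2]; } v2;

/* Models (vec_duplicate (X)): every lane holds X. */
static v2 vec_duplicate(int64_t x) { v2 r = {{x, x}}; return r; }

/* Models (vec_concat (LO) (HI)) and also (const_vector [LO, HI]). */
static v2 vec_concat(int64_t lo, int64_t hi) { v2 r = {{lo, hi}}; return r; }

/* Models (vec_merge A B MASK): a set bit i takes lane i from A,
   a clear bit i takes lane i from B. */
static v2 vec_merge(v2 a, v2 b, unsigned mask)
{
  v2 r;
  for (int i = 0; i < 2; i++)
    r.lane[i] = ((mask >> i) & 1) ? a.lane[i] : b.lane[i];
  return r;
}

static int v2_equal(v2 a, v2 b)
{
  return a.lane[0] == b.lane[0] && a.lane[1] == b.lane[1];
}

/* Returns 1 if the proposed rewrite gives the same vector as the
   original vec_merge for mask N; returns 0 for masks the rewrite
   does not cover.  */
int rewrite_holds(int64_t x, int64_t a, int64_t b, unsigned n)
{
  v2 merged = vec_merge(vec_duplicate(x), vec_concat(a, b), n);
  if (n == 1)
    return v2_equal(merged, vec_concat(x, b));
  if (n == 2)
    return v2_equal(merged, vec_concat(a, x));
  return 0;
}
```

For example, rewrite_holds (7, 1, 2, 1) and rewrite_holds (7, 1, 2, 2)
both return 1 in this model.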
For the aarch64 testcase in this patch, this simplification allows us to
try to combine:
(set (reg:V2DI 77 [ x ])
    (vec_concat:V2DI (mem:DI (reg:DI 0 x0 [ y ]) [1 *y_3(D)+0 S8 A64])
        (const_int 0 [0])))
instead of the more complex:
(set (reg:V2DI 77 [ x ])
    (vec_merge:V2DI (vec_duplicate:V2DI (mem:DI (reg:DI 0 x0 [ y ]) [1 *y_3(D)+0 S8 A64]))
        (const_vector:V2DI [
                (const_int 0 [0])
                (const_int 0 [0])
            ])
        (const_int 1 [0x1])))
For the simplified form above we already have an aarch64 pattern,
*aarch64_combinez<mode>, which is missing a DI/DFmode version due to an
oversight, so this patch extends that pattern as well to use the VDC
mode iterator, which includes DI and DFmode (as well as V2HF, which
VD_BHSI was missing).
The aarch64 hunk is needed to see the benefit of the simplify-rtx.c
hunk, so I didn't split them into separate patches.
Before this patch, for the testcase we'd generate:
construct_lanedi:
        movi    v0.4s, 0
        ldr     x0, [x0]
        ins     v0.d[0], x0
        ret
construct_lanedf:
        movi    v0.2d, 0
        ldr     d1, [x0]
        ins     v0.d[0], v1.d[0]
        ret
but now we can generate:
construct_lanedi:
        ldr     d0, [x0]
        ret
construct_lanedf:
        ldr     d0, [x0]
        ret
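
For reference, source of roughly the following shape would be expected
to exercise this path. This is a sketch using GCC's generic vector
extension; only the function names come from the assembly listings
above, while the typedef names and function bodies are assumptions and
are not copied from construct_lane_zero_1.c.

```c
/* Two-lane 128-bit vector types via the GCC vector extension.
   The typedef names are illustrative assumptions.  */
typedef long long v2di __attribute__ ((vector_size (16)));
typedef double v2df __attribute__ ((vector_size (16)));

/* Build a vector with the loaded value in lane 0 and zero in lane 1.  */
v2di
construct_lanedi (long long *y)
{
  v2di x = { y[0], 0 };
  return x;
}

/* Same construction for a double-precision element.  */
v2df
construct_lanedf (double *y)
{
  v2df x = { y[0], 0.0 };
  return x;
}
```

Lane 0 holds the loaded scalar and lane 1 is zero, which matches the
(vec_concat (mem) (const_int 0)) form shown earlier.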
Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?
Thanks,
Kyrill
2017-06-06 Kyrylo Tkachov <kyrylo.tkac...@arm.com>
* simplify-rtx.c (simplify_ternary_operation, VEC_MERGE):
Simplify vec_merge of vec_duplicate and const_vector.
* config/aarch64/predicates.md (aarch64_simd_or_scalar_imm_zero):
New predicate.
* config/aarch64/aarch64-simd.md (*aarch64_combinez<mode>): Use VDC
mode iterator. Update predicate on operand 1 to
handle non-const_vec constants. Delete constraints.
(*aarch64_combinez_be<mode>): Likewise for operand 2.
2017-06-06 Kyrylo Tkachov <kyrylo.tkac...@arm.com>
* gcc.target/aarch64/construct_lane_zero_1.c: New test.
OK for the simplify-rtx parts.
Thanks Jeff.
Pinging the aarch64 parts at:
https://gcc.gnu.org/ml/gcc-patches/2017-06/msg00272.html
I've re-bootstrapped and re-tested the patches on top of current trunk.
Kyrill
jeff