On Tue, Jun 6, 2017 at 1:25 AM, Kyrill Tkachov
<kyrylo.tkac...@foss.arm.com> wrote:
> Hi all,
>
> I'm trying to improve some of the RTL-level handling of vector lane
> operations on aarch64, and that involves dealing with a lot of vec_merge
> operations. One simplification that I noticed missing from simplify-rtx
> is the combination of vec_merge with vec_duplicate.
> In this particular case:
> (vec_merge (vec_duplicate (X)) (const_vector [A, B]) (const_int N))
>
> which can be replaced with
>
> (vec_concat (X) (B)) if N == 1 (0b01) or
> (vec_concat (A) (X)) if N == 2 (0b10).
>
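As an illustration of why the rewrite above is value-preserving, here is a
minimal C sketch that models the three RTL codes on a two-lane vector. The
type v2di and the helper functions are hypothetical names made up for this
sketch, not GCC internals. It assumes that bit i of the vec_merge mask
selects lane i of the first operand, and it ignores endianness.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-lane vector of 64-bit elements, standing in for V2DI. */
typedef struct { int64_t lane[2]; } v2di;

/* Models vec_duplicate: broadcast x to every lane. */
static v2di vec_duplicate(int64_t x) { v2di r = {{x, x}}; return r; }

/* Models vec_concat: lane 0 comes from a, lane 1 from b. */
static v2di vec_concat(int64_t a, int64_t b) { v2di r = {{a, b}}; return r; }

/* Models vec_merge under the assumption stated above: a set bit i in
   mask takes lane i from op0, a clear bit takes it from op1. */
static v2di vec_merge(v2di op0, v2di op1, unsigned mask) {
  v2di r;
  for (int i = 0; i < 2; i++)
    r.lane[i] = ((mask >> i) & 1) ? op0.lane[i] : op1.lane[i];
  return r;
}

static int v2di_equal(v2di a, v2di b) {
  return a.lane[0] == b.lane[0] && a.lane[1] == b.lane[1];
}

/* Returns 1 if both rewrites from the mail hold for these values:
   N == 1 gives (vec_concat X B), N == 2 gives (vec_concat A X). */
static int rewrite_holds(int64_t x, int64_t a, int64_t b) {
  v2di dup = vec_duplicate(x);
  v2di cst = vec_concat(a, b); /* stands in for const_vector [A, B] */
  return v2di_equal(vec_merge(dup, cst, 1), vec_concat(x, b))
      && v2di_equal(vec_merge(dup, cst, 2), vec_concat(a, x));
}
```

Calling rewrite_holds with any x, a and b should return 1 under this model.
It only checks the value equivalence; it says nothing about which modes or
targets the real simplification is valid for.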
> For the aarch64 testcase in this patch this simplification allows us to
> try to combine:
> (set (reg:V2DI 77 [ x ])
>     (vec_concat:V2DI (mem:DI (reg:DI 0 x0 [ y ]) [1 *y_3(D)+0 S8 A64])
>         (const_int 0 [0])))
>
> instead of the more complex:
> (set (reg:V2DI 77 [ x ])
>     (vec_merge:V2DI (vec_duplicate:V2DI (mem:DI (reg:DI 0 x0 [ y ]) [1 *y_3(D)+0 S8 A64]))
>         (const_vector:V2DI [
>                 (const_int 0 [0])
>                 (const_int 0 [0])
>             ])
>         (const_int 1 [0x1])))
>
>
> For the simplified form above we already have an aarch64 pattern,
> *aarch64_combinez<mode>, which is missing a DI/DFmode version due to an
> oversight. This patch therefore also extends that pattern to use the VDC
> mode iterator, which includes DI and DFmode (as well as V2HF, which
> VD_BHSI was missing).
> The aarch64 hunk is needed to see the benefit of the simplify-rtx.c hunk,
> so I didn't split them into separate patches.
>
> Before this patch, for the testcase we'd generate:
> construct_lanedi:
>         movi    v0.4s, 0
>         ldr     x0, [x0]
>         ins     v0.d[0], x0
>         ret
>
> construct_lanedf:
>         movi    v0.2d, 0
>         ldr     d1, [x0]
>         ins     v0.d[0], v1.d[0]
>         ret
>
> but now we can generate:
> construct_lanedi:
>         ldr     d0, [x0]
>         ret
>
> construct_lanedf:
>         ldr     d0, [x0]
>         ret
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> Ok for trunk?
>
> Thanks,
> Kyrill
>
> 2017-06-06  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>
>     * simplify-rtx.c (simplify_ternary_operation, VEC_MERGE):
>     Simplify vec_merge of vec_duplicate and const_vector.
>     * config/aarch64/predicates.md (aarch64_simd_or_scalar_imm_zero):
>     New predicate.
>     * config/aarch64/aarch64-simd.md (*aarch64_combinez<mode>): Use VDC
>     mode iterator.  Update predicate on operand 1 to
>     handle non-const_vec constants.  Delete constraints.
>     (*aarch64_combinez_be<mode>): Likewise for operand 2.
>
> 2017-06-06  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>
>     * gcc.target/aarch64/construct_lane_zero_1.c: New test.

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85090

-- 
H.J.
