https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90332

--- Comment #6 from ktkachov at gcc dot gnu.org ---
Author: ktkachov
Date: Thu Jun  6 13:59:07 2019
New Revision: 272002

URL: https://gcc.gnu.org/viewcvs?rev=272002&root=gcc&view=rev
Log:
[AArch64] PR tree-optimization/90332: Implement vec_init<M><N> where N is a
vector mode

This patch fixes the failing gcc.dg/vect/slp-reduc-sad-2.c testcase on aarch64
by implementing a vec_init optab that can handle two half-width vectors,
concatenating them to produce a full-width one.

In the gcc.dg/vect/slp-reduc-sad-2.c case it's a V8QI reg concatenated with a
V8QI const_vector of zeroes.
This can be implemented efficiently using the aarch64_combinez pattern that
just loads a D-register to make use of the implicit zero-extending semantics
of that load.
Otherwise it concatenates the two vectors using aarch64_simd_combine.
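
As a rough illustration of the two cases above, here is a scalar C model of
the concatenation. It is only a sketch and not GCC code: the types v8qi and
v16qi and the functions combine, combinez, is_zero and vec_init_from_halves
are hypothetical names invented for this example. It assumes little-endian
lane order (the low half occupies lanes 0..7), and it tests for zero at run
time, whereas the expander would recognise a const_vector of zeroes at
compile time.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar models of the vector modes; not GCC types. */
typedef struct { uint8_t lane[8]; } v8qi;    /* models a 64-bit D register  */
typedef struct { uint8_t lane[16]; } v16qi;  /* models a 128-bit Q register */

/* General case: concatenate two half-width vectors into a full-width one.
   Assumes little-endian lane order, so LO lands in lanes 0..7.  */
static v16qi combine(v8qi lo, v8qi hi)
{
    v16qi r;
    memcpy(r.lane, lo.lane, 8);
    memcpy(r.lane + 8, hi.lane, 8);
    return r;
}

/* Zero case: only the low half is written; the high half is zero.
   This models the implicit zero-extension of a D-register write.  */
static v16qi combinez(v8qi lo)
{
    v16qi r;
    memset(r.lane, 0, sizeof r.lane);
    memcpy(r.lane, lo.lane, 8);
    return r;
}

/* Return 1 if every lane of V is zero.  */
static int is_zero(v8qi v)
{
    for (int i = 0; i < 8; i++)
        if (v.lane[i] != 0)
            return 0;
    return 1;
}

/* Dispatch between the two cases (run-time check here; an assumption
   made so the sketch stays self-contained).  */
static v16qi vec_init_from_halves(v8qi lo, v8qi hi)
{
    return is_zero(hi) ? combinez(lo) : combine(lo, hi);
}
```

Both paths yield the same lanes when the high half is zero; the point of the
special case is that the zeroing comes for free with the D-register write
instead of needing a separate combine.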

With this patch I'm seeing the effect on aarch64 from richi's original patch
(the one that added gcc.dg/vect/slp-reduc-sad-2.c), and 525.x264_r improves
by about 1.5%.

        PR tree-optimization/90332
        * config/aarch64/aarch64.c (aarch64_expand_vector_init):
        Handle VALS containing two vectors.
        * config/aarch64/aarch64-simd.md (*aarch64_combinez<mode>): Rename
        to...
        (@aarch64_combinez<mode>): ... This.
        (*aarch64_combinez_be<mode>): Rename to...
        (@aarch64_combinez_be<mode>): ... This.
        (vec_init<mode><Vhalf>): New define_expand.
        * config/aarch64/iterators.md (Vhalf): Handle V8HF.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/aarch64/aarch64-simd.md
    trunk/gcc/config/aarch64/aarch64.c
    trunk/gcc/config/aarch64/iterators.md
