On Tue, Jun 06, 2017 at 09:25:51AM +0100, Kyrill Tkachov wrote:
> Hi all,
>
> I'm trying to improve some of the RTL-level handling of vector lane
> operations on aarch64 and that involves dealing with a lot of vec_merge
> operations. One simplification that I noticed missing from simplify-rtx are
> combinations of vec_merge with vec_duplicate.
> In this particular case:
> (vec_merge (vec_duplicate (X)) (const_vector [A, B]) (const_int N))
>
> which can be replaced with
>
> (vec_concat (X) (B)) if N == 1 (0b01) or
> (vec_concat (A) (X)) if N == 2 (0b10).
>
> For the aarch64 testcase in this patch this simplification allows us to try
> to combine:
> (set (reg:V2DI 77 [ x ])
>     (vec_concat:V2DI (mem:DI (reg:DI 0 x0 [ y ]) [1 *y_3(D)+0 S8 A64])
>         (const_int 0 [0])))
>
> instead of the more complex:
> (set (reg:V2DI 77 [ x ])
>     (vec_merge:V2DI (vec_duplicate:V2DI (mem:DI (reg:DI 0 x0 [ y ]) [1 *y_3(D)+0 S8 A64]))
>         (const_vector:V2DI [
>                 (const_int 0 [0])
>                 (const_int 0 [0])
>             ])
>         (const_int 1 [0x1])))
>
>
> For the simplified form above we already have an aarch64 pattern:
> *aarch64_combinez<mode> which is missing a DI/DFmode version due to an
> oversight, so this patch extends that pattern as well to use the VDC mode
> iterator that includes DI and DFmode (as well as V2HF which VD_BHSI was
> missing).
> The aarch64 hunk is needed to see the benefit of the simplify-rtx.c hunk, so
> I didn't split them into separate patches.
>
> Before this for the testcase we'd generate:
> construct_lanedi:
>         movi    v0.4s, 0
>         ldr     x0, [x0]
>         ins     v0.d[0], x0
>         ret
>
> construct_lanedf:
>         movi    v0.2d, 0
>         ldr     d1, [x0]
>         ins     v0.d[0], v1.d[0]
>         ret
>
> but now we can generate:
> construct_lanedi:
>         ldr     d0, [x0]
>         ret
>
> construct_lanedf:
>         ldr     d0, [x0]
>         ret
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> Ok for trunk?

OK.

Thanks,
James

> 2017-06-06 Kyrylo Tkachov <kyrylo.tkac...@arm.com>
>
>     * simplify-rtx.c (simplify_ternary_operation, VEC_MERGE):
>     Simplify vec_merge of vec_duplicate and const_vector.
>     * config/aarch64/predicates.md (aarch64_simd_or_scalar_imm_zero):
>     New predicate.
>     * config/aarch64/aarch64-simd.md (*aarch64_combinez<mode>): Use VDC
>     mode iterator. Update predicate on operand 1 to
>     handle non-const_vec constants. Delete constraints.
>     (*aarch64_combinez_be<mode>): Likewise for operand 2.
>
> 2017-06-06 Kyrylo Tkachov <kyrylo.tkac...@arm.com>
>
>     * gcc.target/aarch64/construct_lane_zero_1.c: New test.
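
For anyone following the thread, the lane selection behind the quoted rewrite can be modelled outside GCC. What follows is only a sketch in plain C, not code from the patch: the v2di struct and the helpers named after the RTL codes are hypothetical stand-ins, and it assumes that bit i of the vec_merge mask selects lane i from the first operand, which is what the N == 1 and N == 2 cases in the description imply.

```c
#include <assert.h>
#include <stdint.h>

/* Toy two-lane vector standing in for a V2DI value (hypothetical model). */
typedef struct { int64_t lane[2]; } v2di;

/* Model of vec_duplicate: broadcast x to both lanes. */
static v2di vec_duplicate(int64_t x) { return (v2di){ { x, x } }; }

/* Model of vec_concat on two scalars: lane 0 = lo, lane 1 = hi. */
static v2di vec_concat(int64_t lo, int64_t hi) { return (v2di){ { lo, hi } }; }

/* Model of vec_merge: if bit i of mask is set, lane i comes from a,
   otherwise it comes from b (assumed semantics, see the lead-in). */
static v2di vec_merge(v2di a, v2di b, unsigned mask)
{
    v2di r;
    for (int i = 0; i < 2; i++)
        r.lane[i] = ((mask >> i) & 1u) ? a.lane[i] : b.lane[i];
    return r;
}

static int v2di_equal(v2di a, v2di b)
{
    return a.lane[0] == b.lane[0] && a.lane[1] == b.lane[1];
}

/* Returns 1 if both rewrites from the description hold for these values:
     vec_merge (vec_duplicate X, [A, B], 1) == vec_concat (X, B)
     vec_merge (vec_duplicate X, [A, B], 2) == vec_concat (A, X)  */
static int simplification_holds(int64_t x, int64_t a, int64_t b)
{
    v2di cv = vec_concat(a, b);  /* plays the role of const_vector [A, B] */
    return v2di_equal(vec_merge(vec_duplicate(x), cv, 1), vec_concat(x, b))
        && v2di_equal(vec_merge(vec_duplicate(x), cv, 2), vec_concat(a, x));
}

/* The testcase shape from the description: a loaded value merged into zeros. */
static void self_check(void)
{
    assert(simplification_holds(42, 0, 0));
    assert(simplification_holds(-7, 1, 2));
}
```

With A and B both zero and N == 1, the model gives a vector of X and zero, which is the vec_concat with (const_int 0) form that the description says the combinez pattern already matches.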