Alan Lawrence wrote:
Hi,
Comparing 64x1 vector types (defined by hand or from arm_neon.h) using GCC
vector extensions currently generates very poor assembly code, for example
"uint64x1_t foo (uint64x1_t a, uint64x1_t b) { return a >= b; }" generates (at -O3):
        fmov    x0, d0                  // 22   movdi_aarch64/12   [length = 4]
        fmov    x1, d1                  // 23   movdi_aarch64/12   [length = 4]
        cmp     x0, x1                  // 10   cmpdi/1            [length = 4]
        csinv   x0, xzr, xzr, cc        // 17   cmovdi_insn/3      [length = 4]
        fmov    d0, x0                  // 24   *movdi_aarch64/11  [length = 4]
        ret                             // 27   simple_return      [length = 4]
This means that arm_neon.h instead has to use rather awkward forms like "return
(uint64x1_t) {__a[0] >= __b[0] ? -1ll : 0ll};" to produce the desired assembly:
        cmhs    d0, d0, d1
        ret
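
For reference, the two source forms discussed above can be put side by side in a small sketch. It assumes a compiler that supports GCC vector extensions and uses a hand-defined type `u64x1` as a stand-in for `uint64x1_t`, so that it does not depend on arm_neon.h. The function names and the `check_forms_agree` helper are hypothetical, chosen only for illustration. Both functions should return the same value (all-ones when `a >= b`, zero otherwise); the difference described above is in the code generated for them.

```c
#include <assert.h>
#include <stdint.h>

/* Hand-defined 64x1 vector type; an assumed stand-in for uint64x1_t
   from arm_neon.h.  */
typedef uint64_t u64x1 __attribute__ ((vector_size (8)));

/* Direct comparison using vector extensions.  The comparison yields a
   vector of signed elements (-1 for true, 0 for false), so the result
   is cast back to the unsigned vector type explicitly.  */
u64x1
cmp_direct (u64x1 a, u64x1 b)
{
  return (u64x1) (a >= b);
}

/* The "awkward" form: compare the single lane as a scalar and build
   the result vector by hand.  */
u64x1
cmp_workaround (u64x1 a, u64x1 b)
{
  return (u64x1) { a[0] >= b[0] ? -1ll : 0ll };
}

/* Hypothetical helper: check that both forms agree for one input pair.  */
void
check_forms_agree (uint64_t x, uint64_t y)
{
  u64x1 a = { x }, b = { y };
  u64x1 direct = cmp_direct (a, b);
  u64x1 workaround = cmp_workaround (a, b);
  assert (direct[0] == workaround[0]);
}
```

Calling `check_forms_agree` with a few pairs (less than, equal, greater than) is a quick way to confirm the two forms are interchangeable at the source level before comparing the assembly each one produces.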
This series adds vcond(u?)didi patterns for AArch64, to generate appropriate RTL
from direct comparisons of 64x1 vectors (which are of DImode). However, as
things stand, adding a vconddidi pattern causes an ICE in vector_compare_rtx
(maybe_legitimize_operands), because a DImode constant-zero (vector or
otherwise) is expanded as const0_rtx, which has mode VOIDmode. I tried quite a
bit to generate an RTL const_vector, or even just something with mode DImode,
but without success, hence the first patch fixes vector_compare_rtx to use the
mode from the tree if necessary. (DImode vectors are specifically allowed by
stor-layout.c, but no other platform defines vconddidi.)
Can I ping the AArch64 parts of this (patches 2+3)? These then provide the
testcases requested by Jeff Law in his approval of the first patch
(https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00076.html).
Thanks, Alan