The main purpose of this patch series is to fix a performance regression relative to GCC 8. Before the series:
#include <arm_neon.h>

int64x2_t s64q_1(int64_t a0, int64_t a1) {
  if (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
    return (int64x2_t) { a1, a0 };
  else
    return (int64x2_t) { a0, a1 };
}

generated:

        fmov    d0, x0
        ins     v0.d[1], x1
        ins     v0.d[1], x1
        ret

(note the redundant second "ins"), whereas GCC 8 generated the more
respectable:

        dup     v0.2d, x0
        ins     v0.d[1], x1
        ret

But there are some related knock-on changes that IMO are needed to keep
things in a consistent and maintainable state.  There is still more
cleanup and optimisation that could be done in this area, but that's
definitely GCC 13 material.

Tested on aarch64-linux-gnu and aarch64_be-elf, pushed.

Sorry for the size of the series, but it really did seem like the best
fix in the circumstances.

Richard
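
P.S. For anyone who wants to experiment, below is a minimal sketch of
what the ideal GCC 8 sequence looks like at the intrinsics level.  The
function name s64q_dup_ins is made up for illustration and is not part
of the series:

#include <arm_neon.h>

/* Broadcast a0 to both lanes (dup v0.2d, x0), then overwrite
   lane 1 with a1 (ins v0.d[1], x1).  */
int64x2_t s64q_dup_ins(int64_t a0, int64_t a1) {
  int64x2_t res = vdupq_n_s64(a0);
  return vsetq_lane_s64(a1, res, 1);
}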