Hello,

On 3.9.19. 21:16, Jeff Law wrote:
On 9/3/19 2:16 AM, Mihailo Stojanovic wrote:
From: Mihailo Stojanovic <mistojano...@wavecomp.com>

Hi everybody,

This fixes the MSA implementation on big-endian targets which is
essentially broken for things like SUBREG handling and calling
convention for vector types. It borrows heavily from [1] as Aarch64 has
the same problem with SVE vectors.

Conceptually, register bitconverts should act as the data has been
stored to memory in one mode, and loaded from memory in the other.
This isn't what happens on big-endian as vector load/store instructions
are essentially mixed-endian with respect to the vector as a whole.
The in-register representation of data must be changed so that the
load/store round-trip becomes valid. This is done by inserting one or
two shuffle instructions for every SUBREG move, as previously
implemented in [2] for LLVM. Even if the shuffle instructions weren't
generated, constraint in mips_can_change_mode_class would force the
conceptual memory reload of SUBREG move operand, which would generate
correct, albeit very inefficient code.

New msa_reg_predicate was created in order to forbid SUBREG operands in
MSA patterns on big-endian targets. It weeds SUBREGs out of the
instruction patterns into SUBREG->REG moves which are caught by the new
msa_mov<mode>_subreg_be pattern and transformed into shuffle(s).

As for the MSA calling convention, ABI states that compiling for MSA
should not change the base ABIs vector calling convention, that is, MSA
vectors passed or returned by value do not use the MSA vector registers.
Instead, they are passed by general-purpose registers, as described by
the ABI. Passing the vector argument requires splitting it into 2 (or 4)
general-purpose registers and bitconverting it into V2DImode (or
V4SImode). The solution boils down to the one presented for SUBREG
moves: force every vector argument to the appropriate mode (V2DI or
V4SI) so that the shuffle instruction(s) might be generated in order to
conform to the calling convention. The same applies to vectors as return
values.

New testcases were added (thanks Dragan!) to check calling convention
compliance for all possible combinations of MSA and non-MSA
interlinking.

This fixes the following vectorization test failures:

     vect/pr66251.c
     vect/vect-alias-check-10.c
     vect/vect-alias-check-11.c
     vect/vect-alias-check-12.c
     vect/vect-bswap32.c
     vect/vect-bswap64.c
     vect/vect-nop-move.c
     vect/slp-41.c
     vect/slp-43.c
     vect/slp-45.c
     vect/slp-perm-11.c

Tested on mips32-mti-linux-gnu and mips64-mti-linux-gnu.

Regards,
Mihailo

[1] https://gcc.gnu.org/ml/gcc-patches/2018-01/msg02176.html
[2] https://reviews.llvm.org/rL189330

gcc/ChangeLog:

2019-08-08  Mihailo Stojanovic  <mihailo.stojano...@rt-rk.com>

     * config/mips/mips-msa.md: Replace register_operand predicate with
     msa_reg_operand in every pattern.
     (msa_change_mode): New pattern.
     (*msa_change_mode_<mode>): New unspec.
     (*msa_mov<mode>_subreg_be): New unspec.
     * config/mips/mips-protos.h (mips_split_msa_subreg_move): Declare.
     * config/mips/mips.c (mips_maybe_expand_msa_subreg_move): New
     function.
     (mips_replace_reg_mode): Ditto.
     (mips_split_msa_subreg_move): Ditto.
     (mips_legitimize_move): Modify machine modes of MSA vectors which
     reside in general-purpose registers. Check whether SUBREG move can
     be replaced with shuffle(s).
     (mips_split_128bit_move): Replace new SUBREGs with REGs.
     (mips_split_msa_copy_d): Ditto.
     (mips_split_msa_insert_d): Ditto.
     (mips_split_msa_fill_d): Ditto.
     (mips_can_change_mode_class): Disallow mode change which would
     result in lane width change.
     (mips_expand_vec_unpack): Replace new SUBREG with REG on big-endian
     targets.
     * config/mips/predicates.md (msa_reg_operand): New predicate.

gcc/testsuite/ChangeLog:

2019-08-08  Dragan Mladjenovic  <dragan.mladjeno...@rt-rk.com>

     * gcc.target/mips/inter/msa-inter.exp: New file.
     * gcc.target/mips/inter/msa_1.h: New test.
     * gcc.target/mips/inter/msa_1_main.c: New test.
     * gcc.target/mips/inter/msa_1_x.c: New test.
     * gcc.target/mips/inter/msa_1_y.c: New test.
I don't guess Richard S's proposed cleanups to the generic handling of
SUBREGs to address issues with SVE really help here.
Could you send a link to the proposed patch?
I can't seem to find it on the mailing list.
I suspect you're going to need to define secondary reloads here much
like the aarch64 port does.  See aarch64_secondary_reload
I'm not sure that MSA memory accesses need secondary reloads, as MSA
doesn't have predicated moves. Instead, every memory load and store
is handled by LD.* and ST.* instructions.
I suspect you also need to twiddle mips_can_change_mode_class.
I've already made changes to mips_can_change_mode_class, to disallow
MSA mode change which results in a lane width change. Maybe you had
something else in mind?

I'm not sure I really understood yours and Richard's suggestions, so I
don't have a clear idea how to proceed with this patch. If you could
elaborate a bit more I would be grateful.

Thanks,
Mihailo

Reply via email to