On Tue, 17 Jan 2017, Richard Henderson wrote: > On 01/17/2017 01:07 AM, Kirill Batuzov wrote: > > To be able to generate vector operations in a TCG backend we need to do > > several things. > > > > 1. We need to tell the register allocator about vector target's register. > > In case of x86 we'll use xmm0..xmm7. xmm7 is designated as a scratch > > register, others can be used by the register allocator. > > > > 2. We need a new constraint to indicate where to use vector registers. In > > this commit the 'V' constraint is introduced. > > > > 3. We need to be able to generate bare minimum: load, store and reg-to-reg > > move. MOVDQU is used for loads and stores. MOVDQA is used for reg-to-reg > > moves. > > > > 4. Finally we need to support any other opcodes we want. INDEX_op_add_i32x4 > > is the only one for now. The PADDD instruction handles it perfectly. > > > > Signed-off-by: Kirill Batuzov <batuz...@ispras.ru> > > --- > > tcg/i386/tcg-target.h | 24 +++++++++- > > tcg/i386/tcg-target.inc.c | 109 > > +++++++++++++++++++++++++++++++++++++++++++--- > > 2 files changed, 125 insertions(+), 8 deletions(-) > > > > diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h > > index 524cfc6..974a58b 100644 > > --- a/tcg/i386/tcg-target.h > > +++ b/tcg/i386/tcg-target.h > > @@ -29,8 +29,14 @@ > > #define TCG_TARGET_TLB_DISPLACEMENT_BITS 31 > > > > #ifdef __x86_64__ > > -# define TCG_TARGET_REG_BITS 64 > > -# define TCG_TARGET_NB_REGS 16 > > +# define TCG_TARGET_HAS_REG128 1 > > +# ifdef TCG_TARGET_HAS_REG128 > > +# define TCG_TARGET_REG_BITS 64 > > +# define TCG_TARGET_NB_REGS 24 > > +# else > > +# define TCG_TARGET_REG_BITS 64 > > +# define TCG_TARGET_NB_REGS 16 > > +# endif > > #else > > # define TCG_TARGET_REG_BITS 32 > > # define TCG_TARGET_NB_REGS 8 > > @@ -56,6 +62,16 @@ typedef enum { > > TCG_REG_R13, > > TCG_REG_R14, > > TCG_REG_R15, > > +#ifdef TCG_TARGET_HAS_REG128 > > + TCG_REG_XMM0, > > + TCG_REG_XMM1, > > + TCG_REG_XMM2, > > + TCG_REG_XMM3, > > + TCG_REG_XMM4, > > + TCG_REG_XMM5, > > + TCG_REG_XMM6, > > + TCG_REG_XMM7, > > +#endif > > There's no need to conditionalize this. The registers can be always defined > even if they're not used. We really really really want to keep ifdefs to an > absolute minimum. > > Why are you not defining xmm8-15?
At first I thought about supporting both x86_64 and i386 targets, but put this idea away (at least for the time being). Since defining xmm8-15 does not contradict anything (as I see it now) I'll add them too. > > > @@ -634,9 +662,24 @@ static inline void tgen_arithr(TCGContext *s, int > > subop, int dest, int src) > > static inline void tcg_out_mov(TCGContext *s, TCGType type, > > TCGReg ret, TCGReg arg) > > { > > + int opc; > > if (arg != ret) { > > - int opc = OPC_MOVL_GvEv + (type == TCG_TYPE_I64 ? P_REXW : 0); > > - tcg_out_modrm(s, opc, ret, arg); > > + switch (type) { > > +#ifdef TCG_TARGET_HAS_REG128 > > + case TCG_TYPE_V128: > > + ret -= TCG_REG_XMM0; > > + arg -= TCG_REG_XMM0; > > + tcg_out_modrm(s, OPC_MOVDQA_R2R, ret, arg); > > + break; > > +#endif > > + case TCG_TYPE_I32: > > + case TCG_TYPE_I64: > > + opc = OPC_MOVL_GvEv + (type == TCG_TYPE_I64 ? P_REXW : 0); > > + tcg_out_modrm(s, opc, ret, arg); > > + break; > > + default: > > + assert(0); > > g_assert_not_reached(). > > Again, no ifdefs. > > We probably want to generate avx1 code when the cpu supports it, to avoid mode > switches in the vector registers. In this case, simply issue the same opcode, > vex encoded. > > > +#ifdef TCG_TARGET_HAS_REG128 > > + { INDEX_op_add_i32x4, { "V", "0", "V" } }, > > +#endif > > And, clearly, you need to rebase. > I was too late to notice that some conflicting tcg-related pull has hit master after my last rebase. Sorry. v2 will be rebased. -- Kirill