On Wed, May 29, 2013 at 03:53:43PM -0400, David Edelsohn wrote:
> +      if (TARGET_DIRECT_MOVE)
> +        {
> +          if (TARGET_POWERPC64)
> +        {
> +          reload_gpr_vsx[TImode]    = CODE_FOR_reload_gpr_from_vsxti;
> +          reload_gpr_vsx[V2DFmode]  = CODE_FOR_reload_gpr_from_vsxv2df;
> +          reload_gpr_vsx[V2DImode]  = CODE_FOR_reload_gpr_from_vsxv2di;
> +          reload_gpr_vsx[V4SFmode]  = CODE_FOR_reload_gpr_from_vsxv4sf;
> +          reload_gpr_vsx[V4SImode]  = CODE_FOR_reload_gpr_from_vsxv4si;
> +          reload_gpr_vsx[V8HImode]  = CODE_FOR_reload_gpr_from_vsxv8hi;
> +          reload_gpr_vsx[V16QImode] = CODE_FOR_reload_gpr_from_vsxv16qi;
> +          reload_gpr_vsx[SFmode]    = CODE_FOR_reload_gpr_from_vsxsf;
> +
> +          reload_vsx_gpr[TImode]    = CODE_FOR_reload_vsx_from_gprti;
> +          reload_vsx_gpr[V2DFmode]  = CODE_FOR_reload_vsx_from_gprv2df;
> +          reload_vsx_gpr[V2DImode]  = CODE_FOR_reload_vsx_from_gprv2di;
> +          reload_vsx_gpr[V4SFmode]  = CODE_FOR_reload_vsx_from_gprv4sf;
> +          reload_vsx_gpr[V4SImode]  = CODE_FOR_reload_vsx_from_gprv4si;
> +          reload_vsx_gpr[V8HImode]  = CODE_FOR_reload_vsx_from_gprv8hi;
> +          reload_vsx_gpr[V16QImode] = CODE_FOR_reload_vsx_from_gprv16qi;
> +          reload_vsx_gpr[SFmode]    = CODE_FOR_reload_vsx_from_gprsf;
> +        }
> +          else
> +        {
> +          reload_fpr_gpr[DImode] = CODE_FOR_reload_fpr_from_gprdi;
> +          reload_fpr_gpr[DDmode] = CODE_FOR_reload_fpr_from_gprdd;
> +          reload_fpr_gpr[DFmode] = CODE_FOR_reload_fpr_from_gprdf;
> +        }
> +        }
> 
> Why do the VSX reload functions depend on TARGET_POWERPC64? That seems
> like the wrong test.

Because at present we do not do direct moves between VSX and GPR registers for
128-bit types in 32-bit mode.  In 32-bit mode, we only transfer 64-bit types.

Due to issues with secondary reload, where you only get one temporary register
and that register may or may not overlap the output register, we would need a
mode that spans 3 traditional FPR registers, and we don't have one at present.
(To move from GPR to VSX we would need 2 direct moves and an FMRGOW for the
first 64 bits, 2 more direct moves and another FMRGOW for the second 64 bits,
and then an XXPERMDI to glue the two halves together.)  I've seen places where
reload does not honor the '&' constraint in these secondary reload cases, so
the output can't be one of the temporary registers, hence the need for a mode
that spans 3 registers.

In theory it can be done, it just hasn't been done.
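For reference, the GPR-to-VSX sequence described above might look roughly like
the sketch below.  This is untested and the register names (vstmp1-vstmp3,
vsdest) are hypothetical; the point is just that it needs three FPR/VSX
temporaries, which is what secondary reload can't give us today:

```
# 32-bit mode: move a 128-bit value in r3:r4:r5:r6 into VSX register vsdest
mtvsrwz  vstmp1,r3              # direct move, high word of first 64 bits
mtvsrwz  vstmp2,r4              # direct move, low word of first 64 bits
fmrgow   vstmp1,vstmp1,vstmp2   # merge words: first 64-bit half
mtvsrwz  vstmp2,r5              # direct move, high word of second 64 bits
mtvsrwz  vstmp3,r6              # direct move, low word of second 64 bits
fmrgow   vstmp2,vstmp2,vstmp3   # merge words: second 64-bit half
xxpermdi vsdest,vstmp1,vstmp2,0 # glue the two 64-bit halves together
```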

> +/* Return true if this is a move direct operation between GPR registers and
> +   floating point/VSX registers.  */
> +
> +bool
> +direct_move_p (rtx op0, rtx op1)
> 
> Why isn't this function symmetric?  It at least needs an explanation
> in the comment about assumptions for the operands.

I probably should have named the operands 'to' and 'from', since that is how
the callers use it.  I'm not sure I understand the comment about it being
symmetric, since you need different strategies to move in different directions,
and you may or may not have implemented all of the cases.

> +/* Return true if this is a load or store quad operation.  */
> +
> +bool
> +quad_load_store_p (rtx op0, rtx op1)
> 
> Same for this function.

Again it should have been to/from.

> +/* Helper function for rs6000_secondary_reload to return true if a move to a
> +   different register classe is really a simple move.  */
> +
> +static bool
> +rs6000_secondary_reload_simple_move (enum rs6000_reg_type to_type,
> +                     enum rs6000_reg_type from_type,
> +                     enum machine_mode mode)
> +{
> +  int size;
> +
> +  /* Add support for various direct moves available.  In this function, we
> +     only look at cases where we don't need any extra registers, and one or
> +     more simple move insns are issued.  At present, 32-bit integers are not
> +     allowed in FPR/VSX registers.  Single precision binary floating is not
> +     a simple move because we need to convert to the single precision memory
> +     layout.  The 4-byte SDmode can be moved.  */
> 
> The second comment should be merged into the first -- it explains the
> purpose and implementation of the function.

Ok.
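As an aside on the single precision case mentioned in that comment: an SFmode
value sits in a VSX register in double precision format, so a VSX-to-GPR move
needs a format conversion first.  A rough, untested sketch for 64-bit mode,
with hypothetical register names:

```
xscvdpspn vstmp,vssrc   # convert DP register format to SP memory format
mfvsrd    r3,vstmp      # direct move the doubleword to the GPR
srdi      r3,r3,32      # SP bits land in the upper word; shift them down
```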

> +/* Return whether a move between two register classes can be done either
> +   directly (simple move) or via a pattern that uses a single extra temporary
> +   (using power8's direct move in this case.  */
> +
> +static bool
> +rs6000_secondary_reload_move (enum rs6000_reg_type to_type,
> +                  enum rs6000_reg_type from_type,
> +                  enum machine_mode mode,
> +                  secondary_reload_info *sri,
> +                  bool altivec_p)
> 
> Missing a close parenthesis in the comment.

Yep, thanks.

> (define_insn "*vsx_mov<mode>"
> -  [(set (match_operand:VSX_M 0 "nonimmediate_operand"
> "=Z,<VSr>,<VSr>,?Z,?wa,?wa,*Y,*r,*r,<VSr>,?wa,*r,v,wZ,v")
> -    (match_operand:VSX_M 1 "input_operand"
> "<VSr>,Z,<VSr>,wa,Z,wa,r,Y,r,j,j,j,W,v,wZ"))]
> +  [(set (match_operand:VSX_M 0 "nonimmediate_operand"
> "=Z,<VSr>,<VSr>,?Z,?wa,?wa,wQ,?&r,??Y,??r,??r,<VSr>,?wa,*r,v,wZ, v")
> +    (match_operand:VSX_M 1 "input_operand"
> "<VSr>,Z,<VSr>,wa,Z,wa,r,wQ,r,Y,r,j,j,j,W,v,wZ"))]
> 
> Why do you need to change the modifiers? Why should vector operands in
> GPRs matter for register preferences (removing `*' from "r"
> constraints)?

The problem is that if you use '*', the quad word atomics will either get
aborts or will load the values into vector registers and then do direct moves,
since the register allocator will never allocate GPRs for the vector values.
We will also have this problem when 128-bit add/subtract in vector registers
is implemented.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
