On Thu, Aug 7, 2014 at 2:41 PM, Michael Meissner <meiss...@linux.vnet.ibm.com> wrote:
> I'm starting to look at updating my old address branch with an eye towards
> getting the changes committed in GCC 4.10.  The address branch is meant to
> rewrite handling of addresses in the rs6000 backend, to generalize the
> addresses before register allocation to allow more general forms, and then
> within register allocation, be more specific, depending on the register
> classes involved.
>
> The goals of the address work are:
>
>  1) On power7, enable using the traditional Altivec registers to hold
>     double precision floating point;
>
>  2) On power8, enable using the traditional Altivec registers to hold
>     single precision floating point;
>
>  3) On power8, enable better fusion support for loads, by keeping the
>     extended addressing forms together, so more fused instructions can be
>     emitted.
>
> I believe that for the first two features, we will also need to flip the
> switch on LRA, to make it default, in order to better deal with having
> reg+offset addressing in the traditional floating point registers, but only
> reg+reg addressing in the traditional Altivec registers.
>
> These patches primarily tighten the constraints, so that the "wa" constraint
> (all VSX registers) is not used on types that are not allowed in the
> traditional Altivec registers.  Using "wa" on such types can throw off the
> pressure-sensitive optimizations like -fsched-pressure, because the pass
> thinks there are more registers available than could be allocated.
>
> The patch also includes a simplification in rs6000.c that makes it easier
> to decide if scalar values can go in the traditional Altivec registers.
>
> I have done bootstrap and C/C++ regression testing on power7, power8 big
> endian, and power8 little endian systems with no regressions.  At this time,
> I have not run the Fortran regression tests, since there are various failures
> due to memory allocation in the trunk when I tested it (bugzilla 61950).
> Are these patches ok to be installed in the trunk?  I would like to install
> them in the 4.9 and 4.8 trees as well.
>
> 2014-08-07  Michael Meissner  <meiss...@linux.vnet.ibm.com>
>
>         * config/rs6000/constraints.md (wh constraint): New constraint,
>         for FP registers if direct move is available.
>         (wi constraint): New constraint, for VSX/FP registers that can
>         handle 64-bit integers.
>         (wj constraint): New constraint for VSX/FP registers that can
>         handle 64-bit integers for direct moves.
>         (wk constraint): New constraint for VSX/FP registers that can
>         handle 64-bit doubles for direct moves.
>         (wy constraint): Make documentation match implementation.
>
>         * config/rs6000/rs6000.c (struct rs6000_reg_addr): Add
>         scalar_in_vmx_p field to simplify tests of whether SFmode or
>         DFmode can go in the Altivec registers.
>         (rs6000_hard_regno_mode_ok): Use scalar_in_vmx_p field.
>         (rs6000_setup_reg_addr_masks): Likewise.
>         (rs6000_debug_print_mode): Add debug support for scalar_in_vmx_p
>         field, and wh/wi/wj/wk constraints.
>         (rs6000_init_hard_regno_mode_ok): Setup scalar_in_vmx_p field, and
>         the wh/wi/wj/wk constraints.
>         (rs6000_preferred_reload_class): If SFmode/DFmode can go in the
>         upper registers, prefer VSX registers unless the operation is a
>         memory operation with REG+OFFSET addressing.
>
>         * config/rs6000/vsx.md (VSr mode attribute): Add support for
>         DImode.  Change SFmode to use ww constraint instead of d to allow
>         SF registers in the upper registers.
>         (VSr2): Likewise.
>         (VSr3): Likewise.
>         (VSr5): Fix thinko in comment.
>         (VSa): New mode attribute that is an alternative to wa, that
>         returns the VSX register class that a mode can go in, but may not
>         be the preferred register class.
>         (VS_64dm): New mode attribute for appropriate register classes for
>         referencing 64-bit elements of vectors for direct moves and normal
>         moves.
>         (VS_64reg): Likewise.
>         (vsx_mov<mode>): Change wa constraint to <VSa> to limit the
>         register allocator to only registers the data type can handle.
>         (vsx_le_perm_load_<mode>): Likewise.
>         (vsx_le_perm_store_<mode>): Likewise.
>         (vsx_xxpermdi2_le_<mode>): Likewise.
>         (vsx_xxpermdi4_le_<mode>): Likewise.
>         (vsx_lxvd2x2_le_<mode>): Likewise.
>         (vsx_lxvd2x4_le_<mode>): Likewise.
>         (vsx_stxvd2x2_le_<mode>): Likewise.
>         (vsx_add<mode>3): Likewise.
>         (vsx_sub<mode>3): Likewise.
>         (vsx_mul<mode>3): Likewise.
>         (vsx_div<mode>3): Likewise.
>         (vsx_tdiv<mode>3_internal): Likewise.
>         (vsx_fre<mode>2): Likewise.
>         (vsx_neg<mode>2): Likewise.
>         (vsx_abs<mode>2): Likewise.
>         (vsx_nabs<mode>2): Likewise.
>         (vsx_smax<mode>3): Likewise.
>         (vsx_smin<mode>3): Likewise.
>         (vsx_sqrt<mode>2): Likewise.
>         (vsx_rsqrte<mode>2): Likewise.
>         (vsx_tsqrt<mode>2_internal): Likewise.
>         (vsx_fms<mode>4): Likewise.
>         (vsx_nfma<mode>4): Likewise.
>         (vsx_eq<mode>): Likewise.
>         (vsx_gt<mode>): Likewise.
>         (vsx_ge<mode>): Likewise.
>         (vsx_eq<mode>_p): Likewise.
>         (vsx_gt<mode>_p): Likewise.
>         (vsx_ge<mode>_p): Likewise.
>         (vsx_xxsel<mode>): Likewise.
>         (vsx_xxsel<mode>_uns): Likewise.
>         (vsx_copysign<mode>3): Likewise.
>         (vsx_float<VSi><mode>2): Likewise.
>         (vsx_floatuns<VSi><mode>2): Likewise.
>         (vsx_fix_trunc<mode><VSi>2): Likewise.
>         (vsx_fixuns_trunc<mode><VSi>2): Likewise.
>         (vsx_x<VSv>r<VSs>i): Likewise.
>         (vsx_x<VSv>r<VSs>ic): Likewise.
>         (vsx_btrunc<mode>2): Likewise.
>         (vsx_b2trunc<mode>2): Likewise.
>         (vsx_floor<mode>2): Likewise.
>         (vsx_ceil<mode>2): Likewise.
>         (vsx_<VS_spdp_insn>): Likewise.
>         (vsx_xscvspdp): Likewise.
>         (vsx_xvcvspuxds): Likewise.
>         (vsx_float_fix_<mode>2): Likewise.
>         (vsx_set_<mode>): Likewise.
>         (vsx_extract_<mode>_internal1): Likewise.
>         (vsx_extract_<mode>_internal2): Likewise.
>         (vsx_extract_<mode>_load): Likewise.
>         (vsx_extract_<mode>_store): Likewise.
>         (vsx_splat_<mode>): Likewise.
>         (vsx_xxspltw_<mode>): Likewise.
>         (vsx_xxspltw_<mode>_direct): Likewise.
>         (vsx_xxmrghw_<mode>): Likewise.
>         (vsx_xxmrglw_<mode>): Likewise.
>         (vsx_xxsldwi_<mode>): Likewise.
>         (vsx_xscvdpspn): Tighten constraints to only use register classes
>         the types use.
>         (vsx_xscvspdpn): Likewise.
>         (vsx_xscvdpspn_scalar): Likewise.
>
>         * config/rs6000/rs6000.h (enum rs6000_reg_class_enum): Add wh, wi,
>         wj, and wk constraints.
>         (GPR_REG_CLASS_P): New helper macro for register classes targeting
>         general purpose registers.
>
>         * config/rs6000/rs6000.md (f32_dm): Use wh constraint for SDmode
>         direct moves.
>         (zero_extendsidi2_lfiwz): Use wj constraint for direct move of
>         DImode instead of wm.  Use wk constraint for direct move of DFmode
>         instead of wm.
>         (extendsidi2_lfiwax): Likewise.
>         (lfiwax): Likewise.
>         (lfiwzx): Likewise.
>         (movdi_internal64): Likewise.
>
>         * doc/md.texi (PowerPC and IBM RS6000): Document wh, wi, wj, and
>         wk constraints.  Make the wy constraint documentation match the
>         implementation.

Okay.

Note that you or Carrot will have to be careful about the merge of
movdi_internal64, as both of your patches affect that pattern.

Thanks, David