On Fri, Jan 14, 2022 at 11:42 AM Andreas Krebbel via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > On 1/14/22 08:37, Richard Biener wrote: > ... > > Can the gist of this bug be put into the GCC bugzilla so the rev can > > refer to it? > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104034 > > > Can we have a testcase even? > The testcase from Jakub is in the BZ. However, since it doesn't fail with > head I didn't try to > include it in my patch.
So I also noticed there was PR 101260 which had the exact same analysis as this patch with respect to s390 and TI mode splitting and regcprop. So I marked PR 104034 as a duplicate of PR 101260. Thanks, Andrew > > > I'm not quite understanding the problem but is it that, say, > > > > (subreg:DI (reg:V2DI ..) 0) > > > > isn't the same as > > > > (lowpart:DI (reg:V2DI ...) 0) > > (reg:DI v0) does not match the lower order bits of (reg:TI v0) > > > ? The regcprop code looks more like asking whether the larger reg > > is a composition of multiple other hardregs and will return the specific > > hardreg corresponding to the lowpart - so like if on s390 the vector > > registers overlap with some other regset. But then doing the actual > > accesses via the other regset regs doesn't actually work? Isn't the > > backend then lying to us (aka the mode_change_ok returns the > > wrong answer)? > > can_change_mode_class should do the right thing. We return false in case > somebody wants to change TI > to DI for a vector register. However, the hook never gets called like this > from regcprop. regcprop > only asks whether it is ok to change (reg:TI r8) to (reg:DI r8) and that's > indeed ok. > > Before cprop we have: > > (insn 175 176 174 3 (set (reg/v:TI 16 %f0 [orig:69 __comp ] [69]) > (reg:TI 8 %r8)) -1 > (nil)) > > > (insn 155 124 156 3 (set (reg:DI 6 %r6 [ __comp ]) > (reg:DI 16 %f0)) 1277 {*movdi_64} > (nil)) > > > (insn 156 155 128 3 (set (reg:DI 7 %r7 [orig:69 __comp+8 ] [69]) > (unspec:DI [ > (reg:V2DI 16 %f0) > (const_int 1 [0x1]) > ] UNSPEC_VEC_EXTRACT)) 409 {*vec_extractv2di} > (expr_list:REG_DEAD (reg:V2DI 16 %f0) > (nil))) > > So a copy of reg pair r8/r9 is kept in v0==f0. The problem comes from cprop > assuming that (reg:DI > f0) refers to the low part of f0 and as a consequence replaces (reg:DI 16 > %f0) with (reg:DI 9 %r9) > what would be the DImode lowpart of (reg:TI r8) > > Insn 155 and 156 are the result of applying the following splitter: > > ; Split a VR -> GPR TImode move into 2 vector load GR from VR element. > ; For the higher order bits we do simply a DImode move while the > ; second part is done via vec extract. Both will end up as vlgvg. > (define_split > [(set (match_operand:TI 0 "register_operand" "") > (match_operand:TI 1 "register_operand" ""))] > "TARGET_VX && reload_completed > && GENERAL_REG_P (operands[0]) > && VECTOR_REG_P (operands[1])" > [(set (match_dup 2) (match_dup 4)) > (set (match_dup 3) (unspec:DI [(match_dup 5) (const_int 1)] > UNSPEC_VEC_EXTRACT))] > { > operands[2] = operand_subword (operands[0], 0, 0, TImode); > operands[3] = operand_subword (operands[0], 1, 0, TImode); > operands[4] = gen_rtx_REG (DImode, REGNO (operands[1])); > operands[5] = gen_rtx_REG (V2DImode, REGNO (operands[1])); > }) > > Introducing the (reg:DI 16 %f0) access to the TImode VR is something the > middle end is not expected > to do - because we prevent it in can_change_mode_class. However, I don't see > anything wrong with > doing that in the splitter. In our backend this is well-defined as being the > first element in the > vector register - the high part of the TImode vector register value. > > Unfortunately it confuses cprop :( > > Andreas > > > > > > > How does the stage1 fix, aka "rewrite" of cprop, look like? How can we > > be sure this hack isn't still present in 10 years from now? > > > > Thanks, > > Richard. > > > >> Bootstrapped and regression-tested on s390x. > >> > >> Ok? > >> > >> gcc/ChangeLog: > >> > >> * target.def (narrow_mode_refers_low_part_p): Add new target hook. > >> * config/s390/s390.c (s390_narrow_mode_refers_low_part_p): > >> Implement new target hook for IBM Z. > >> (TARGET_NARROW_MODE_REFERS_LOW_PART_P): New macro. > >> * regcprop.c (maybe_mode_change): Disable transformation depending > >> on the new target hook. > >> --- > >> gcc/config/s390/s390.c | 14 ++++++++++++++ > >> gcc/regcprop.c | 3 ++- > >> gcc/target.def | 12 +++++++++++- > >> 3 files changed, 27 insertions(+), 2 deletions(-) > >> > >> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c > >> index 056002e4a4a..aafc6d63be6 100644 > >> --- a/gcc/config/s390/s390.c > >> +++ b/gcc/config/s390/s390.c > >> @@ -10488,6 +10488,18 @@ s390_hard_regno_mode_ok (unsigned int regno, > >> machine_mode mode) > >> return false; > >> } > >> > >> +/* Implement TARGET_NARROW_MODE_REFERS_LOW_PART_P. */ > >> + > >> +static bool > >> +s390_narrow_mode_refers_low_part_p (unsigned int regno) > >> +{ > >> + if (reg_classes_intersect_p (VEC_REGS, REGNO_REG_CLASS (regno))) > >> + return false; > >> + > >> + return true; > >> +} > >> + > >> + > >> /* Implement TARGET_MODES_TIEABLE_P. */ > >> > >> static bool > >> @@ -17472,6 +17484,8 @@ s390_vectorize_vec_perm_const (machine_mode vmode, > >> rtx target, rtx op0, rtx op1, > >> #undef TARGET_VECTORIZE_VEC_PERM_CONST > >> #define TARGET_VECTORIZE_VEC_PERM_CONST s390_vectorize_vec_perm_const > >> > >> +#undef TARGET_NARROW_MODE_REFERS_LOW_PART_P > >> +#define TARGET_NARROW_MODE_REFERS_LOW_PART_P > >> s390_narrow_mode_refers_low_part_p > >> > >> struct gcc_target targetm = TARGET_INITIALIZER; > >> > >> diff --git a/gcc/regcprop.c b/gcc/regcprop.c > >> index 1a9bcf0a1ad..aaf94ad9b51 100644 > >> --- a/gcc/regcprop.c > >> +++ b/gcc/regcprop.c > >> @@ -426,7 +426,8 @@ maybe_mode_change (machine_mode orig_mode, > >> machine_mode copy_mode, > >> > >> if (orig_mode == new_mode) > >> return gen_raw_REG (new_mode, regno); > >> - else if (mode_change_ok (orig_mode, new_mode, regno)) > >> + else if (mode_change_ok (orig_mode, new_mode, regno) > >> + && targetm.narrow_mode_refers_low_part_p (regno)) > >> { > >> int copy_nregs = hard_regno_nregs (copy_regno, copy_mode); > >> int use_nregs = hard_regno_nregs (copy_regno, new_mode); > >> diff --git a/gcc/target.def b/gcc/target.def > >> index 8fd2533e90a..598eea501ff 100644 > >> --- a/gcc/target.def > >> +++ b/gcc/target.def > >> @@ -5446,6 +5446,16 @@ value that the middle-end intended.", > >> bool, (machine_mode from, machine_mode to, reg_class_t rclass), > >> hook_bool_mode_mode_reg_class_t_true) > >> > >> +/* This hook is used to work around a problem in regcprop. Hardcoded > >> +assumptions currently prevent it from working correctly for targets > >> +where the low part of a multi-word register doesn't align to accessing > >> +the register with a narrower mode. */ > >> +DEFHOOK_UNDOC > >> +(narrow_mode_refers_low_part_p, > >> +"", > >> +bool, (unsigned int regno), > >> +hook_bool_unit_true) > >> + > >> /* Change pseudo allocno class calculated by IRA. */ > >> DEFHOOK > >> (ira_change_pseudo_allocno_class, > >> @@ -5949,7 +5959,7 @@ register if floating point arithmetic is not being > >> done. As long as the\n\ > >> floating registers are not in class @code{GENERAL_REGS}, they will not\n\ > >> be used unless some pattern's constraint asks for one.", > >> bool, (unsigned int regno, machine_mode mode), > >> - hook_bool_uint_mode_true) > >> + hook_bool_uint_true) > >> > >> DEFHOOK > >> (modes_tieable_p, > >> -- > >> 2.33.1 > >> >