On 1/14/22 08:37, Richard Biener wrote: ... > Can the gist of this bug be put into the GCC bugzilla so the rev can > refer to it? https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104034
> Can we have a testcase even? The testcase from Jakub is in the BZ. However, since it doesn't fail with head I didn't try to include it in my patch. > I'm not quite understanding the problem but is it that, say, > > (subreg:DI (reg:V2DI ..) 0) > > isn't the same as > > (lowpart:DI (reg:V2DI ...) 0) (reg:DI v0) does not match the lower order bits of (reg:TI v0) > ? The regcprop code looks more like asking whether the larger reg > is a composition of multiple other hardregs and will return the specific > hardreg corresponding to the lowpart - so like if on s390 the vector > registers overlap with some other regset. But then doing the actual > accesses via the other regset regs doesn't actually work? Isn't the > backend then lying to us (aka the mode_change_ok returns the > wrong answer)? can_change_mode_class should do the right thing. We return false in case somebody wants to change TI to DI for a vector register. However, the hook never gets called like this from regcprop. regcprop only asks whether it is ok to change (reg:TI r8) to (reg:DI r8) and that's indeed ok. Before cprop we have: (insn 175 176 174 3 (set (reg/v:TI 16 %f0 [orig:69 __comp ] [69]) (reg:TI 8 %r8)) -1 (nil)) (insn 155 124 156 3 (set (reg:DI 6 %r6 [ __comp ]) (reg:DI 16 %f0)) 1277 {*movdi_64} (nil)) (insn 156 155 128 3 (set (reg:DI 7 %r7 [orig:69 __comp+8 ] [69]) (unspec:DI [ (reg:V2DI 16 %f0) (const_int 1 [0x1]) ] UNSPEC_VEC_EXTRACT)) 409 {*vec_extractv2di} (expr_list:REG_DEAD (reg:V2DI 16 %f0) (nil))) So a copy of reg pair r8/r9 is kept in v0==f0. The problem comes from cprop assuming that (reg:DI f0) refers to the low part of f0 and as a consequence replaces (reg:DI 16 %f0) with (reg:DI 9 %r9) what would be the DImode lowpart of (reg:TI r8) Insn 155 and 156 are the result of applying the following splitter: ; Split a VR -> GPR TImode move into 2 vector load GR from VR element. ; For the higher order bits we do simply a DImode move while the ; second part is done via vec extract. Both will end up as vlgvg. (define_split [(set (match_operand:TI 0 "register_operand" "") (match_operand:TI 1 "register_operand" ""))] "TARGET_VX && reload_completed && GENERAL_REG_P (operands[0]) && VECTOR_REG_P (operands[1])" [(set (match_dup 2) (match_dup 4)) (set (match_dup 3) (unspec:DI [(match_dup 5) (const_int 1)] UNSPEC_VEC_EXTRACT))] { operands[2] = operand_subword (operands[0], 0, 0, TImode); operands[3] = operand_subword (operands[0], 1, 0, TImode); operands[4] = gen_rtx_REG (DImode, REGNO (operands[1])); operands[5] = gen_rtx_REG (V2DImode, REGNO (operands[1])); }) Introducing the (reg:DI 16 %f0) access to the TImode VR is something the middle end is not expected to do - because we prevent it in can_change_mode_class. However, I don't see anything wrong with doing that in the splitter. In our backend this is well-defined as being the first element in the vector register - the high part of the TImode vector register value. Unfortunately it confuses cprop :( Andreas > > How does the stage1 fix, aka "rewrite" of cprop, look like? How can we > be sure this hack isn't still present in 10 years from now? > > Thanks, > Richard. > >> Bootstrapped and regression-tested on s390x. >> >> Ok? >> >> gcc/ChangeLog: >> >> * target.def (narrow_mode_refers_low_part_p): Add new target hook. >> * config/s390/s390.c (s390_narrow_mode_refers_low_part_p): >> Implement new target hook for IBM Z. >> (TARGET_NARROW_MODE_REFERS_LOW_PART_P): New macro. >> * regcprop.c (maybe_mode_change): Disable transformation depending >> on the new target hook. >> --- >> gcc/config/s390/s390.c | 14 ++++++++++++++ >> gcc/regcprop.c | 3 ++- >> gcc/target.def | 12 +++++++++++- >> 3 files changed, 27 insertions(+), 2 deletions(-) >> >> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c >> index 056002e4a4a..aafc6d63be6 100644 >> --- a/gcc/config/s390/s390.c >> +++ b/gcc/config/s390/s390.c >> @@ -10488,6 +10488,18 @@ s390_hard_regno_mode_ok (unsigned int regno, >> machine_mode mode) >> return false; >> } >> >> +/* Implement TARGET_NARROW_MODE_REFERS_LOW_PART_P. */ >> + >> +static bool >> +s390_narrow_mode_refers_low_part_p (unsigned int regno) >> +{ >> + if (reg_classes_intersect_p (VEC_REGS, REGNO_REG_CLASS (regno))) >> + return false; >> + >> + return true; >> +} >> + >> + >> /* Implement TARGET_MODES_TIEABLE_P. */ >> >> static bool >> @@ -17472,6 +17484,8 @@ s390_vectorize_vec_perm_const (machine_mode vmode, >> rtx target, rtx op0, rtx op1, >> #undef TARGET_VECTORIZE_VEC_PERM_CONST >> #define TARGET_VECTORIZE_VEC_PERM_CONST s390_vectorize_vec_perm_const >> >> +#undef TARGET_NARROW_MODE_REFERS_LOW_PART_P >> +#define TARGET_NARROW_MODE_REFERS_LOW_PART_P >> s390_narrow_mode_refers_low_part_p >> >> struct gcc_target targetm = TARGET_INITIALIZER; >> >> diff --git a/gcc/regcprop.c b/gcc/regcprop.c >> index 1a9bcf0a1ad..aaf94ad9b51 100644 >> --- a/gcc/regcprop.c >> +++ b/gcc/regcprop.c >> @@ -426,7 +426,8 @@ maybe_mode_change (machine_mode orig_mode, machine_mode >> copy_mode, >> >> if (orig_mode == new_mode) >> return gen_raw_REG (new_mode, regno); >> - else if (mode_change_ok (orig_mode, new_mode, regno)) >> + else if (mode_change_ok (orig_mode, new_mode, regno) >> + && targetm.narrow_mode_refers_low_part_p (regno)) >> { >> int copy_nregs = hard_regno_nregs (copy_regno, copy_mode); >> int use_nregs = hard_regno_nregs (copy_regno, new_mode); >> diff --git a/gcc/target.def b/gcc/target.def >> index 8fd2533e90a..598eea501ff 100644 >> --- a/gcc/target.def >> +++ b/gcc/target.def >> @@ -5446,6 +5446,16 @@ value that the middle-end intended.", >> bool, (machine_mode from, machine_mode to, reg_class_t rclass), >> hook_bool_mode_mode_reg_class_t_true) >> >> +/* This hook is used to work around a problem in regcprop. Hardcoded >> +assumptions currently prevent it from working correctly for targets >> +where the low part of a multi-word register doesn't align to accessing >> +the register with a narrower mode. */ >> +DEFHOOK_UNDOC >> +(narrow_mode_refers_low_part_p, >> +"", >> +bool, (unsigned int regno), >> +hook_bool_unit_true) >> + >> /* Change pseudo allocno class calculated by IRA. */ >> DEFHOOK >> (ira_change_pseudo_allocno_class, >> @@ -5949,7 +5959,7 @@ register if floating point arithmetic is not being >> done. As long as the\n\ >> floating registers are not in class @code{GENERAL_REGS}, they will not\n\ >> be used unless some pattern's constraint asks for one.", >> bool, (unsigned int regno, machine_mode mode), >> - hook_bool_uint_mode_true) >> + hook_bool_uint_true) >> >> DEFHOOK >> (modes_tieable_p, >> -- >> 2.33.1 >>