On Fri, Jan 14, 2022 at 11:42 AM Andreas Krebbel via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On 1/14/22 08:37, Richard Biener wrote:
> ...
> > Can the gist of this bug be put into the GCC bugzilla so the rev can
> > refer to it?
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104034
>
> > Can we have a testcase even?
> The testcase from Jakub is in the BZ. However, since it doesn't fail with 
> head I didn't try to
> include it in my patch.

So I also noticed there was PR 101260 which had the exact same
analysis as this patch with respect to s390 and TI mode splitting and
regcprop.
So I marked PR 104034 as a duplicate of PR 101260.

Thanks,
Andrew

>
> > I'm not quite understanding the problem but is it that, say,
> >
> >  (subreg:DI (reg:V2DI ..) 0)
> >
> > isn't the same as
> >
> >  (lowpart:DI (reg:V2DI ...) 0)
>
> (reg:DI v0) does not match the lower order bits of (reg:TI v0)
>
> > ?  The regcprop code looks more like asking whether the larger reg
> > is a composition of multiple other hardregs and will return the specific
> > hardreg corresponding to the lowpart - so like if on s390 the vector
> > registers overlap with some other regset.  But then doing the actual
> > accesses via the other regset regs doesn't actually work?  Isn't the
> > backend then lying to us (aka the mode_change_ok returns the
> > wrong answer)?
>
> can_change_mode_class should do the right thing. We return false in case 
> somebody wants to change TI
> to DI for a vector register. However, the hook never gets called like this 
> from regcprop. regcprop
> only asks whether it is ok to change (reg:TI r8) to (reg:DI r8) and that's 
> indeed ok.
>
> Before cprop we have:
>
> (insn 175 176 174 3 (set (reg/v:TI 16 %f0 [orig:69 __comp ] [69])
>         (reg:TI 8 %r8)) -1
>      (nil))
>
>
> (insn 155 124 156 3 (set (reg:DI 6 %r6 [ __comp ])
>         (reg:DI 16 %f0)) 1277 {*movdi_64}
>      (nil))
>
>
> (insn 156 155 128 3 (set (reg:DI 7 %r7 [orig:69 __comp+8 ] [69])
>         (unspec:DI [
>                 (reg:V2DI 16 %f0)
>                 (const_int 1 [0x1])
>             ] UNSPEC_VEC_EXTRACT)) 409 {*vec_extractv2di}
>      (expr_list:REG_DEAD (reg:V2DI 16 %f0)
>         (nil)))
>
> So a copy of reg pair r8/r9 is kept in v0==f0. The problem comes from cprop 
> assuming that (reg:DI
> f0) refers to the low part of f0 and as a consequence replaces (reg:DI 16 
> %f0) with (reg:DI 9 %r9)
> what would be the DImode lowpart of (reg:TI r8)
>
> Insn 155 and 156 are the result of applying the following splitter:
>
> ; Split a VR -> GPR TImode move into 2 vector load GR from VR element.
> ; For the higher order bits we do simply a DImode move while the
> ; second part is done via vec extract.  Both will end up as vlgvg.
> (define_split
>   [(set (match_operand:TI 0 "register_operand" "")
>         (match_operand:TI 1 "register_operand" ""))]
>   "TARGET_VX && reload_completed
>    && GENERAL_REG_P (operands[0])
>    && VECTOR_REG_P (operands[1])"
>   [(set (match_dup 2) (match_dup 4))
>    (set (match_dup 3) (unspec:DI [(match_dup 5) (const_int 1)]
>                                  UNSPEC_VEC_EXTRACT))]
> {
>   operands[2] = operand_subword (operands[0], 0, 0, TImode);
>   operands[3] = operand_subword (operands[0], 1, 0, TImode);
>   operands[4] = gen_rtx_REG (DImode, REGNO (operands[1]));
>   operands[5] = gen_rtx_REG (V2DImode, REGNO (operands[1]));
> })
>
> Introducing the (reg:DI 16 %f0) access to the TImode VR is something the 
> middle end is not expected
> to do - because we prevent it in can_change_mode_class. However, I don't see 
> anything wrong with
> doing that in the splitter. In our backend this is well-defined as being the 
> first element in the
> vector register - the high part of the TImode vector register value.
>
> Unfortunately it confuses cprop :(
>
> Andreas
>
>
>
> >
> > How does the stage1 fix, aka "rewrite" of cprop, look like?  How can we
> > be sure this hack isn't still present in 10 years from now?
> >
> > Thanks,
> > Richard.
> >
> >> Bootstrapped and regression-tested on s390x.
> >>
> >> Ok?
> >>
> >> gcc/ChangeLog:
> >>
> >>         * target.def (narrow_mode_refers_low_part_p): Add new target hook.
> >>         * config/s390/s390.c (s390_narrow_mode_refers_low_part_p):
> >>         Implement new target hook for IBM Z.
> >>         (TARGET_NARROW_MODE_REFERS_LOW_PART_P): New macro.
> >>         * regcprop.c (maybe_mode_change): Disable transformation depending
> >>         on the new target hook.
> >> ---
> >>  gcc/config/s390/s390.c | 14 ++++++++++++++
> >>  gcc/regcprop.c         |  3 ++-
> >>  gcc/target.def         | 12 +++++++++++-
> >>  3 files changed, 27 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
> >> index 056002e4a4a..aafc6d63be6 100644
> >> --- a/gcc/config/s390/s390.c
> >> +++ b/gcc/config/s390/s390.c
> >> @@ -10488,6 +10488,18 @@ s390_hard_regno_mode_ok (unsigned int regno, 
> >> machine_mode mode)
> >>    return false;
> >>  }
> >>
> >> +/* Implement TARGET_NARROW_MODE_REFERS_LOW_PART_P.  */
> >> +
> >> +static bool
> >> +s390_narrow_mode_refers_low_part_p (unsigned int regno)
> >> +{
> >> +  if (reg_classes_intersect_p (VEC_REGS, REGNO_REG_CLASS (regno)))
> >> +    return false;
> >> +
> >> +  return true;
> >> +}
> >> +
> >> +
> >>  /* Implement TARGET_MODES_TIEABLE_P.  */
> >>
> >>  static bool
> >> @@ -17472,6 +17484,8 @@ s390_vectorize_vec_perm_const (machine_mode vmode, 
> >> rtx target, rtx op0, rtx op1,
> >>  #undef TARGET_VECTORIZE_VEC_PERM_CONST
> >>  #define TARGET_VECTORIZE_VEC_PERM_CONST s390_vectorize_vec_perm_const
> >>
> >> +#undef TARGET_NARROW_MODE_REFERS_LOW_PART_P
> >> +#define TARGET_NARROW_MODE_REFERS_LOW_PART_P 
> >> s390_narrow_mode_refers_low_part_p
> >>
> >>  struct gcc_target targetm = TARGET_INITIALIZER;
> >>
> >> diff --git a/gcc/regcprop.c b/gcc/regcprop.c
> >> index 1a9bcf0a1ad..aaf94ad9b51 100644
> >> --- a/gcc/regcprop.c
> >> +++ b/gcc/regcprop.c
> >> @@ -426,7 +426,8 @@ maybe_mode_change (machine_mode orig_mode, 
> >> machine_mode copy_mode,
> >>
> >>    if (orig_mode == new_mode)
> >>      return gen_raw_REG (new_mode, regno);
> >> -  else if (mode_change_ok (orig_mode, new_mode, regno))
> >> +  else if (mode_change_ok (orig_mode, new_mode, regno)
> >> +          && targetm.narrow_mode_refers_low_part_p (regno))
> >>      {
> >>        int copy_nregs = hard_regno_nregs (copy_regno, copy_mode);
> >>        int use_nregs = hard_regno_nregs (copy_regno, new_mode);
> >> diff --git a/gcc/target.def b/gcc/target.def
> >> index 8fd2533e90a..598eea501ff 100644
> >> --- a/gcc/target.def
> >> +++ b/gcc/target.def
> >> @@ -5446,6 +5446,16 @@ value that the middle-end intended.",
> >>   bool, (machine_mode from, machine_mode to, reg_class_t rclass),
> >>   hook_bool_mode_mode_reg_class_t_true)
> >>
> >> +/* This hook is used to work around a problem in regcprop. Hardcoded
> >> +assumptions currently prevent it from working correctly for targets
> >> +where the low part of a multi-word register doesn't align to accessing
> >> +the register with a narrower mode.  */
> >> +DEFHOOK_UNDOC
> >> +(narrow_mode_refers_low_part_p,
> >> +"",
> >> +bool, (unsigned int regno),
> >> +hook_bool_unit_true)
> >> +
> >>  /* Change pseudo allocno class calculated by IRA.  */
> >>  DEFHOOK
> >>  (ira_change_pseudo_allocno_class,
> >> @@ -5949,7 +5959,7 @@ register if floating point arithmetic is not being 
> >> done.  As long as the\n\
> >>  floating registers are not in class @code{GENERAL_REGS}, they will not\n\
> >>  be used unless some pattern's constraint asks for one.",
> >>   bool, (unsigned int regno, machine_mode mode),
> >> - hook_bool_uint_mode_true)
> >> + hook_bool_uint_true)
> >>
> >>  DEFHOOK
> >>  (modes_tieable_p,
> >> --
> >> 2.33.1
> >>
>

Reply via email to