On 1/14/22 08:37, Richard Biener wrote:
...
> Can the gist of this bug be put into the GCC bugzilla so the rev can
> refer to it? 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104034

> Can we have a testcase even?
The testcase from Jakub is in the BZ. However, since it doesn't fail with head 
I didn't try to
include it in my patch.

> I'm not quite understanding the problem but is it that, say,
> 
>  (subreg:DI (reg:V2DI ..) 0)
> 
> isn't the same as
> 
>  (lowpart:DI (reg:V2DI ...) 0)

(reg:DI v0) does not match the lower order bits of (reg:TI v0)

> ?  The regcprop code looks more like asking whether the larger reg
> is a composition of multiple other hardregs and will return the specific
> hardreg corresponding to the lowpart - so like if on s390 the vector
> registers overlap with some other regset.  But then doing the actual
> accesses via the other regset regs doesn't actually work?  Isn't the
> backend then lying to us (aka the mode_change_ok returns the
> wrong answer)?

can_change_mode_class should do the right thing. We return false in case 
somebody wants to change TI
to DI for a vector register. However, the hook never gets called like this from 
regcprop. regcprop
only asks whether it is ok to change (reg:TI r8) to (reg:DI r8) and that's 
indeed ok.

Before cprop we have:

(insn 175 176 174 3 (set (reg/v:TI 16 %f0 [orig:69 __comp ] [69])
        (reg:TI 8 %r8)) -1
     (nil))


(insn 155 124 156 3 (set (reg:DI 6 %r6 [ __comp ])
        (reg:DI 16 %f0)) 1277 {*movdi_64}
     (nil))


(insn 156 155 128 3 (set (reg:DI 7 %r7 [orig:69 __comp+8 ] [69])
        (unspec:DI [
                (reg:V2DI 16 %f0)
                (const_int 1 [0x1])
            ] UNSPEC_VEC_EXTRACT)) 409 {*vec_extractv2di}
     (expr_list:REG_DEAD (reg:V2DI 16 %f0)
        (nil)))

So a copy of reg pair r8/r9 is kept in v0==f0. The problem comes from cprop 
assuming that (reg:DI
f0) refers to the low part of f0 and as a consequence replaces (reg:DI 16 %f0) 
with (reg:DI 9 %r9)
what would be the DImode lowpart of (reg:TI r8)

Insn 155 and 156 are the result of applying the following splitter:

; Split a VR -> GPR TImode move into 2 vector load GR from VR element.
; For the higher order bits we do simply a DImode move while the
; second part is done via vec extract.  Both will end up as vlgvg.
(define_split
  [(set (match_operand:TI 0 "register_operand" "")
        (match_operand:TI 1 "register_operand" ""))]
  "TARGET_VX && reload_completed
   && GENERAL_REG_P (operands[0])
   && VECTOR_REG_P (operands[1])"
  [(set (match_dup 2) (match_dup 4))
   (set (match_dup 3) (unspec:DI [(match_dup 5) (const_int 1)]
                                 UNSPEC_VEC_EXTRACT))]
{
  operands[2] = operand_subword (operands[0], 0, 0, TImode);
  operands[3] = operand_subword (operands[0], 1, 0, TImode);
  operands[4] = gen_rtx_REG (DImode, REGNO (operands[1]));
  operands[5] = gen_rtx_REG (V2DImode, REGNO (operands[1]));
})

Introducing the (reg:DI 16 %f0) access to the TImode VR is something the middle 
end is not expected
to do - because we prevent it in can_change_mode_class. However, I don't see 
anything wrong with
doing that in the splitter. In our backend this is well-defined as being the 
first element in the
vector register - the high part of the TImode vector register value.

Unfortunately it confuses cprop :(

Andreas



> 
> How does the stage1 fix, aka "rewrite" of cprop, look like?  How can we
> be sure this hack isn't still present in 10 years from now?
> 
> Thanks,
> Richard.
> 
>> Bootstrapped and regression-tested on s390x.
>>
>> Ok?
>>
>> gcc/ChangeLog:
>>
>>         * target.def (narrow_mode_refers_low_part_p): Add new target hook.
>>         * config/s390/s390.c (s390_narrow_mode_refers_low_part_p):
>>         Implement new target hook for IBM Z.
>>         (TARGET_NARROW_MODE_REFERS_LOW_PART_P): New macro.
>>         * regcprop.c (maybe_mode_change): Disable transformation depending
>>         on the new target hook.
>> ---
>>  gcc/config/s390/s390.c | 14 ++++++++++++++
>>  gcc/regcprop.c         |  3 ++-
>>  gcc/target.def         | 12 +++++++++++-
>>  3 files changed, 27 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
>> index 056002e4a4a..aafc6d63be6 100644
>> --- a/gcc/config/s390/s390.c
>> +++ b/gcc/config/s390/s390.c
>> @@ -10488,6 +10488,18 @@ s390_hard_regno_mode_ok (unsigned int regno, 
>> machine_mode mode)
>>    return false;
>>  }
>>
>> +/* Implement TARGET_NARROW_MODE_REFERS_LOW_PART_P.  */
>> +
>> +static bool
>> +s390_narrow_mode_refers_low_part_p (unsigned int regno)
>> +{
>> +  if (reg_classes_intersect_p (VEC_REGS, REGNO_REG_CLASS (regno)))
>> +    return false;
>> +
>> +  return true;
>> +}
>> +
>> +
>>  /* Implement TARGET_MODES_TIEABLE_P.  */
>>
>>  static bool
>> @@ -17472,6 +17484,8 @@ s390_vectorize_vec_perm_const (machine_mode vmode, 
>> rtx target, rtx op0, rtx op1,
>>  #undef TARGET_VECTORIZE_VEC_PERM_CONST
>>  #define TARGET_VECTORIZE_VEC_PERM_CONST s390_vectorize_vec_perm_const
>>
>> +#undef TARGET_NARROW_MODE_REFERS_LOW_PART_P
>> +#define TARGET_NARROW_MODE_REFERS_LOW_PART_P 
>> s390_narrow_mode_refers_low_part_p
>>
>>  struct gcc_target targetm = TARGET_INITIALIZER;
>>
>> diff --git a/gcc/regcprop.c b/gcc/regcprop.c
>> index 1a9bcf0a1ad..aaf94ad9b51 100644
>> --- a/gcc/regcprop.c
>> +++ b/gcc/regcprop.c
>> @@ -426,7 +426,8 @@ maybe_mode_change (machine_mode orig_mode, machine_mode 
>> copy_mode,
>>
>>    if (orig_mode == new_mode)
>>      return gen_raw_REG (new_mode, regno);
>> -  else if (mode_change_ok (orig_mode, new_mode, regno))
>> +  else if (mode_change_ok (orig_mode, new_mode, regno)
>> +          && targetm.narrow_mode_refers_low_part_p (regno))
>>      {
>>        int copy_nregs = hard_regno_nregs (copy_regno, copy_mode);
>>        int use_nregs = hard_regno_nregs (copy_regno, new_mode);
>> diff --git a/gcc/target.def b/gcc/target.def
>> index 8fd2533e90a..598eea501ff 100644
>> --- a/gcc/target.def
>> +++ b/gcc/target.def
>> @@ -5446,6 +5446,16 @@ value that the middle-end intended.",
>>   bool, (machine_mode from, machine_mode to, reg_class_t rclass),
>>   hook_bool_mode_mode_reg_class_t_true)
>>
>> +/* This hook is used to work around a problem in regcprop. Hardcoded
>> +assumptions currently prevent it from working correctly for targets
>> +where the low part of a multi-word register doesn't align to accessing
>> +the register with a narrower mode.  */
>> +DEFHOOK_UNDOC
>> +(narrow_mode_refers_low_part_p,
>> +"",
>> +bool, (unsigned int regno),
>> +hook_bool_unit_true)
>> +
>>  /* Change pseudo allocno class calculated by IRA.  */
>>  DEFHOOK
>>  (ira_change_pseudo_allocno_class,
>> @@ -5949,7 +5959,7 @@ register if floating point arithmetic is not being 
>> done.  As long as the\n\
>>  floating registers are not in class @code{GENERAL_REGS}, they will not\n\
>>  be used unless some pattern's constraint asks for one.",
>>   bool, (unsigned int regno, machine_mode mode),
>> - hook_bool_uint_mode_true)
>> + hook_bool_uint_true)
>>
>>  DEFHOOK
>>  (modes_tieable_p,
>> --
>> 2.33.1
>>

Reply via email to