Re: [PATCH V2] rs6000: New pass for replacement of adjacent loads fusion (lxv).

Richard Sandiford Wed, 14 Feb 2024 12:51:36 -0800

Ajit Agarwal <aagar...@linux.ibm.com> writes:
> Hello Richard:
>
>
> On 14/02/24 10:45 pm, Richard Sandiford wrote:
>> Ajit Agarwal <aagar...@linux.ibm.com> writes:
>>>>> diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
>>>>> index 1856fa4884f..ffc47a6eaa0 100644
>>>>> --- a/gcc/emit-rtl.cc
>>>>> +++ b/gcc/emit-rtl.cc
>>>>> @@ -921,7 +921,7 @@ validate_subreg (machine_mode omode, machine_mode imode,
>>>>>      return false;
>>>>>  
>>>>>    /* The subreg offset cannot be outside the inner object.  */
>>>>> -  if (maybe_ge (offset, isize))
>>>>> +  if (maybe_gt (offset, isize))
>>>>>      return false;
>>>>
>>>> Can you explain why this change is needed?
>>>>
>>>
>>> This is required in rs6000 target where we generate the subreg
>>> with offset 16 from OO mode (256 bit) to 128 bit vector modes.
>>> Otherwise it segfaults.
>> 
>> Could you go into more detail?  Why does that subreg lead to a segfault?
>> 
>> In itself, a 16-byte subreg at byte offset 16 into a 32-byte pair is pretty
>> standard.  AArch64 uses this too for its vector load/store pairs (and for
>> structure pairs more generally).
>> 
>
> If we want to create (subreg V16QI (reg OO R) 16), imode is V16QI (isize = 16)
> and offset is 16.  maybe_ge (offset, isize) returns true and validate_subreg
> returns false.


isize is supposed to be the size of the "inner mode", which in this
case is OO.  Since OO is a 32-byte (256-bit) mode, I would expect isize
to be 32 rather than 16.  Is that not the case?
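
To make the arithmetic concrete, here is a rough standalone sketch of
just that one bounds check.  It is not GCC code: the function name and
the two constants are made up for illustration, plain integers stand in
for poly_uint64, and the real validate_subreg does many other tests.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the single check under discussion.  For
// compile-time-constant sizes, "maybe_ge (offset, isize)" reduces to
// "offset >= isize", so the subreg passes this test iff offset < isize.
static bool
offset_inside_inner (std::uint64_t offset, std::uint64_t isize)
{
  return offset < isize;
}

// Assumed sizes in bytes: OO is 256 bits, V16QI is 128 bits.
constexpr std::uint64_t OO_BYTES = 32;
constexpr std::uint64_t V16QI_BYTES = 16;

static void
check_examples ()
{
  // Inner mode OO: byte offset 16 selects the second half and is fine.
  assert (offset_inside_inner (16, OO_BYTES));
  // If isize were taken from V16QI instead, the same offset is rejected.
  assert (!offset_inside_inner (16, V16QI_BYTES));
  // An offset equal to the inner size is always outside the object.
  assert (!offset_inside_inner (32, OO_BYTES));
}
```

With isize = 32 the existing maybe_ge test already accepts offset 16,
so under these assumptions the check would only fail if isize were
being computed from the wrong mode.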

Or is the problem that something is trying to take a subreg of a subreg?
If so, that is only valid in certain cases.  It isn't for example valid
to use a subreg operation to move between (subreg:V16QI (reg:OO X) 16)
and (subreg:V16QI (reg:OO X) 0).

Thanks,
Richard
