Re: [Patch, RTL] Eliminate redundant vec_select moves.

H.J. Lu Wed, 11 Dec 2013 08:35:21 -0800

On Wed, Dec 11, 2013 at 8:26 AM, Tejas Belagod <tbela...@arm.com> wrote:
> H.J. Lu wrote:
>>
>> On Wed, Dec 11, 2013 at 7:49 AM, Richard Sandiford
>> <rdsandif...@googlemail.com> wrote:
>>>
>>> "H.J. Lu" <hjl.to...@gmail.com> writes:
>>>>
>>>> On Wed, Dec 11, 2013 at 1:13 AM, Richard Sandiford
>>>> <rdsandif...@googlemail.com> wrote:
>>>>>
>>>>> Richard Henderson <r...@redhat.com> writes:
>>>>>>
>>>>>> On 12/10/2013 10:44 AM, Richard Sandiford wrote:
>>>>>>>
>>>>>>> Sorry, I don't understand.  I never said it was invalid.  I said
>>>>>>> (subreg:SF (reg:V4SF X) 1) was invalid if (reg:V4SF X) represents
>>>>>>> a single register.  On a little-endian target, the offset cannot be
>>>>>>> anything other than 0 in that case.
>>>>>>>
>>>>>>> So the CANNOT_CHANGE_MODE_CLASS code above seems to be checking for
>>>>>>> something that is always invalid, regardless of the target.  That kind
>>>>>>> of situation should be rejected by target-independent code instead.
>>>>>>
>>>>>> But, we want to disable the subreg before we know whether or not
>>>>>> (reg:V4SF X) will be allocated to a single hard register.  That is
>>>>>> something that we can't know in target-independent code before
>>>>>> register allocation.
>>>>>
>>>>> I was thinking that if we've got a class, we've also got things like
>>>>> CLASS_MAX_NREGS.  Maybe that doesn't cope with padding properly though.
>>>>> But even in the padding cases an offset-based check in C_C_M_C could
>>>>> be derived from other information.
>>>>>
>>>>> subreg_get_info handles padding with:
>>>>>
>>>>>       nregs_xmode = HARD_REGNO_NREGS_WITH_PADDING (xregno, xmode);
>>>>>       if (GET_MODE_INNER (xmode) == VOIDmode)
>>>>>         xmode_unit = xmode;
>>>>>       else
>>>>>         xmode_unit = GET_MODE_INNER (xmode);
>>>>>       gcc_assert (HARD_REGNO_NREGS_HAS_PADDING (xregno, xmode_unit));
>>>>>       gcc_assert (nregs_xmode
>>>>>                   == (GET_MODE_NUNITS (xmode)
>>>>>                       * HARD_REGNO_NREGS_WITH_PADDING (xregno, xmode_unit)));
>>>>>       gcc_assert (hard_regno_nregs[xregno][xmode]
>>>>>                   == (hard_regno_nregs[xregno][xmode_unit]
>>>>>                       * GET_MODE_NUNITS (xmode)));
>>>>>
>>>>>       /* You can only ask for a SUBREG of a value with holes in the middle
>>>>>          if you don't cross the holes.  (Such a SUBREG should be done by
>>>>>          picking a different register class, or doing it in memory if
>>>>>          necessary.)  An example of a value with holes is XCmode on 32-bit
>>>>>          x86 with -m128bit-long-double; it's represented in 6 32-bit registers,
>>>>>          3 for each part, but in memory it's two 128-bit parts.
>>>>>          Padding is assumed to be at the end (not necessarily the 'high part')
>>>>>          of each unit.  */
>>>>>       if ((offset / GET_MODE_SIZE (xmode_unit) + 1
>>>>>            < GET_MODE_NUNITS (xmode))
>>>>>           && (offset / GET_MODE_SIZE (xmode_unit)
>>>>>               != ((offset + GET_MODE_SIZE (ymode) - 1)
>>>>>                   / GET_MODE_SIZE (xmode_unit))))
>>>>>         {
>>>>>           info->representable_p = false;
>>>>>           rknown = true;
>>>>>         }
>>>>>
>>>>> and I wouldn't really want to force targets to individually reproduce
>>>>> that kind of logic at the class level.  If the worst comes to the worst
>>>>> we could cache the difficult cases.
>>>>>
>>>> My case is x86 CANNOT_CHANGE_MODE_CLASS only needs
>>>> to know if the subreg byte is zero or not.  It doesn't care about mode
>>>> padding.  You are concerned about information passed to
>>>> CANNOT_CHANGE_MODE_CLASS is too expensive for target
>>>> to process.  It isn't the case for x86.
>>>
>>> No, I'm concerned that by going this route, we're forcing every target
>>> (or at least every target with wider-than-word registers, which is most
>>> of the common ones) to implement the same target-independent restriction.
>>> This is not an x86-specific issue.
>>>
>>
>> So you prefer a generic solution which makes CANNOT_CHANGE_MODE_CLASS
>> return true for vector mode subreg if subreg byte != 0. Is this correct?
>
>
> Do you mean a generic solution for C_C_M_C to return true for non-zero
> byte_offset vector subregs in the context of x86?
>
> I want to clarify because in the context of 32-bit ARM little-endian, a
> non-zero byte-offset vector subreg is still a valid full hardreg. eg. for
>
>    (subreg:DI (reg:V4SF) 8)
>
> C_C_M_C can return 'false' as this can be resolved to a full D-reg.
>


Does that mean subreg byte interpretation is endian-dependent?
Both little endian

(subreg:DI (reg:V4SF) 0)

and big endian

(subreg:DI (reg:V4SF) MAX_BITSIZE_MODE_ANY_MODE / BITS_PER_UNIT)

refer to the same lower 64 bits of reg:V4SF.  Is this correct?
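
To make the question concrete, here is a toy model, not GCC code.  The
function name toy_lowpart_offset is hypothetical, and the sketch assumes
that the subreg byte is a memory-order byte offset and that word and byte
endianness agree.  Under those assumptions the low 64 bits of a 16-byte
V4SF would sit at byte 0 on little endian and at byte 16 - 8 = 8 on big
endian:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not GCC code.  Return the byte offset of the low-order
   OUTER_SIZE bytes of an INNER_SIZE-byte value.

   Assumptions (not taken from the thread): offsets count bytes in
   memory order, and word endianness matches byte endianness.  */
static unsigned int
toy_lowpart_offset (unsigned int outer_size, unsigned int inner_size,
                    bool big_endian)
{
  /* Only narrowing (or same-size) views are modelled here.  */
  assert (outer_size <= inner_size);

  /* Little endian: the low part comes first in memory.
     Big endian: the low part comes last, so skip the high bytes.  */
  return big_endian ? inner_size - outer_size : 0;
}
```

With DImode as 8 bytes and V4SFmode as 16 bytes, calling
toy_lowpart_offset (8, 16, false) and toy_lowpart_offset (8, 16, true)
shows how the same low 64 bits could get different byte offsets
depending on endianness.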

-- 
H.J.
