Re: [trunk] Addition to subreg section of rtl.text.

Richard Sandiford Thu, 20 Mar 2008 10:30:19 -0700

Joern Rennecke <[EMAIL PROTECTED]> writes:
> On Thu, Mar 20, 2008 at 10:39:47AM +0000, Richard Sandiford wrote:
>> you're saying that, for any valid values of M and X:
>> 
>>   (set (subreg:M (reg:N ...) X) (const_int 0))
>> 
>> does not guarantee that (subreg:M (reg:N ...) ...) has the value 0
>> if N is a partial mode?
>
> Yes.  Although it will be more common for 1 bits to change to zero, some bits
> might actually differ between successive reads, when these bits are status
> flags.  AFAICR there was even some processor that had a level-sensitive
> I/O port bit mapped in its status register.


Well, the part you've quoted was only an example of the earlier,
more general statement:

   ... partial modes behave "as if" their widths were rounded up to the
   next word boundary, but that an unspecified collection of bits in the
   extended width will read as undefined

which I think covers your case too.  It was more the general statement
that I was interested in, because it would form the basis of an updated
rtl.texi patch.

>> > I think some of the rules are overly restrictive, and prevent gcc
>> > from achieving its full potential for generating efficient code.
>> > Moreover, if a port has an extv / insv pattern that matches in mode with
>> > the wide registers, it can legitimately use the zero_extract route.  It's
>> > reload that contradicts the documentation in changing registers into MEMs
>> > and thus creating zero_extracts from wide MEMs.
>> 
>> It sounds like you might be referring to both the subreg and extract
>> documentation here.  As far as the subreg documentation goes,
>> let's assume that what I said above about partial modes is right
>> (you'll have already corrected me by now if not).  If we change the
>> rules to say that, what do you think is still overly restrictive?
>
> - zero_extract officially only allowed for a specific mode.
> - nested subregs not allowed, but neither are all subregs that
>   would result from substituting a subreg into an inner reg of another
>   subreg and simplifying allowed.
> - highpart subregs not allowed (e.g. consider SH64 floating point registers:
>   word_mode is 64, but the floating point registers are 32 bit.  How do
>   you refer to the high part of a DFmode value, considering that the
>   inner reg might be allocated to a floating value.  (Actually, generally
>   want such an allocation).)

Sorry, my question was unclear.  I really meant: what in the proposed
rtl.texi rules is too restrictive _given what the current code is
supposed to allow_?  We're trying to write down what is currently
folklore, and we're trying to figure out if we've expressed the
folklore correctly.  Whereas the things you've listed above are not
supported by the current code; they're possible future extensions.

>> E.g. one possibility would be to drop:
>> 
>>     If @var{reg} is a hard register, the @code{subreg} must also represent
>>     the lowpart of a particular hard register, or represent one or more
>>     complete hard registers.
>> 
>> and instead say that the word-based semantics for pseudo registers also
>> apply to hard registers, regardless of the number of hard registers in
>> the inner register.  This would in some ways be simpler.
>
> Yes.  If subword-writing semantics are wanted, and the SUBREG represents
> an actual hard register, then the port has to use a proper hard reg instead.
> And it makes semantics much saner when we do register allocation for a
> pseudo where we don't know the register size to start with.

Thanks.  So in summary, it's OK for these rules in the proposed
rtl.texi patch:

    When @var{m1} is at least as narrow as @var{m2} the @code{subreg}
    expression is called @dfn{normal}.

    Normal subregs restrict consideration to certain bits of @var{reg}.
    There are two cases.  If @var{m1} is smaller than a word, the
    @code{subreg} refers to the least-significant part (or @dfn{lowpart})
    of one word of @var{reg}.  If @var{m1} is word-sized or greater, the
    @code{subreg} refers to one or more complete words.

    When @var{m2} is larger than a word, the subreg is a @dfn{multi-word
    outer subreg}.  When used as an lvalue, @code{subreg} is a word-based
    accessor.  Storing to a @code{subreg} modifies all the words of
    @var{reg} that overlap the @code{subreg}, but it leaves the other
    words of @var{reg} alone.

    When storing to a normal @code{subreg} that is smaller than a word,
    the other bits of the referenced word are usually left in an undefined
    state.  This laxity makes it easier to generate efficient code for
    such instructions.  To represent an instruction that preserves all the
    bits outside of those in the @code{subreg}, use @code{strict_low_part}
    or @code{zero_extract} around the @code{subreg}.

to apply to hard registers as well as pseudos, without the additional
restriction:

    If @var{reg} is a hard register, the @code{subreg} must also represent
    the lowpart of a particular hard register, or represent one or more
    complete hard registers.

?  We could instead say something like:

    @cindex @code{CANNOT_CHANGE_MODE_CLASS} and subreg semantics
    These rules apply to both pseudo inner registers and hard inner
    registers.  If the semantics are not correct for particular
    combinations of @var{m1}, @var{m2} and hard @var{reg},
    the target-support code must ensure that those combinations
    are never used.  For example:

    @smallexample
    CANNOT_CHANGE_MODE_CLASS (@var{m2}, @var{m1}, @var{class})
    @end smallexample

    must be true for every class @var{class} that includes @var{reg}.
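
To make the word-based store rule in the proposed text concrete, here is
a rough Python model of it.  This is only a sketch under my own
assumptions, not anything in gcc: the function name subreg_store_effect
is hypothetical, sizes and offsets are in bytes, the word size defaults
to 4 bytes, and words are numbered by byte offset (so endianness and the
exact lowpart position are ignored).

```python
def subreg_store_effect(inner_bytes, outer_bytes, byte_offset, word_bytes=4):
    """Model a store to a normal subreg (hypothetical helper, not a gcc API).

    Returns (modified_words, rest_of_word_undefined):
      modified_words         -- indices of the words of the inner register
                                that overlap the subreg
      rest_of_word_undefined -- True when the subreg is narrower than a word,
                                so the other bits of that word are undefined
    """
    if outer_bytes > inner_bytes:
        raise ValueError("not a normal subreg: outer mode is wider than inner")
    if byte_offset < 0 or byte_offset + outer_bytes > inner_bytes:
        raise ValueError("subreg extends outside the inner register")

    first = byte_offset // word_bytes
    last = (byte_offset + outer_bytes - 1) // word_bytes

    if outer_bytes < word_bytes and first != last:
        # A subword subreg refers to part of ONE word; it cannot straddle two.
        raise ValueError("subword subreg straddles a word boundary")

    modified_words = list(range(first, last + 1))
    rest_of_word_undefined = outer_bytes < word_bytes
    return modified_words, rest_of_word_undefined


if __name__ == "__main__":
    # 8-byte subreg at byte 8 of a 16-byte register, 4-byte words.
    print(subreg_store_effect(16, 8, 8))
    # 1-byte subreg at byte 4 of an 8-byte register, 4-byte words.
    print(subreg_store_effect(8, 1, 4))
```

Words that do not appear in modified_words are the ones the rules say
are left alone.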

> What would make this still somewhat saner, though, would be if we had
> a mechanism to make the subreg mechanism use different word sizes for
> different inner modes for the purpose of identifying regions that are
> wholly clobbered.  So, if you have 64 bit word_mode, but 32 bit floating
> point registers, you could say that SUBREGS for floating point modes should
> behave like you had a 32 bit word_mode.
> Conversely, if you have 128 bit vector registers, you might want a DImode
> subreg of a matching vector mode to clobber the entire pseudo.
> Maybe a BITS_PER_REG (MODE) value, or an equivalent hook.

I'm not sure either way about that TBH.

>> > Huh?  The documentation says that zero_extract follows BITS_BIG_ENDIAN,
>> > so the memory layout doesn't come into play.  We have a 64 bit value,
>> > and BITS_BIG_ENDIAN determines which bits are meant.
>> 
>> So you're saying that, if the above REG:DI were replaced by a MEM:DI,
>> the zero_extract would represent a non-contiguous bitrange?
>
> Yes, non-contiguous in memory, but contiguous in positional value.
> This still gives a port maintainer more freedom than prohibiting the
> zero_extract for these modes altogether.  If you like, you can put
> a caveat in the documentation to replace the restriction.
> Using BITS_BIG_ENDIAN throughout makes perfect sense when you operate on
> a value that is generally in a register but might on occasion end up in
> memory.
>
> Having an rtx code mean different things when applied to memory rather than
> registers is only asking for trouble, since then your operations change
> when a register gets spilled to memory.

Oh, I agree completely.  The behaviour should not depend on whether
the operand's a REG or a MEM; there should be one rule that applies
to both.  My point was that it isn't obvious which rule is the right
one in some cases.  The "natural" rule for MEMs isn't quite so natural
for REGs, and vice versa.
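
For reference, the value-based reading described in the quoted text
(the field is picked out by positional value, with BITS_BIG_ENDIAN
deciding which end the position counts from) can be sketched as below.
This is an illustration only: the Python function and its parameters
are hypothetical, it models a single 64-bit value, and memory layout
deliberately does not appear in it.

```python
def zero_extract(value, size, pos, width=64, bits_big_endian=False):
    """Extract `size` bits starting at bit `pos` of a `width`-bit value.

    Hypothetical model: when bits_big_endian is False, pos counts from the
    least significant bit; when True, pos counts from the most significant
    bit.  The result is zero-extended.
    """
    if size <= 0 or pos < 0 or pos + size > width:
        raise ValueError("bit-field does not fit in the value")
    # Convert the position into a shift from the least significant bit.
    shift = width - pos - size if bits_big_endian else pos
    return (value >> shift) & ((1 << size) - 1)


if __name__ == "__main__":
    v = 0x0123456789ABCDEF
    print(hex(zero_extract(v, 8, 4)))                        # 0xde
    print(hex(zero_extract(v, 8, 4, bits_big_endian=True)))  # 0x12
```

Because the function only ever sees the value, it gives the same answer
whether that value came from a REG or was reloaded from a MEM.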

But anyway, let's not get side-tracked by this.  I think we agree
that gcc currently doesn't support multiword extracts, so let's
file this under "nice to have".  I don't think the current extract
documentation is misleading here.

Richard
