Joern Rennecke <[EMAIL PROTECTED]> writes:
> On Thu, Mar 20, 2008 at 10:39:47AM +0000, Richard Sandiford wrote:
>> you're saying that, for any valid values of M and X:
>>
>>    (set (subreg:M (reg:N ...) X) (const_int 0))
>>
>> does not guarantee that (subreg:M (reg:N ...) ...) has the value 0
>> if N is a partial mode?
>
> Yes.  Although it will be more common for 1 bits to change to zero, some bits
> might actually differ between successive reads, when these bits are status
> flags.  AFAICR there was even some processor that had a level-sensitive
> I/O port bit mapped in its status register.

Well, the part you've quoted was only an example of the earlier,
more general statement:

    ... partial modes behave "as if" their widths were rounded up to the
    next word boundary, but that an unspecified collection of bits in the
    extended width will read as undefined

which I think covers your case too.  It was more the general statement
that I was interested in, because it would form the basis of an updated
rtl.texi patch.

>> > I think some of the rules are overly restrictive, and prevent gcc
>> > from achieving its full potential for generating efficient code.
>> > Moreover, if a port has an extv / insv pattern that matches in mode with
>> > the wide registers, it can legitimately use the zero_extract route.  It's
>> > reload that contradicts the documentation in changing registers into MEMs
>> > and thus creating zero_extracts from wide MEMs.
>>
>> It sounds like you might be referring to both the subreg and extract
>> documentation here.  As far as the subreg documentation goes,
>> let's assume that what I said above about partial modes is right
>> (you'll have already corrected me by now if not).  If we change the
>> rules to say that, what do you think is still overly restrictive?
>
>   - zero_extract officially only allowed for a specific mode.
>   - nested subregs not allowed, but neither are all subregs that
>     would result from substituting a subreg into an inner reg of another
>     subreg and simplifying allowed.
>   - highpart subregs not allowed (e.g. consider SH64 floating point registers:
>     word_mode is 64, but the floating point registers are 32 bit.  How do
>     you refer to the high part of a DFmode value, considering that the
>     inner reg might be allocated to a floating value.  (Actually, generally
>     want such an allocation).)

Sorry, my question was unclear.  I really meant: what in the proposed
rtl.texi rules is too restrictive _given what the current code is
supposed to allow_?
We're trying to write down what is currently folklore, and we're trying
to figure out if we've expressed the folklore correctly.  The things
you've listed above, by contrast, are not supported by the current code;
they're possible future extensions.

>> E.g. one possibility would be to drop:
>>
>>     If @var{reg} is a hard register, the @code{subreg} must also represent
>>     the lowpart of a particular hard register, or represent one or more
>>     complete hard registers.
>>
>> and instead say that the word-based semantics for pseudo registers also
>> apply to hard registers, regardless of the number of hard registers in
>> the inner register.  This would in some ways be simpler.
>
> Yes.  If subword-writing semantics are wanted, and the SUBREG represents
> an actual hard register, then the port has to use a proper hard reg instead.
> And it makes semantics much saner when we do register allocation for a
> pseudo where we don't know the register size to start with.

Thanks.  So in summary, it's OK for these rules in the proposed
rtl.texi patch:

    When @var{m1} is at least as narrow as @var{m2} the @code{subreg}
    expression is called @dfn{normal}.

    Normal subregs restrict consideration to certain bits of @var{reg}.
    There are two cases.  If @var{m1} is smaller than a word,
    the @code{subreg} refers to the least-significant part (or
    @dfn{lowpart}) of one word of @var{reg}.  If @var{m1} is word-sized or
    greater, the @code{subreg} refers to one or more complete words.

    When @var{m2} is larger than a word, the subreg is a @dfn{multi-word
    outer subreg}.

    When used as an lvalue, @code{subreg} is a word-based accessor.
    Storing to a @code{subreg} modifies all the words of @var{reg} that
    overlap the @code{subreg}, but it leaves the other words of @var{reg}
    alone.

    When storing to a normal @code{subreg} that is smaller than a word,
    the other bits of the referenced word are usually left in an undefined
    state.  This laxity makes it easier to generate efficient code for
    such instructions.
    To represent an instruction that preserves all the bits outside of
    those in the @code{subreg}, use @code{strict_low_part} or
    @code{zero_extract} around the @code{subreg}.

to apply to hard registers as well as pseudos, without the additional
restriction:

    If @var{reg} is a hard register, the @code{subreg} must also represent
    the lowpart of a particular hard register, or represent one or more
    complete hard registers.

?  We could instead say something like:

    @cindex @code{CANNOT_CHANGE_MODE_CLASS} and subreg semantics
    These rules apply to both pseudo inner registers and hard inner
    registers.  If the semantics are not correct for particular
    combinations of @var{m1}, @var{m2} and hard @var{reg}, the
    target-support code must ensure that those combinations are never
    used.  For example:

    @smallexample
    CANNOT_CHANGE_MODE_CLASS (@var{m2}, @var{m1}, @var{class})
    @end smallexample

    must be true for every class @var{class} that includes @var{reg}.

> What would make this still somewhat saner, though, would be if we had
> a mechanism to make the subreg mechanism use different word sizes for
> different inner modes for the purpose of identifying regions that are
> wholly clobbered.  So, if you have 64 bit word_mode, but 32 bit floating
> point registers, you could say that SUBREGS for floating point modes should
> behave like you had a 32 bit word_mode.
> Conversely, if you have 128 bit vector registers, you might want a DImode
> subreg of a matching vector mode to clobber the entire pseudo.
> Maybe a BITS_PER_REG (MODE) value, or an equivalent hook.

I'm not sure either way about that TBH.

>> > Huh?  The documentation says that zero_extract follows BITS_BIG_ENDIAN,
>> > so the memory layout doesn't come into play.  We have a 64 bit value,
>> > and BITS_BIG_ENDIAN determines which bits are meant.
>>
>> So you're saying that, if the above REG:DI were replaced by a MEM:DI,
>> the zero_extract would represent a non-contiguous bitrange?
>
> Yes, non-contiguous in memory, but contiguous in positional value.
> This still gives a port maintainer more freedom than prohibiting the
> zero_extract for these modes altogether.  If you like, you can put
> a caveat in the documentation to replace the restriction.
> Using BITS_BIG_ENDIAN throughout makes perfect sense when you operate on
> a value that is generally in a register but might on occasion end up in
> memory.
>
> Having an rtx code mean different things when applied to memory rather than
> registers is only asking for trouble, since then your operations change
> when a register gets spilled to memory.

Oh, I agree completely.  The behaviour should not depend on whether
the operand's a REG or a MEM; there should be one rule that applies
to both.  My point was that it isn't obvious which rule is the right
one in some cases.  The "natural" rule for MEMs isn't quite so natural
for REGs, and vice versa.

But anyway, let's not get side-tracked by this.  I think we agree that
gcc currently doesn't support multiword extracts, so let's file this
under "nice to have".  I don't think the current extract documentation
is misleading here.

Richard
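
Illustration of the word-based lvalue rule in the proposed rtl.texi text
quoted earlier ("Storing to a subreg modifies all the words of reg that
overlap the subreg, but it leaves the other words of reg alone").  This is
only a sketch and is not GCC code: the names subreg_store_words and
struct word_range are made up for the example, sizes and offsets are given
in bytes, and the model assumes a single fixed word size for every inner
mode (so it does not cover the per-mode BITS_PER_REG idea raised above).

```c
#include <assert.h>

/* Hypothetical helper type: inclusive range of word indices within the
   inner register.  Not a GCC data structure.  */
struct word_range
{
  unsigned first;
  unsigned last;
};

/* Return the words of the inner register that a store to a subreg may
   modify, under the word-based rule: every word that overlaps the
   subreg is affected, every other word is left alone.

   BYTE_OFFSET is the subreg's byte offset into the inner register,
   OUTER_SIZE is the size in bytes of the subreg's own mode and
   WORD_SIZE is the size in bytes of a word.  All three parameters are
   assumptions of this sketch.  */
struct word_range
subreg_store_words (unsigned byte_offset, unsigned outer_size,
                    unsigned word_size)
{
  struct word_range r;

  assert (word_size > 0 && outer_size > 0);

  /* A sub-word subreg still affects the whole word that contains it:
     the bits outside the subreg are left undefined unless the store is
     wrapped in strict_low_part or zero_extract.  */
  r.first = byte_offset / word_size;
  r.last = (byte_offset + outer_size - 1) / word_size;
  return r;
}
```

With 4-byte words, a 1-byte subreg at byte offset 4 touches word 1 only,
whereas an 8-byte subreg at offset 0 touches words 0 and 1; any further
words of the inner register keep their values in both cases.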