On Tue, Mar 18, 2008 at 09:40:49PM +0000, Richard Sandiford wrote:
> > The most natural layout would be 0x45??0123 .
> > But you could also have 0x345?012? , or even more exotic mappings.
>
> Do we actually support the second mapping though?  Surely the
> target-independent code needs to know how bytes are divided into words?

I don't see why the target-independent code would need to know what the
bits inside a partial integer mode mean.
A partial exception to this is when arithmetic for partial integers has to
be implemented using arithmetic for integral integers; in this case, it is
assumed that moving partial integers to integral integers, performing the
arithmetic, and moving back to partial integers will produce the right
result.  So, if partial integer addition or subtraction is present, and no
named pattern for these operations exists, this implies that valid bits are
contiguous, and that any unused lower bits will read as zero (assuming we
are actually dealing with bits here.  Stranger scenarios are possible,
e.g. mod 81 arithmetic.)

> The reason Kenny's looking at this is that he wants to track which
> bytes in a SUBREG are actually live.

A conservative assumption is that all bits occupied by the integral mode
the partial integral mode is associated with are live.  If we really find
that there is a code quality issue when making this assumption, we can add
a hook to define the salient semantics, but I doubt this will come up.

> >> 3) What about things like 80-bit FP modes on a 32-bit or 64-bit target?  Is
> >> it valid to refer to pieces of an 80-bit FP pseudo?  If so, are the rules
> >> we've got here right?
> >
> > Where the 80-bit mode is stored in multiple words like for x86, you
> > should be able to refer to word_mode subregs the way the value is
> > stored in memory.  This is the only way you can get a sane equivalence
> > between reloads via secondary memory and direct register-register
> > moves involving word_mode GENERAL_REGS.
>
> OK, so in all these cases, "N words and a bit" modes can be treated
> like "N + 1 words, with the upper bits undefined"?  For both inner
> and outer modes?

N + 1 words, yes, but it doesn't follow that it must be the upper bits that
are undefined.
If that is actually the case, however, for an 80-bit value on a
little-endian byte-addressed target, the port could refer to the bits in
the highest words as
(subreg:HI (reg:XF inner_reg) 8) or (subreg:HI (mem:XF mem_addr) 8)
to make this explicit.
However, what would we do with a true-blue big-endian target?  Would the
highest bits be (subreg:HI (reg:XF inner_reg) 2) ?

> >> 4) Do stores to subregs of hardreg invalidate just the registers
> >> mentioned in the outer mode or do they invalidate the entire set of
> >> registers mentioned in the inner mode? (our rules say only the outer
> >> mode).
> >
> > Where the hardreg is actually a single hardware register, all of it is
> > clobbered.  If it is a concatenation of multiple actual hard
> > registers, the idea is that only the one that corresponds to the word
> > that is stored into gets clobbered.  If more than one word is stored
> > into, that would logically translate to changing each of the registers
> > that each word corresponds to.
> >
> > What seems less defined is what happens when the underlying hard registers
> > are smaller than a word, and either the mode size or SUBREG_BYTE
> > is not a multiple of a word.
>
> Yeah, my version of the question was more: do we support subregs of
> hard registers in which the normal word-based semantics of pseudos
> do not apply?

Having some data registers larger than word size is quite common,
particularly floating point registers on machines with a word size smaller
than the largest supported floating point mode.
IIRC we support this, but not very well.  Where the hardware allows
transfers between differently sized registers, it seems most natural
to use SUBREGs to express this.  IIRC you have to do something like
(SUBREG:SI (SUBREG:DI (REG:DF... and even spread it across multiple
instruction patterns.
I don't see why we should be picky about the MODE_CLASS of inner or outer
modes of SUBREGs.
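
To make the byte-offset-8 reference earlier in this reply concrete, here is
a minimal C sketch, outside of GCC, of what reading 16 bits at byte offset 8
of an 80-bit value means on a little-endian byte-addressed target.  The
helper read_hi_little_endian and the array x87_one_image are hypothetical
names made up for this illustration; the byte image assumes the x87
extended format, and nothing here claims to mirror how GCC itself
evaluates a SUBREG.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed byte image of the x87 extended value 1.0 as stored on a
   little-endian, byte-addressed target: the 64-bit significand
   0x8000000000000000 occupies bytes 0..7, the sign and exponent
   field 0x3fff occupies bytes 8..9.  */
static const unsigned char x87_one_image[10] = {
  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0xff, 0x3f
};

/* Hypothetical helper, not a GCC function: read the 16-bit chunk that
   starts at BYTE_OFFSET within a little-endian byte image of SIZE bytes.
   This is the plain-C analogue of naming bytes 8..9 of an 80-bit value
   with a HImode access at byte offset 8.  */
static uint16_t
read_hi_little_endian (const unsigned char *image, size_t size,
                       size_t byte_offset)
{
  /* Both bytes of the chunk must lie inside the image.  */
  assert (byte_offset + 2 <= size);
  return (uint16_t) (image[byte_offset]
                     | ((unsigned int) image[byte_offset + 1] << 8));
}
```

Calling read_hi_little_endian (x87_one_image, sizeof x87_one_image, 8)
yields 0x3fff, the sign and exponent field.  The sketch only covers the
little-endian case; it leaves open which offset a big-endian target should
use.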

If individual portions of multiple-word registers can be accessed
individually like normal registers, it makes sense to model the individual
parts as separate registers, but it is essential that all parts can be both
read from and written to separately with moves from/to general purpose
registers to make this work sanely.  Also, group spill allocation has extra
costs in several ways, so if the predominant way to use the wide registers
is to use them as a whole, it is still desirable to model them as wide
registers and have the narrower accesses use SUBREG and/or zero_extract.
But there is also part of an answer here for the original question: when a
wide register is only partially available as separate words, it is more
likely to be available as separate values to read.  If you can't write
separate parts separately, it follows that a subreg write would naturally
clobber the entire register.

There is a problem, though, with considering zero_extract as an escape
hatch if you do want to access only part of the register in special
circumstances: the documentation says that applied on memory, the inner
mode must be byte-sized - this will certainly be violated in reload - and
that for registers, the mode will be that of extv / insv.  Not all
processors have extv / insv instructions, and even if they had, you might
need more than one inner mode in different circumstances.  Why are we
making any stipulations about the inner mode?

> The current documentation expressly forbids taking
> an SImode subreg of a DImode hard register on a 32-bit machine,

Huh?  Then all our 32-bit ports which support long long must be broken.

> for example, and I agree that the subword hard register case is
> also suspicious.

I suppose it just doesn't happen often enough for anybody to have any
strong opinion one way or the other.  I suppose you can always express this
with a zero_extract, so it would only become important if we had to worry
about memory footprint of or processing time for zero_extract.
So, pragmatically, I suppose we should go with whatever prohibition or
definition allows the fastest implementation.

> Without wanting to fan flames, isn't this something that should
> be fixed in reload? ;)

Reload is amenable to change...
We've already discussed this 16 months ago:
http://gcc.gnu.org/ml/gcc-patches/2006-11/msg01074.html

FWIW, I did a small reload patch to my experimental local sources yesterday
to tinker with reload types for a 0.2% size gain.