Matthew Fortune <matthew.fort...@imgtec.com> writes:
> Richard Sandiford <rdsandif...@googlemail.com> writes:
>> Matthew Fortune <matthew.fort...@imgtec.com> writes:
>> > I've realised that I may need to do 'something' to prevent GCC from
>> > loading or storing DFmode/DImode values to/from FPRs using pairs of
>> > SWC1/LWC1 when using an unaligned address. Initial tests show that
>> > when loading from an unaligned address (4-byte aligned) then GCC
>> > loads the two halves of a 64-bit value into GPRs and then moves
>> > across to FPRs. This is good but I don't know if it is guaranteed.
>> >
>> > From what I can tell the backend doesn't specifically deal with
>> > loading unaligned data but rather the normal load/store patterns are
>> > used by the middle end. As such I'm not sure there is anything to
>> > prevent direct loads to FPRs by parts.
>> >
>> > Do you know one way or the other if unaligned doubles can currently
>> > be loaded via pairs of lwc1 (same for store) and if so can you advise
>> > on an approach I could try to prevent this for FPXX? I will try to
>> > figure this out on my own in the meantime.
>>
>> The port does handle some widths of unaligned access via the
>> {insv,extv,extzv}misalign patterns. That's how an unaligned DImode
>> value will be handled on 64-bit targets.
>>
>> The MIPS *misalign patterns only handle integer modes, so for other
>> types of mode the target-independent code will fall back to using an
>> integer load followed by a subreg (or a subreg followed by an integer
>> store). IIRC that's how an unaligned DFmode access will be handled on
>> 64-bit targets.
>>
>> For modes that are larger or smaller than *misalign can handle,
>> the target-independent code has to split the access up into smaller
>> pieces and reassemble them. And these pieces have to have integer
>> modes. E.g. on 32-bit targets a 4-byte-misaligned load into
>> (reg:DF x) could be done by loading (subreg:SI (reg:DF x) 0) and
>> (subreg:SI (reg:DF x) 4). The thing that prevents these individual
>> loads from using LWC1 is CANNOT_CHANGE_MODE_CLASS, which (among other
>> things) makes it invalid for any target-independent code to reduce a
>> subreg of an FPR pair to an individual FPR.
>>
>> [FWIW, the reason MIPS doesn't define {insv,extv,extzv}misalign for
>> things like DImode on 32-bit targets is because there's no special
>> architecture feature that can be used. It's just a case of decomposing
>> the access. Since that's a general technique, we want to make the
>> target-independent code do it as well as possible rather than handle
>> it in a port-specific way.]
>>
>> So yeah, the combination of (a) STRICT_ALIGNMENT, (b) the lack of
>> unaligned floating-point load/store patterns and
>> (c) CANNOT_CHANGE_MODE_CLASS should guarantee what you want.
>
> Thanks for all the details. I did not know if CANNOT_CHANGE_MODE_CLASS
> would give this guarantee as I am conscious of MIPS 1 having to do pairs
> of LWC1/SWC1 for doubles. However, if I understand the code correctly
> the LWC1/SWC1 pairs are generated by splits for MIPS 1 and not directly
> from target-independent code so CANNOT_CHANGE_MODE_CLASS does not impact
> that explicit splitting logic. Is that right?

Yeah, that's right. All splits of floating-point values are done in
the MIPS code, and the MIPS code is free to ignore
CANNOT_CHANGE_MODE_CLASS.

One of the many reasons for defining CANNOT_CHANGE_MODE_CLASS the way
it is now is that the FPRs are always little-endian. If the
target-independent code ever did try to access one half of a pair, it
would access the wrong one on big-endian targets. So
CANNOT_CHANGE_MODE_CLASS is by no means just a technicality at the
moment. Other targets also rely on CANNOT_CHANGE_MODE_CLASS for
similarly important reasons.

Thanks,
Richard
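
[Illustration only, not GCC code. The big-endian point above can be sketched with a small C model. Everything named here (struct fpr_pair, load_pair, memory_word, naive_fpr_word, naive_reduction_is_correct) is hypothetical, and the model assumes a 64-bit IEEE double and that the even register of a pair holds the low 32 bits regardless of target endianness. It compares the word a subreg at byte offset 0 or 4 denotes in memory order with the word a naive "offset 0 means first register" reduction would pick.]

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of an even/odd FPR pair holding one double.
   Assumption: the even register always holds the low 32 bits. */
struct fpr_pair {
    uint32_t even;
    uint32_t odd;
};

/* Raw 64-bit pattern of a double (assumes sizeof(double) == 8). */
uint64_t double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Place a double in the modelled pair: low word in even, high in odd. */
struct fpr_pair load_pair(double d)
{
    uint64_t bits = double_bits(d);
    struct fpr_pair p;
    p.even = (uint32_t)(bits & 0xffffffffu);
    p.odd = (uint32_t)(bits >> 32);
    return p;
}

/* The 32-bit word found at the given byte offset when the double is
   laid out in memory with the given endianness. */
uint32_t memory_word(double d, unsigned byte_offset, int big_endian)
{
    uint64_t bits = double_bits(d);
    assert(byte_offset == 0 || byte_offset == 4);
    int wants_high = big_endian ? (byte_offset == 0) : (byte_offset == 4);
    return wants_high ? (uint32_t)(bits >> 32)
                      : (uint32_t)(bits & 0xffffffffu);
}

/* Naive reduction of a subreg to a single register of the pair:
   offset 0 picks the even register, offset 4 picks the odd one. */
uint32_t naive_fpr_word(struct fpr_pair p, unsigned byte_offset)
{
    assert(byte_offset == 0 || byte_offset == 4);
    return byte_offset == 0 ? p.even : p.odd;
}

/* Returns 1 if the naive reduction picks the same words as memory
   order for both halves, 0 otherwise. */
int naive_reduction_is_correct(double d, int big_endian)
{
    struct fpr_pair p = load_pair(d);
    return naive_fpr_word(p, 0) == memory_word(d, 0, big_endian)
        && naive_fpr_word(p, 4) == memory_word(d, 4, big_endian);
}
```

[In this model, naive_reduction_is_correct(1.5, 0) holds for a little-endian target, while naive_reduction_is_correct(1.5, 1) fails for a big-endian one because the two halves come out swapped. A value whose two halves happen to be equal would hide the problem.]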