"Maciej W. Rozycki" <ma...@codesourcery.com> writes: > This issue was originally raised here: > > http://gcc.gnu.org/ml/gcc-patches/2012-12/msg00863.html > > We have a shortcoming in GCC in that we only allow the use half of the FP > MADD instruction subset (MADD.fmt and MSUB.fmt) in the 64-bit/32-register > mode (CP0.Status.FR == 1) on MIPS32r2 processors. Furthermore we never > enable the other half (NMADD.fmt and NMSUB.fmt) on those processors. > However this whole instruction subset is always available on MIPS32r2 FPUs > regardless of the mode selected, just as it always has been on FPUs of the > 64-bit ISA line from MIPS IV up.
Hmm, this was discussed here:

    http://gcc.gnu.org/ml/gcc-patches/2006-11/msg00488.html
    http://gcc.gnu.org/ml/gcc-patches/2006-11/msg00492.html

The footnote for COP1X instructions on page 12 of volume 1 of the
MIPS32 ISA (v2.50) says:

    1. In Release 1 of the Architecture, these instructions are legal
       only with a MIPS64 processor with 64-bit operations enabled
       (they are, in effect, actually MIPS64 instructions). In Release 2
       of the Architecture, these instructions are legal with either a
       MIPS32 or MIPS64 processor _which includes a 64-bit floating
       point unit_.

(my emphasis). "which" rather than "that" makes this a bit ambiguous,
but various comments in the rest of the manual imply that MIPS32r2
allows an implementation choice between 32-bit and 64-bit FPUs.
E.g. page 8 says:

    Support for 64-bit coprocessors with 32-bit CPUs: These changes
    allow a 64-bit coprocessor (including an FPU) to be attached to a
    32-bit CPU. This enhancement is optional in a Release 2
    implementation.

and page 45 says:

    In addition to an Instruction Set Architecture, the MIPS
    architecture definition includes processing resources such as the
    set of coprocessor general registers. In Release 1 of the
    Architecture, the 32-bit registers in MIPS32 were enlarged to
    64-bits in MIPS64; however, these 64-bit FPU registers are not
    backwards compatible. Instead, processors implementing the MIPS64
    Architecture provide a mode bit to select either the 32-bit or
    64-bit register model. In Release 2 of the Architecture, a 32-bit
    CPU _may_ include a full 64-bit coprocessor, including a floating
    point unit which implements the same mode bit to select 32-bit or
    64-bit FPU register model.

On page 322 of volume 2, the footnote for "Table A-20 MIPS64 COP1X
Encoding of Function Field" uses slightly different wording:

    COP1X instructions are legal only if 64-bit floating point
    operations are enabled.

So was this all a big misunderstanding on my part?
The TARGET_FLOAT64 condition came from MIPS themselves, and when
challenged they seemed pretty adamant that it was correct. If I was
wrong to be convinced by the explanation, I hope you can at least see
why I was convinced. :-)

If it wasn't a misunderstanding, then the point is that we can't tell
from ISA_MIPS32R2 alone whether the target has a 32-bit or 64-bit FPU,
but we know that it must have a 64-bit FPU if using TARGET_FLOAT64.

> Also, according to MIPS IV ISA documentation these operations are only
> fused (i.e. don't match original IEEE 754-1985 accuracy requirements) on
> the original MIPS IV R8000 CPU, and MIPS architecture specs don't mention
> any limitations of these instructions either, so I have updated the GCC
> manual to document that on non-R8000 CPUs (which are ones we really care
> about) they are numerically equivalent to computations made with
> corresponding individual operations.

This part is OK, thanks, and is probably the only bit that's suitable
for 4.8 at this stage. Would you mind applying it separately?

> Finally, while at it, I found it interesting that we have separate
> conditions to cover MADD.fmt/MSUB.fmt (ISA_HAS_FP_MADD4_MSUB4) and
> NMADD.fmt/NMADD.fmt (ISA_HAS_NMADD4_NMSUB4) while all the four
> instructions need to be implemented as a whole group per data format
> supported and cannot be separated (the MIPS architecture specification
> explicitly forbids subsetting). The difference between the two conditions
> is the former expands to ISA_HAS_FP4, that is enables the subsubset for
> any MIPS IV and up FPU while the latter has an extra "&& (!TARGET_MIPS5400
> || TARGET_MAD)" qualifier.
>
> I went ahead and checked available NEC VR54xx documentation and here's
> what I came up with:
>
> 1. "VR5400 MIPS RISC Microprocessor Family" datasheet (NEC doc #13362)
>    says:
>
>    "The VR5400 processor family complies with the MIPS IV instruction set
>    and IEEE-754 floating-point and IEEE-1149.1/1149.1a JTAG specification,
>    [...]"
>
> 2. "VR5432 MIPS RISC Microprocessor User's Manual, Volume 2" (NEC doc
>    #13751) lists all the individual MADD.fmt, MSUB.fmt, NMADD.fmt and
>    NMSUB.fmt instructions in Chapter 18 "Floating-Point Unit Instruction
>    Set" with no restrictions as to their availability (the only other
>    member of the VR54xx family I know of is the VR5464 that is a
>    high-performance version of the VR5432 and is fully software
>    compatible).
>
> Further to that TARGET_MAD controls whether to "Use PMC-style 'mad'
> instructions" that are all CPU rather than FPU instructions. The VR5432
> indeed supports extra integer multiply-accumulate instructions, as
> documented in #2 above; these are the MACC/MACCHI/MACCHIU/MACCU and
> MSAC/MSACHI/MSACHIU/MSACU instructions as roughly covered by our
> ISA_HAS_MACC, ISA_HAS_MSAC and ISA_HAS_MACCHI knobs (the latter is not
> implied for TARGET_MIPS5400, perhaps because the family does not support
> the doubleword variants).
>
> All in all it looks to me like a misplaced hunk. It was introduced in
> rev. 56471 (you were named as one of the contributors on that commit, so
> you may be able to remember and/or correct me if I am wrong here anywhere)
> and it looks to me it should have been applied to the ISA_HAS_MADD_MSUB
> macro instead that's still just a few lines above ISA_HAS_NMADD4_NMSUB4
> (and was even closer to ISA_HAS_NMADD_NMSUB as the latter was then called;
> the bodies were close enough back then for a hunk to apply cleanly to
> either).

I was named in that commit but the VR54xx stuff wasn't mine. I do
remember that Mike put a lot of effort into tuning the VR54xx madd
stuff though, because of the difficulty of having multiply-accumulate
instructions that force the use of HI/LO on an architecture that also
has efficient three-operand multiplies. So I'm pretty sure that this
worked correctly in the Cygnus devo tree, and your explanation of a
misplaced hunk seems very convincing.

Richard