First, I will split this patch in two. The first part will deal with the FPU instructions. The second patch, will try to address a new abi optimized for odd-even registers as the comments for the mabi=optimized are numerous and I need to carefully prepare for an answer. The remaining of this email will focus on FPU patch.
> + case EQ: > + case NE: > + case UNORDERED: > + case UNLT: > + case UNLE: > + case UNGT: > + case UNGE: > + return CC_FPUmode; > + > + case LT: > + case LE: > + case GT: > + case GE: > + case ORDERED: > + return CC_FPUEmode; > > cse and other code transformations are likely to do better if you use > just one mode for these. It is also very odd to have comparisons and their > inverse use different modes. Have you done any benchmarking for this? Right, the ORDERED should be in CC_FPUmode. An inspiration point for CC_FPU/CC_FPUE mode is the arm port. The reason why having the two CC_FPU and CC_FPUE modes is to emit signaling FPU compare instructions. We can use a single CC_FPU mode here instead of two, but we may lose functionality. Regarding benchmarks, I do not have an establish benchmark for this, however, as far as I could see the code generated for FPU looks clean. Please let me know if it is acceptable to go with CC_FPU/CC_FPUE, and ORDERED fix further on. Or, to have a single mode. > + /* ARCHS has 64-bit data-path which makes use of the even-odd paired > + registers. */ > + if (TARGET_HS) > + { > + for (regno = 1; regno < 32; regno +=2) > + { > + arc_hard_regno_mode_ok[regno] = S_MODES; > + } > + } > + > > Does TARGET_HS with -mabi=default allow for passing DFmode / DImode > arguments > in odd registers? I fear you might run into reload trouble when trying to > access the values. Although, I haven't bump into this issue until now, I do not say it may not happen. Hence, I would create a new register class to hold the odd-even registers. Hence the above code will not be needed. What do u say? > still in "subdf3": > + else if (TARGET_FP_DOUBLE) > > So this implies that both (TARGET_DPFP) and (TARGET_FP_DOUBLE) might > be > true at > the same time. In that case, so we really want to prefer the > (TARGET_DPFP) expansion? The TARGET_DPFP (FPX instructions) and TARGET_FP_DOUBLE (FPU) are mutually exclusive. It should be a check in arc_init() function for this case. > +(define_insn "*cmpsf_trap_fpu" > > That name makes as little sense to me as having two separate modes > CC_FPU and CC_FPUE > for positive / negated usage and having two comparison patterns pre > precision that > do the same but pretend to be dissimilar. > The F{S/D}CMPF instruction is similar to the F{S/D}CMP instruction in cases when either of the instruction operands is a signaling NaN. The FxCMPF instruction updates the invalid flag (FPU_STATUS.IV) when either of the operands is a quiet or signaling NaN, whereas, the FxCMP instruction updates the invalid flag (FPU_STATUS.IV) only when either of the operands is a quiet NaN. We need to use the FxCMPF only if we keep the CC_FPU an CC_FPUE otherwise, we shall use only FxCMP instruction. > Also, the agglomeration of S/D with FU{S,Z}ED is confusing. Could you > spare another underscore? Is this better? #define TARGET_FP_SP_BASE ((arc_fpu_build & FPU_SP) != 0) #define TARGET_FP_DP_BASE ((arc_fpu_build & FPU_DP) != 0) #define TARGET_FP_SP_FUSED ((arc_fpu_build & FPU_SF) != 0) #define TARGET_FP_DP_FUSED ((arc_fpu_build & FPU_DF) != 0) #define TARGET_FP_SP_CONV ((arc_fpu_build & FPU_SC) != 0) #define TARGET_FP_DP_CONV ((arc_fpu_build & FPU_DC) != 0) #define TARGET_FP_SP_SQRT ((arc_fpu_build & FPU_SD) != 0) #define TARGET_FP_DP_SQRT ((arc_fpu_build & FPU_DD) != 0) #define TARGET_FP_DP_AX ((arc_fpu_build & FPX_DP) != 0) Thanks, Claudiu