On 01/11/2017 06:30 PM, Palmer Dabbelt wrote:
+(define_register_constraint "f" "TARGET_HARD_FLOAT ? FP_REGS : NO_REGS"
+ "A floating-point register (if available).")
+
I know this is the Traditional Way, but I do wonder if using the new enable
attribute on the alternatives would be better. No need to change change; just
something that makes me wonder.
+(define_constraint "Q"
+ "@internal"
+ (match_operand 0 "const_arith_operand"))
How does this differ from "I"?
+
+(define_memory_constraint "W"
+ "@internal
+ A memory address based on a member of @code{BASE_REG_CLASS}."
+ (and (match_code "mem")
+ (match_operand 0 "memory_operand")))
How does this differ (materially) from "m"? Or from just
(match_operand 0 "memory_operand")
for that matter?
+;;........................
+;; DI -> SI optimizations
+;;........................
+
+;; Simplify (int)(a + 1), etc.
+(define_peephole2
+ [(set (match_operand:DI 0 "register_operand")
+ (match_operator:DI 4 "modular_operator"
+ [(match_operand:DI 1 "register_operand")
+ (match_operand:DI 2 "arith_operand")]))
+ (set (match_operand:SI 3 "register_operand")
+ (truncate:SI (match_dup 0)))]
+ "TARGET_64BIT && (REGNO (operands[0]) == REGNO (operands[3]) ||
peep2_reg_dead_p (2, operands[0]))
+ && (GET_CODE (operands[4]) != ASHIFT || (CONST_INT_P (operands[2]) && INTVAL
(operands[2]) < 32))"
+ [(set (match_dup 3)
+ (truncate:SI
+ (match_op_dup:DI 4
+ [(match_operand:DI 1 "register_operand")
+ (match_operand:DI 2 "arith_operand")])))])
I take from this that combine + ree fail to do the job?
I must say I'm less than thrilled about the use of truncate instead of simply
properly representing the sign-extensions that are otherwise required by the
ABI. RISCV is unlike MIPS in that the 32-bit operations (addw et al) do not
require sign-extended inputs in order to produce a correct output value.
Consider Alpha as a counter-example to MIPS, where the ABI requires
sign-extensions at function edges and the ISA requires sign-extensions for
comparisons.
+(define_insn "*local_pic_storesi<mode>"
+ [(set (mem:ANYF (match_operand 0 "absolute_symbolic_operand" ""))
+ (match_operand:ANYF 1 "register_operand" "f"))
+ (clobber (reg:SI T0_REGNUM))]
+ "TARGET_HARD_FLOAT && !TARGET_64BIT && USE_LOAD_ADDRESS_MACRO (operands[0])"
+ "<store>\t%1,%0,t0"
+ [(set (attr "length") (const_int 8))])
Use match_scratch so that you need not hard-code T0.
And likewise for the other pic stores.
+(define_predicate "const_0_operand"
+ (and (match_code "const_int,const_double,const_vector")
+ (match_test "op == CONST0_RTX (GET_MODE (op))")))
Probably want to support const_wide_int.
+ /* Don't handle multi-word moves this way; we don't want to introduce
+ the individual word-mode moves until after reload. */
+ if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
+ return false;
Why not? Do the subreg passes not do a good job? I would expect it to do well
on a target like RISCV.
+(define_predicate "move_operand"
+ (match_operand 0 "general_operand")
Normally this is handled by TARGET_LEGITIMATE_CONSTANT_P.
+/* Target CPU builtins. */
+#define TARGET_CPU_CPP_BUILTINS() \
+ do \
+ { \
+ builtin_define ("__riscv"); \
Perhaps better to move this to riscv-c.c?
+ if (TARGET_MUL) \
+ builtin_define ("__riscv_mul"); \
+ if (TARGET_DIV) \
+ builtin_define ("__riscv_div"); \
+ if (TARGET_DIV && TARGET_MUL) \
+ builtin_define ("__riscv_muldiv"); \
Out of curiosity, why are these split when the M extension doesn't do so?
+/* Define this macro if it is advisable to hold scalars in registers
+ in a wider mode than that declared by the program. In such cases,
+ the value is constrained to be within the bounds of the declared
+ type, but kept valid in the wider mode. The signedness of the
+ extension may differ from that of the type. */
+
+#define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE) \
+ if (GET_MODE_CLASS (MODE) == MODE_INT \
+ && GET_MODE_SIZE (MODE) < 4) \
+ { \
+ (MODE) = Pmode; \
+ }
I think you want to force unsignedp false for SImode when extending to DImode.
+/* a0-a7, t0-a6, fa0-fa7, and ft0-ft11 are volatile across calls.
+ The call RTLs themselves clobber ra. */
+
+#define CALL_USED_REGISTERS \
+{ /* General registers. */ \
+ 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, \
+ 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, \
+ /* Floating-point registers. */ \
+ 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, \
+ 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, \
+ /* Others. */ \
+ 1, 1 \
+}
+
+#define CALL_REALLY_USED_REGISTERS \
+{ /* General registers. */ \
+ 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, \
+ 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, \
+ /* Floating-point registers. */ \
+ 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, \
+ 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, \
+ /* Others. */ \
+ 1, 1 \
+}
You shouldn't have to replicate these if they're the same.
+/* Return coprocessor number from register number. */
+
+#define COPNUM_AS_CHAR_FROM_REGNUM(REGNO) \
+ (COP0_REG_P (REGNO) ? '0' : COP2_REG_P (REGNO) ? '2' \
+ : COP3_REG_P (REGNO) ? '3' : '?')
+
MIPS cruft.
+typedef struct {
+ /* Number of integer registers used so far, up to MAX_ARGS_IN_REGISTERS. */
+ unsigned int num_gprs;
+
+ /* Number of floating-point registers used so far, likewise. */
+ unsigned int num_fprs;
+} CUMULATIVE_ARGS;
Related to something I mentioned for patch 1: why isn't this just a single int?
+(define_c_enum "unspec" [
+ ;; GP manipulation.
+ UNSPEC_EH_RETURN
+
+ ;; Symbolic accesses. The order of this list must match that of
+ ;; enum riscv_symbol_type in riscv-protos.h.
+ UNSPEC_ADDRESS_FIRST
+ UNSPEC_PCREL
+ UNSPEC_LOAD_GOT
+ UNSPEC_TLS
+ UNSPEC_TLS_LE
+ UNSPEC_TLS_IE
+ UNSPEC_TLS_GD
+
+ UNSPEC_AUIPC
+
+ ;; Register save and restore.
+ UNSPEC_GPR_SAVE
+ UNSPEC_GPR_RESTORE
+
+ ;; Blockage and synchronisation.
+ UNSPEC_BLOCKAGE
+ UNSPEC_FENCE
+ UNSPEC_FENCE_I
+])
Some of these are unspec_volatile, and so should be in define_c_enum "unspecv".
+(define_expand "add<mode>3"
+ [(set (match_operand:GPR 0 "register_operand")
+ (plus:GPR (match_operand:GPR 1 "register_operand")
+ (match_operand:GPR 2 "arith_operand")))]
+ "")
+
+(define_insn "*addsi3"
+ [(set (match_operand:SI 0 "register_operand" "=r,r")
+ (plus:SI (match_operand:GPR 1 "register_operand" "r,r")
+ (match_operand:GPR2 2 "arith_operand" "r,Q")))]
+ ""
+ { return TARGET_64BIT ? "addw\t%0,%1,%2" : "add\t%0,%1,%2"; }
+ [(set_attr "type" "arith")
+ (set_attr "mode" "SI")])
Err... mismatched input modes? That's wrong.
+
+(define_insn "*adddi3"
+ [(set (match_operand:DI 0 "register_operand" "=r,r")
+ (plus:DI (match_operand:DI 1 "register_operand" "r,r")
+ (match_operand:DI 2 "arith_operand" "r,Q")))]
+ "TARGET_64BIT"
+ "add\t%0,%1,%2"
+ [(set_attr "type" "arith")
+ (set_attr "mode" "DI")])
Remove the addsi3 weirdness and it doesn't appear that you need the
define_expand.
Likewise for sub and mul.
+(define_expand "<u>mulditi3"
+ [(set (match_operand:TI 0 "register_operand")
+ (mult:TI (any_extend:TI (match_operand:DI 1 "register_operand"))
+ (any_extend:TI (match_operand:DI 2 "register_operand"))))]
+ "TARGET_MUL && TARGET_64BIT"
+{
+ rtx low = gen_reg_rtx (DImode);
+ emit_insn (gen_muldi3 (low, operands[1], operands[2]));
+
+ rtx high = gen_reg_rtx (DImode);
+ emit_insn (gen_<u>muldi3_highpart (high, operands[1], operands[2]));
+
+ emit_move_insn (gen_lowpart (DImode, operands[0]), low);
+ emit_move_insn (gen_highpart (DImode, operands[0]), high);
+ DONE;
+})
FWIW, the manual recommends the sequence MULHU; MUL if you need both the high
and low parts. Whether that implies that you keep those together as a unit
until final output, or whether you simply emit them in the other order, one
would need to consult an actual hardware manual.
+(define_insn "floatsidf2"
+ [(set (match_operand:DF 0 "register_operand" "=f")
+ (float:DF (match_operand:SI 1 "reg_or_0_operand" "rJ")))]
+ "TARGET_DOUBLE_FLOAT"
+ "fcvt.d.w\t%0,%z1"
+ [(set_attr "type" "fcvt")
+ (set_attr "mode" "DF")
+ (set_attr "cnv_mode" "I2D")])
+
+
+(define_insn "floatdidf2"
+ [(set (match_operand:DF 0 "register_operand" "=f")
+ (float:DF (match_operand:DI 1 "reg_or_0_operand" "rJ")))]
+ "TARGET_64BIT && TARGET_DOUBLE_FLOAT"
+ "fcvt.d.l\t%0,%z1"
+ [(set_attr "type" "fcvt")
+ (set_attr "mode" "DF")
+ (set_attr "cnv_mode" "I2D")])
+
Merge with GPR? And likewise for other fp conversions.
+;; Instructions for adding the low 16 bits of an address to a register.
paste-o: s/16/12/
+(define_insn "*low<mode>"
+ [(set (match_operand:P 0 "register_operand" "=r")
+ (lo_sum:P (match_operand:P 1 "register_operand" "r")
+ (match_operand:P 2 "symbolic_operand" "")))]
+ ""
+ "add\t%0,%1,%R2"
Really add, not addi?
+;; Unlike most other insns, the move insns can't be split with '
+;; different predicates, because register spilling and other parts of
+;; the compiler, have memoized the insn number already.
This comment is incorrect. In general you can't split non-moves this way
either.
+;; HImode constant generation; see riscv_move_integer for details.
+;; si+si->hi without truncation is legal because of TRULY_NOOP_TRUNCATION.
+
+(define_insn "add<mode>hi3"
+ [(set (match_operand:HI 0 "register_operand" "=r,r")
+ (plus:HI (match_operand:HISI 1 "register_operand" "r,r")
+ (match_operand:HISI 2 "arith_operand" "r,Q")))]
+ ""
+ { return TARGET_64BIT ? "addw\t%0,%1,%2" : "add\t%0,%1,%2"; }
+ [(set_attr "type" "arith")
+ (set_attr "mode" "HI")])
+
+(define_insn "xor<mode>hi3"
+ [(set (match_operand:HI 0 "register_operand" "=r,r")
+ (xor:HI (match_operand:HISI 1 "register_operand" "r,r")
+ (match_operand:HISI 2 "arith_operand" "r,Q")))]
+ ""
+ "xor\t%0,%1,%2"
+ [(set_attr "type" "logical")
+ (set_attr "mode" "HI")])
I see from the comment you want to use these for HImode constant generation.
However, I do wonder if it wouldn't be better to name these such that they
don't get used during normal rtl expansion.
Why do you support HI and QImode moves into FP registers?
+;; Expand in-line code to clear the instruction cache between operand[0] and
+;; operand[1].
+(define_expand "clear_cache"
+ [(match_operand 0 "pmode_register_operand")
+ (match_operand 1 "pmode_register_operand")]
+ ""
+ "
+{
+ emit_insn (gen_fence_i ());
+ DONE;
+}")
Do you need a FENCE before the FENCE.I?
+(define_insn "call_value_multiple_internal"
+ [(set (match_operand 0 "register_operand" "")
+ (call (mem:SI (match_operand 1 "call_insn_operand" "l,S"))
+ (match_operand 2 "" "")))
+ (set (match_operand 3 "register_operand" "")
+ (call (mem:SI (match_dup 1))
+ (match_dup 2)))
Any reason for this? Your return value registers are sequential. The normal
thing to do is just use e.g. (reg:TI 10).
+ case MEMMODEL_SEQ_CST:
+ case MEMMODEL_SYNC_SEQ_CST:
+ case MEMMODEL_ACQ_REL:
+ return "fence rw,rw";
+ case MEMMODEL_ACQUIRE:
+ case MEMMODEL_SYNC_ACQUIRE:
+ case MEMMODEL_CONSUME:
+ return "fence r,rw";
+ case MEMMODEL_RELEASE:
+ case MEMMODEL_SYNC_RELEASE:
+ return "fence rw,w";
The memory models say whether any memory operation can cross the barrier, not
what kind of operations might cross. I think you have to use rw,rw always.
r~