Re: [PATCH 2/6] RISC-V Port: gcc

Richard Henderson Fri, 20 Jan 2017 22:42:25 -0800

On 01/11/2017 06:30 PM, Palmer Dabbelt wrote:

+(define_register_constraint "f" "TARGET_HARD_FLOAT ? FP_REGS : NO_REGS"
+  "A floating-point register (if available).")
+

I know this is the Traditional Way, but I do wonder if using the new enableattribute on the alternatives would be better. No need to change change; justsomething that makes me wonder.

+(define_constraint "Q"
+  "@internal"
+  (match_operand 0 "const_arith_operand"))


How does this differ from "I"?

+
+(define_memory_constraint "W"
+  "@internal
+   A memory address based on a member of @code{BASE_REG_CLASS}."
+  (and (match_code "mem")
+       (match_operand 0 "memory_operand")))


How does this differ (materially) from "m"?  Or from just

  (match_operand 0 "memory_operand")

for that matter?

+;;........................
+;; DI -> SI optimizations
+;;........................
+
+;; Simplify (int)(a + 1), etc.
+(define_peephole2
+  [(set (match_operand:DI 0 "register_operand")
+       (match_operator:DI 4 "modular_operator"
+         [(match_operand:DI 1 "register_operand")
+          (match_operand:DI 2 "arith_operand")]))
+   (set (match_operand:SI 3 "register_operand")
+       (truncate:SI (match_dup 0)))]
+  "TARGET_64BIT && (REGNO (operands[0]) == REGNO (operands[3]) || 
peep2_reg_dead_p (2, operands[0]))
+   && (GET_CODE (operands[4]) != ASHIFT || (CONST_INT_P (operands[2]) && INTVAL 
(operands[2]) < 32))"
+  [(set (match_dup 3)
+         (truncate:SI
+            (match_op_dup:DI 4
+              [(match_operand:DI 1 "register_operand")
+               (match_operand:DI 2 "arith_operand")])))])


I take from this that combine + ree fail to do the job?

I must say I'm less than thrilled about the use of truncate instead of simplyproperly representing the sign-extensions that are otherwise required by theABI. RISCV is unlike MIPS in that the 32-bit operations (addw et al) do notrequire sign-extended inputs in order to produce a correct output value.

Consider Alpha as a counter-example to MIPS, where the ABI requiressign-extensions at function edges and the ISA requires sign-extensions forcomparisons.


+(define_insn "*local_pic_storesi<mode>"
+  [(set (mem:ANYF (match_operand 0 "absolute_symbolic_operand" ""))
+       (match_operand:ANYF 1 "register_operand" "f"))
+   (clobber (reg:SI T0_REGNUM))]
+  "TARGET_HARD_FLOAT && !TARGET_64BIT && USE_LOAD_ADDRESS_MACRO (operands[0])"
+  "<store>\t%1,%0,t0"
+  [(set (attr "length") (const_int 8))])


Use match_scratch so that you need not hard-code T0.
And likewise for the other pic stores.

+(define_predicate "const_0_operand"
+  (and (match_code "const_int,const_double,const_vector")
+       (match_test "op == CONST0_RTX (GET_MODE (op))")))


Probably want to support const_wide_int.

+  /* Don't handle multi-word moves this way; we don't want to introduce
+     the individual word-mode moves until after reload.  */
+  if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
+    return false;

Why not? Do the subreg passes not do a good job? I would expect it to do wellon a target like RISCV.

+(define_predicate "move_operand"
+  (match_operand 0 "general_operand")


Normally this is handled by TARGET_LEGITIMATE_CONSTANT_P.

+/* Target CPU builtins.  */
+#define TARGET_CPU_CPP_BUILTINS()                                      \
+  do                                                                   \
+    {                                                                  \
+      builtin_define ("__riscv");                                    \


Perhaps better to move this to riscv-c.c?

+      if (TARGET_MUL)                                                  \
+       builtin_define ("__riscv_mul");                                       \
+      if (TARGET_DIV)                                                  \
+       builtin_define ("__riscv_div");                                       \
+      if (TARGET_DIV && TARGET_MUL)                                    \
+       builtin_define ("__riscv_muldiv");                            \


Out of curiosity, why are these split when the M extension doesn't do so?

+/* Define this macro if it is advisable to hold scalars in registers
+   in a wider mode than that declared by the program.  In such cases,
+   the value is constrained to be within the bounds of the declared
+   type, but kept valid in the wider mode.  The signedness of the
+   extension may differ from that of the type.  */
+
+#define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE)    \
+  if (GET_MODE_CLASS (MODE) == MODE_INT                \
+      && GET_MODE_SIZE (MODE) < 4)          \
+    {                                          \
+      (MODE) = Pmode;                          \
+    }


I think you want to force unsignedp false for SImode when extending to DImode.

+/* a0-a7, t0-a6, fa0-fa7, and ft0-ft11 are volatile across calls.
+   The call RTLs themselves clobber ra.  */
+
+#define CALL_USED_REGISTERS                                            \
+{ /* General registers.  */                                            \
+  1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1,                      \
+  1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,                      \
+  /* Floating-point registers.  */                                     \
+  1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1,                      \
+  1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,                      \
+  /* Others.  */                                                       \
+  1, 1                                                                 \
+}
+
+#define CALL_REALLY_USED_REGISTERS                                     \
+{ /* General registers.  */                                            \
+  1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1,                      \
+  1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,                      \
+  /* Floating-point registers.  */                                     \
+  1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1,                      \
+  1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,                      \
+  /* Others.  */                                                       \
+  1, 1                                                                 \
+}


You shouldn't have to replicate these if they're the same.

+/* Return coprocessor number from register number.  */
+
+#define COPNUM_AS_CHAR_FROM_REGNUM(REGNO)                              \
+  (COP0_REG_P (REGNO) ? '0' : COP2_REG_P (REGNO) ? '2'                 \
+   : COP3_REG_P (REGNO) ? '3' : '?')
+


MIPS cruft.

+typedef struct {
+  /* Number of integer registers used so far, up to MAX_ARGS_IN_REGISTERS. */
+  unsigned int num_gprs;
+
+  /* Number of floating-point registers used so far, likewise.  */
+  unsigned int num_fprs;
+} CUMULATIVE_ARGS;


Related to something I mentioned for patch 1: why isn't this just a single int?

+(define_c_enum "unspec" [
+  ;; GP manipulation.
+  UNSPEC_EH_RETURN
+
+  ;; Symbolic accesses.  The order of this list must match that of
+  ;; enum riscv_symbol_type in riscv-protos.h.
+  UNSPEC_ADDRESS_FIRST
+  UNSPEC_PCREL
+  UNSPEC_LOAD_GOT
+  UNSPEC_TLS
+  UNSPEC_TLS_LE
+  UNSPEC_TLS_IE
+  UNSPEC_TLS_GD
+
+  UNSPEC_AUIPC
+
+  ;; Register save and restore.
+  UNSPEC_GPR_SAVE
+  UNSPEC_GPR_RESTORE
+
+  ;; Blockage and synchronisation.
+  UNSPEC_BLOCKAGE
+  UNSPEC_FENCE
+  UNSPEC_FENCE_I
+])


Some of these are unspec_volatile, and so should be in define_c_enum "unspecv".

+(define_expand "add<mode>3"
+  [(set (match_operand:GPR 0 "register_operand")
+       (plus:GPR (match_operand:GPR 1 "register_operand")
+                 (match_operand:GPR 2 "arith_operand")))]
+  "")
+
+(define_insn "*addsi3"
+  [(set (match_operand:SI 0 "register_operand" "=r,r")
+       (plus:SI (match_operand:GPR 1 "register_operand" "r,r")
+                 (match_operand:GPR2 2 "arith_operand" "r,Q")))]
+  ""
+  { return TARGET_64BIT ? "addw\t%0,%1,%2" : "add\t%0,%1,%2"; }
+  [(set_attr "type" "arith")
+   (set_attr "mode" "SI")])


Err... mismatched input modes?  That's wrong.

+
+(define_insn "*adddi3"
+  [(set (match_operand:DI 0 "register_operand" "=r,r")
+       (plus:DI (match_operand:DI 1 "register_operand" "r,r")
+                 (match_operand:DI 2 "arith_operand" "r,Q")))]
+  "TARGET_64BIT"
+  "add\t%0,%1,%2"
+  [(set_attr "type" "arith")
+   (set_attr "mode" "DI")])


Remove the addsi3 weirdness and it doesn't appear that you need the 
define_expand.

Likewise for sub and mul.

+(define_expand "<u>mulditi3"
+  [(set (match_operand:TI 0 "register_operand")
+       (mult:TI (any_extend:TI (match_operand:DI 1 "register_operand"))
+                (any_extend:TI (match_operand:DI 2 "register_operand"))))]
+  "TARGET_MUL && TARGET_64BIT"
+{
+  rtx low = gen_reg_rtx (DImode);
+  emit_insn (gen_muldi3 (low, operands[1], operands[2]));
+
+  rtx high = gen_reg_rtx (DImode);
+  emit_insn (gen_<u>muldi3_highpart (high, operands[1], operands[2]));
+
+  emit_move_insn (gen_lowpart (DImode, operands[0]), low);
+  emit_move_insn (gen_highpart (DImode, operands[0]), high);
+  DONE;
+})

FWIW, the manual recommends the sequence MULHU; MUL if you need both the highand low parts. Whether that implies that you keep those together as a unituntil final output, or whether you simply emit them in the other order, onewould need to consult an actual hardware manual.


+(define_insn "floatsidf2"
+  [(set (match_operand:DF 0 "register_operand" "=f")
+       (float:DF (match_operand:SI 1 "reg_or_0_operand" "rJ")))]
+  "TARGET_DOUBLE_FLOAT"
+  "fcvt.d.w\t%0,%z1"
+  [(set_attr "type"  "fcvt")
+   (set_attr "mode"  "DF")
+   (set_attr "cnv_mode"      "I2D")])
+
+
+(define_insn "floatdidf2"
+  [(set (match_operand:DF 0 "register_operand" "=f")
+       (float:DF (match_operand:DI 1 "reg_or_0_operand" "rJ")))]
+  "TARGET_64BIT && TARGET_DOUBLE_FLOAT"
+  "fcvt.d.l\t%0,%z1"
+  [(set_attr "type"  "fcvt")
+   (set_attr "mode"  "DF")
+   (set_attr "cnv_mode"      "I2D")])
+


Merge with GPR?  And likewise for other fp conversions.

+;; Instructions for adding the low 16 bits of an address to a register.


paste-o: s/16/12/

+(define_insn "*low<mode>"
+  [(set (match_operand:P 0 "register_operand" "=r")
+       (lo_sum:P (match_operand:P 1 "register_operand" "r")
+                 (match_operand:P 2 "symbolic_operand" "")))]
+  ""
+  "add\t%0,%1,%R2"


Really add, not addi?

+;; Unlike most other insns, the move insns can't be split with '
+;; different predicates, because register spilling and other parts of
+;; the compiler, have memoized the insn number already.


This comment is incorrect.  In general you can't split non-moves this way 
either.

+;; HImode constant generation; see riscv_move_integer for details.
+;; si+si->hi without truncation is legal because of TRULY_NOOP_TRUNCATION.
+
+(define_insn "add<mode>hi3"
+  [(set (match_operand:HI 0 "register_operand" "=r,r")
+       (plus:HI (match_operand:HISI 1 "register_operand" "r,r")
+                 (match_operand:HISI 2 "arith_operand" "r,Q")))]
+  ""
+  { return TARGET_64BIT ? "addw\t%0,%1,%2" : "add\t%0,%1,%2"; }
+  [(set_attr "type" "arith")
+   (set_attr "mode" "HI")])
+
+(define_insn "xor<mode>hi3"
+  [(set (match_operand:HI 0 "register_operand" "=r,r")
+       (xor:HI (match_operand:HISI 1 "register_operand" "r,r")
+                 (match_operand:HISI 2 "arith_operand" "r,Q")))]
+  ""
+  "xor\t%0,%1,%2"
+  [(set_attr "type" "logical")
+   (set_attr "mode" "HI")])

I see from the comment you want to use these for HImode constant generation.However, I do wonder if it wouldn't be better to name these such that theydon't get used during normal rtl expansion.


Why do you support HI and QImode moves into FP registers?

+;; Expand in-line code to clear the instruction cache between operand[0] and
+;; operand[1].
+(define_expand "clear_cache"
+  [(match_operand 0 "pmode_register_operand")
+   (match_operand 1 "pmode_register_operand")]
+  ""
+  "
+{
+  emit_insn (gen_fence_i ());
+  DONE;
+}")


Do you need a FENCE before the FENCE.I?

+(define_insn "call_value_multiple_internal"
+  [(set (match_operand 0 "register_operand" "")
+       (call (mem:SI (match_operand 1 "call_insn_operand" "l,S"))
+             (match_operand 2 "" "")))
+   (set (match_operand 3 "register_operand" "")
+       (call (mem:SI (match_dup 1))
+             (match_dup 2)))

Any reason for this? Your return value registers are sequential. The normalthing to do is just use e.g. (reg:TI 10).

+    case MEMMODEL_SEQ_CST:
+    case MEMMODEL_SYNC_SEQ_CST:
+    case MEMMODEL_ACQ_REL:
+      return "fence rw,rw";
+    case MEMMODEL_ACQUIRE:
+    case MEMMODEL_SYNC_ACQUIRE:
+    case MEMMODEL_CONSUME:
+      return "fence r,rw";
+    case MEMMODEL_RELEASE:
+    case MEMMODEL_SYNC_RELEASE:
+      return "fence rw,w";

The memory models say whether any memory operation can cross the barrier, notwhat kind of operations might cross. I think you have to use rw,rw always.

r~

Re: [PATCH 2/6] RISC-V Port: gcc

Reply via email to