> -----Original Message----- > From: Matthew Fortune [mailto:matthew.fort...@imgtec.com] > Sent: Monday, April 20, 2015 3:10 PM > To: Sameera Deshpande; Moore, Catherine > Cc: Richard Sandiford; gcc-patches@gcc.gnu.org; echri...@gmail.com > Subject: RE: [PATCH][MIPS] Enable load-load/store-store bonding > > Sameera Deshpande <sameera.deshpa...@imgtec.com> writes: > > Gentle reminder! > > Thanks Sameera. Just a couple of comments inline below and a question for > Catherine at the end. > > > - Thanks and regards, > > Sameera D. > > > > On Monday 30 March 2015 04:58 PM, sameera wrote: > > > Hi! > > > > > > Sorry for delay in sending this patch for review. > > > Please find attached updated patch. > > > > > > In P5600, 2 consecutive loads/stores of same type which access > > > contiguous memory locations are bonded together by instruction issue > > > unit to dispatch single load/store instruction which accesses both > > locations. This allows 2X improvement in memory intensive code. This > > optimization can be performed for LH, SH, LW, SW, LWC, SWC, LDC, SDC > > instructions. > > > > > > This patch adds peephole2 patterns to identify such loads/stores, > > > and put them in parallel, so that the scheduler will not split it - > > thereby guaranteeing h/w level load/store bonding. > > > > > > The patch is tested with dejagnu for correctness, and tested on > > hardware for performance. > > > Ok for trunk? > > > > > > Changelog: > > > gcc/ > > > * config/mips/mips.md (JOIN_MODE): New mode iterator. > > > (join2_load_Store<JOIN_MODE:mode>): New pattern. > > > (join2_loadhi): Likewise. > > > (define_peehole2): Add peephole2 patterns to join 2 > > > HI/SI/SF/DF- > > mode > > > load-load and store-stores. > > > * config/mips/mips.opt (mload-store-pairs): New option. > > > (TARGET_LOAD_STORE_PAIRS): New macro. > > > *config/mips/mips.h (ENABLE_LD_ST_PAIRS): Likewise. > > > *config/mips/mips-protos.h (mips_load_store_bonding_p): New > > prototype. > > > *config/mips/mips.c(mips_load_store_bonding_p): New function. > > I don't know if this has been corrupted by mail clients but a single space > after > '*' and a space before '('. > > >diff --git a/gcc/config/mips/mips-protos.h > >b/gcc/config/mips/mips-protos.h index b48e04f..244eb8d 100644 > >--- a/gcc/config/mips/mips-protos.h > >+++ b/gcc/config/mips/mips-protos.h > >@@ -360,6 +360,7 @@ extern bool mips_epilogue_uses (unsigned int); > >extern void mips_final_prescan_insn (rtx_insn *, rtx *, int); extern > >int mips_trampoline_code_size (void); extern void > >mips_function_profiler (FILE *); > >+extern bool mips_load_store_bonding_p (rtx *, machine_mode, bool); > > > > typedef rtx (*mulsidi3_gen_fn) (rtx, rtx, rtx); #ifdef RTX_CODE diff > >--git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index > >1733457..85f0591 100644 > >--- a/gcc/config/mips/mips.c > >+++ b/gcc/config/mips/mips.c > >@@ -18241,6 +18241,64 @@ umips_load_store_pair_p_1 (bool load_p, > bool swap_p, > > return true; > > } > > > >+bool > >+mips_load_store_bonding_p (rtx *operands, enum machine_mode > mode, bool > >+load_p) > > Remove enum from machine_mode. > > >+{ > >+ rtx reg1, reg2, mem1, mem2, base1, base2; > >+ enum reg_class rc1, rc2; > >+ HOST_WIDE_INT offset1, offset2; > >+ > >+ if (load_p) > >+ { > >+ reg1 = operands[0]; > >+ reg2 = operands[2]; > >+ mem1 = operands[1]; > >+ mem2 = operands[3]; > >+ } > >+ else > >+ { > >+ reg1 = operands[1]; > >+ reg2 = operands[3]; > >+ mem1 = operands[0]; > >+ mem2 = operands[2]; > >+ } > >+ > >+ if (mips_address_insns (XEXP (mem1, 0), mode, false) == 0 > >+ || mips_address_insns (XEXP (mem2, 0), mode, false) == 0) > >+ return false; > >+ > >+ mips_split_plus (XEXP (mem1, 0), &base1, &offset1); mips_split_plus > >+ (XEXP (mem2, 0), &base2, &offset2); > >+ > >+ /* Base regs do not match. */ > >+ if (!REG_P (base1) || !rtx_equal_p (base1, base2)) > >+ return false; > >+ > >+ /* Either of the loads is clobbering base register. */ > >+ if (load_p > >+ && (REGNO (reg1) == REGNO (base1) > >+ || (REGNO (reg2) == REGNO (base1)))) > >+ return false; > > Can you add a comment saying that this case does not get bonded by any > known hardware even though it could be valid to bond them if it is the > second load that clobbers the base. > > >+ /* Loading in same registers. */ > >+ if (load_p > >+ && REGNO (reg1) == REGNO (reg2)) > >+ return false; > >+ > >+ /* The loads/stores are not of same type. */ > >+ rc1 = REGNO_REG_CLASS (REGNO (reg1)); > >+ rc2 = REGNO_REG_CLASS (REGNO (reg2)); if (rc1 != rc2 > >+ && !reg_class_subset_p (rc1, rc2) > >+ && !reg_class_subset_p (rc2, rc1)) > >+ return false; > >+ > >+ if (abs (offset1 - offset2) != GET_MODE_SIZE (mode)) > >+ return false; > >+ > >+ return true; > >+} > >+ > > /* OPERANDS describes the operands to a pair of SETs, in the order > > dest1, src1, dest2, src2. Return true if the operands can be used > > in an LWP or SWP instruction; LOAD_P says which. */ diff --git > >a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h index > >ec69ed5..1bd0dae 100644 > >--- a/gcc/config/mips/mips.h > >+++ b/gcc/config/mips/mips.h > >@@ -3147,3 +3147,7 @@ extern GTY(()) struct target_globals > >*mips16_globals; #define STANDARD_STARTFILE_PREFIX_1 "/lib64/" > > #define STANDARD_STARTFILE_PREFIX_2 "/usr/lib64/" > > #endif > >+ > >+#define ENABLE_LD_ST_PAIRS \ > >+ (TARGET_LOAD_STORE_PAIRS && TUNE_P5600 \ > >+ && !TARGET_MICROMIPS && !TARGET_FIX_24K) > > I've already forgotten why these extra micromips/fix24k conditions were > required. Can you add a comment? > > >diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md index > >3672c0b..cfdb750 100644 > >--- a/gcc/config/mips/mips.md > >+++ b/gcc/config/mips/mips.md > >@@ -754,6 +754,11 @@ > > > > (define_mode_iterator MOVEP1 [SI SF]) > > (define_mode_iterator MOVEP2 [SI SF]) > >+(define_mode_iterator JOIN_MODE [HI > >+ SI > >+ (SF "TARGET_HARD_FLOAT") > >+ (DF "TARGET_HARD_FLOAT > >+ && TARGET_DOUBLE_FLOAT")]) > > > > ;; This mode iterator allows :HILO to be used as the mode of the ;; > >concatenated HI and LO registers. > >@@ -7419,6 +7424,108 @@ > > { return MIPS_CALL ("jal", operands, 0, -1); } > > [(set_attr "type" "call") > > (set_attr "insn_count" "3")]) > >+ > >+(define_insn "*join2_load_store<JOIN_MODE:mode>" > >+ [(set (match_operand:JOIN_MODE 0 "nonimmediate_operand" > "=d,f,m,m") > >+ (match_operand:JOIN_MODE 1 "nonimmediate_operand" > "m,m,d,f")) > >+ (set (match_operand:JOIN_MODE 2 "nonimmediate_operand" > "=d,f,m,m") > >+ (match_operand:JOIN_MODE 3 "nonimmediate_operand" > "m,m,d,f"))] > >+ "ENABLE_LD_ST_PAIRS && reload_completed" > >+ { > >+ bool load_p = (which_alternative == 0 || which_alternative == 1); > >+ /* Reg-renaming pass reuses base register if it is dead after bonded > loads. > >+ Hardware does not bond those loads, even when they are > consecutive. > >+ However, order of the loads need to be checked for correctness. */ > >+ if (!load_p || !reg_overlap_mentioned_p (operands[0], operands[1])) > >+ { > >+ output_asm_insn (mips_output_move (operands[0], operands[1]), > >+ operands); > >+ output_asm_insn (mips_output_move (operands[2], operands[3]), > >+ &operands[2]); > >+ } > >+ else > >+ { > >+ output_asm_insn (mips_output_move (operands[2], operands[3]), > >+ &operands[2]); > >+ output_asm_insn (mips_output_move (operands[0], operands[1]), > >+ operands); > >+ } > >+ return ""; > >+ } > >+ [(set_attr "move_type" "load,fpload,store,fpstore") > >+ (set_attr "insn_count" "2,2,2,2")]) > >+ > >+;; 2 HI/SI/SF/DF loads are joined. > >+;; P5600 does not support bonding of two LBs, hence QI mode is not > included. > >+(define_peephole2 > >+ [(set (match_operand:JOIN_MODE 0 "register_operand") > >+ (match_operand:JOIN_MODE 1 "non_volatile_mem_operand")) > >+ (set (match_operand:JOIN_MODE 2 "register_operand") > >+ (match_operand:JOIN_MODE 3 "non_volatile_mem_operand"))] > > Please can you comment that the loads must be non-volatile as they may get > re-ordered. > > >+ "ENABLE_LD_ST_PAIRS && > > && on the next line > > >+ mips_load_store_bonding_p (operands, <JOIN_MODE:MODE>mode, > true)" > >+ [(parallel [(set (match_dup 0) > >+ (match_dup 1)) > >+ (set (match_dup 2) > >+ (match_dup 3))])] > >+ "") > >+ > >+;; 2 HI/SI/SF/DF stores are joined. > >+;; P5600 does not support bonding of two SBs, hence QI mode is not > included. > >+(define_peephole2 > >+ [(set (match_operand:JOIN_MODE 0 "memory_operand") > >+ (match_operand:JOIN_MODE 1 "register_operand")) > >+ (set (match_operand:JOIN_MODE 2 "memory_operand") > >+ (match_operand:JOIN_MODE 3 "register_operand"))] > >+ "ENABLE_LD_ST_PAIRS && > > && on the next line > > >+ mips_load_store_bonding_p (operands, <JOIN_MODE:MODE>mode, > false)" > >+ [(parallel [(set (match_dup 0) > >+ (match_dup 1)) > >+ (set (match_dup 2) > >+ (match_dup 3))])] > >+ "") > >+ > >+(define_insn "*join2_loadhi" > >+ [(set (match_operand:SI 0 "register_operand" "=r") > >+ (any_extend:SI (match_operand:HI 1 "non_volatile_mem_operand" > "m"))) > >+ (set (match_operand:SI 2 "register_operand" "=r") > >+ (any_extend:SI (match_operand:HI 3 "non_volatile_mem_operand" > "m")))] > >+ "ENABLE_LD_ST_PAIRS && reload_completed" > >+ { > >+ /* Reg-renaming pass reuses base register if it is dead after bonded > loads. > >+ Hardware does not bond those loads, even when they are > consecutive. > >+ However, order of the loads need to be checked for correctness. */ > >+ if (!reg_overlap_mentioned_p (operands[0], operands[1])) > >+ { > >+ output_asm_insn ("lh<u>\t%0,%1", operands); > >+ output_asm_insn ("lh<u>\t%2,%3", operands); > >+ } > >+ else > >+ { > >+ output_asm_insn ("lh<u>\t%2,%3", operands); > >+ output_asm_insn ("lh<u>\t%0,%1", operands); > >+ } > >+ > >+ return ""; > >+ } > >+ [(set_attr "move_type" "load") > >+ (set_attr "insn_count" "2")]) > >+ > >+ > >+;; 2 16 bit integer loads are joined. > > 2 HI mode loads > > >+(define_peephole2 > >+ [(set (match_operand:SI 0 "register_operand") > >+ (any_extend:SI (match_operand:HI 1 > "non_volatile_mem_operand"))) > >+ (set (match_operand:SI 2 "register_operand") > >+ (any_extend:SI (match_operand:HI 3 > "non_volatile_mem_operand")))] > >+ "ENABLE_LD_ST_PAIRS && > >+ mips_load_store_bonding_p (operands, HImode, true)" > >+ [(parallel [(set (match_dup 0) > >+ (any_extend:SI (match_dup 1))) > >+ (set (match_dup 2) > >+ (any_extend:SI (match_dup 3)))])] > >+ "") > >+ > > > > ;; Synchronization instructions. > > > >diff --git a/gcc/config/mips/mips.opt b/gcc/config/mips/mips.opt index > >9e89aa9..a9baebe 100644 > >--- a/gcc/config/mips/mips.opt > >+++ b/gcc/config/mips/mips.opt > >@@ -418,3 +418,7 @@ Enable use of odd-numbered single-precision > >registers > > > > noasmopt > > Driver > >+ > >+mload-store-pairs > >+Target Report Var(TARGET_LOAD_STORE_PAIRS) Init(1) Enable load/store > >+bonding. > > Catherine: We have this option in place just as a get-out clause if there are > any side effects that have been missed in this patch such that you can still > tune for p5600 but with bonding disabled. Do you think this is OK? I'm not > completely against this being either undocumented or removed entirely.
If we allow the patch to go in, then it needs to be documented. I don't have a strong opinion either way. If you find it helpful to allow the option, then I'm OK with that. Sameer, will please add a test case to the patch? Thanks, Catherine > > Sameera: Assuming we keep it then it needs adding to the invoke doc. > > Thanks, > Matthew