On Fri, Jun 9, 2023 at 3:57 AM Jeff Law <jeffreya...@gmail.com> wrote: > > > > On 5/25/23 06:35, Manolis Tsamis wrote: > > Implementation of the new RISC-V optimization pass for memory offset > > calculations, documentation and testcases. > > > > gcc/ChangeLog: > > > > * config.gcc: Add riscv-fold-mem-offsets.o to extra_objs. > > * config/riscv/riscv-passes.def (INSERT_PASS_AFTER): Schedule a new > > pass. > > * config/riscv/riscv-protos.h (make_pass_fold_mem_offsets): Declare. > > * config/riscv/riscv.opt: New options. > > * config/riscv/t-riscv: New build rule. > > * doc/invoke.texi: Document new option. > > * config/riscv/riscv-fold-mem-offsets.cc: New file. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/riscv/fold-mem-offsets-1.c: New test. > > * gcc.target/riscv/fold-mem-offsets-2.c: New test. > > * gcc.target/riscv/fold-mem-offsets-3.c: New test. > So a followup. > > While I think we probably could create a variety of backend patterns, > perhaps disallow the frame pointer as the addend argument to a shadd > pattern and the like and capture the key cases from mcf and probably > deepsjeng it's probably not the best direction. > > What I suspect would ultimately happen is we'd be presented with > additional cases over time that would require an ever increasing number > of patterns. sign vs zero extension, increasing depth of search space > to find reassociation opportunities, different variants with and without > shadd/zbb, etc etc. > > So with that space explored a bit the next big question is target > specific or generic. I'll poke in there a it over the coming days. In > the mean time I do have some questions/comments on the code itself. > There may be more over time.. > > > > > +static rtx_insn* > > +get_single_def_in_bb (rtx_insn *insn, rtx reg) > [ ... ] > > > > + for (ref_link = ref_chain; ref_link; ref_link = ref_link->next) > > + { > > + /* Problem getting some definition for this instruction. */ > > + if (ref_link->ref == NULL) > > + return NULL; > > + if (DF_REF_INSN_INFO (ref_link->ref) == NULL) > > + return NULL; > > + if (global_regs[REGNO (reg)] > > + && !set_of (reg, DF_REF_INSN (ref_link->ref))) > > + return NULL; > > + } > That last condition feels a bit odd. It would seem that you wanted an > OR boolean rather than AND. >
Most of this function I didn't write by myself, I used existing code to get definitions taken from REE's get_defs. In the original there's a comment about this line this comment that explains it: As global regs are assumed to be defined at each function call dataflow can report a call_insn as being a definition of REG. But we can't do anything with that in this pass so proceed only if the instruction really sets REG in a way that can be deduced from the RTL structure. This function is the only one I copied without changing much (because I didn't quite understand it), so I don't know if that condition is any useful for f-m-o. Also the code duplication here is a bit unfortunate, maybe it would be preferred to create a generic version that can be used in both? > > > + > > + unsigned int dest_regno = REGNO (dest); > > + > > + /* We don't want to fold offsets from instructions that change some > > + particular registers with potentially global side effects. */ > > + if (!GP_REG_P (dest_regno) > > + || dest_regno == STACK_POINTER_REGNUM > > + || (frame_pointer_needed && dest_regno == HARD_FRAME_POINTER_REGNUM) > > + || dest_regno == GP_REGNUM > > + || dest_regno == THREAD_POINTER_REGNUM > > + || dest_regno == RETURN_ADDR_REGNUM) > > + return 0; > I'd think most of this would be captured by testing fixed_registers > rather than trying to list each register individually. In fact, if we > need to generalize this to work on other targets we almost certainly > want a more general test. > Thanks, I knew there would be some proper way to test this but wasn't aware which is the correct one. Should this look like below? Or is the GP_REG_P redundant and just fixed_regs will do? if (!GP_REG_P (dest_regno) || fixed_regs[dest_regno]) return 0; > > > + else if (( > > + GET_CODE (src) == SIGN_EXTEND > > + || GET_CODE (src) == ZERO_EXTEND > > + ) > > + && MEM_P (XEXP (src, 0))) > Formatting is goofy above... > Noted. > > > > + > > + if (dump_file) > > + { > > + fprintf (dump_file, "Instruction deleted from folding:"); > > + print_rtl_single (dump_file, insn); > > + } > > + > > + if (REGNO (dest) != REGNO (arg1)) > > + { > > + /* If the dest register is different than the fisrt argument > > + then the addition with constant 0 is equivalent to a move > > + instruction. We emit the move and let the subsequent > > + pass cprop_hardreg eliminate that if possible. */ > > + rtx arg1_reg_rtx = gen_rtx_REG (GET_MODE (dest), REGNO (arg1)); > > + rtx mov_rtx = gen_move_insn (dest, arg1_reg_rtx); > > + df_insn_rescan (emit_insn_after (mov_rtx, insn)); > > + } > > + > > + /* If the dest register is the same with the first argument > > + then the addition with constant 0 is a no-op. > > + We can now delete the original add immidiate instruction. */ > > + delete_insn (insn); > The debugging message is a bit misleading. Yea, we always delete > something here, but in one case we end up emitting a copy. > Indeed. Maybe "Instruction reduced to move: ..."? > > > > + > > + /* Temporarily change the offset in MEM to test whether > > + it results in a valid instruction. */ > > + machine_mode mode = GET_MODE (mem_addr); > > + XEXP (mem, 0) = gen_rtx_PLUS (mode, reg, GEN_INT (offset)); > > + > > + bool valid_change = recog (PATTERN (insn), insn, 0) >= 0; > > + > > + /* Restore the instruction. */ > > + XEXP (mem, 0) = mem_addr; > You need to reset the INSN_CODE after restoring the instruction. That's > generally a bad thing to do, but I've seen it done enough (and been > guilty myself in the past) that we should just assume some ports are > broken in this regard. > Ok thanks, I didn't knew that one. > > Anyway, just wanted to get those issues raised so that you can address them. > > jeff