Hi! On Tue, Aug 18, 2020 at 02:34:01AM -0400, Michael Meissner wrote: > +// Maximum number of insns to scan between the load address and the load that
Please don't mix /* and // comments. Just stick to /* comments, like all the rest of our backend? > +const int MAX_PCREL_OPT_INSNS = 10; "static const", and not tab please (just a space). > + // LWA is a DS format instruction, but LWZ is a D format instruction. We > use > + // DImode for the mode to force checking whether the bottom 2 bits are 0. How so? Can't you just check if the resulting insn is accepted? That would be so much more robust (and can have a correct description more easily, too!) > + // However FPR and vector registers uses the LFIWAX instruction which is > + // indexed only. (vectors use lxsiwax) > + if (GET_CODE (mem) == SIGN_EXTEND && GET_MODE (XEXP (mem, 0)) == SImode) You're checking here whether the address of the MEM is SImode. > + { > + if (!INT_REGNO_P (reg_regno)) That should use base_reg_operand instead? > + // The optimization will only work on non-prefixed offsettable loads. > + rtx addr = XEXP (mem_inner, 0); > + enum insn_form iform = address_to_insn_form (addr, mem_mode, non_prefixed); > + if (iform != INSN_FORM_BASE_REG > + && iform != INSN_FORM_D > + && iform != INSN_FORM_DS > + && iform != INSN_FORM_DQ) > + return false; Sounds like something for a utility function? Do we use this elsewhere? > + ++pcrel_opt_next_num; Normal style is postfix increment. Empty lines around this are nice, it indicates this is the point where we decided to do the thing. Or add a comment even, maybe? > + unsigned int addr_regno = reg_or_subregno (addr_reg); > + rtx label_num = GEN_INT (pcrel_opt_next_num); > + rtx reg_di = gen_rtx_REG (DImode, reg_regno); > + > + PATTERN (addr_insn) > + = ((addr_regno != reg_regno) > + ? gen_pcrel_opt_ld_addr (addr_reg, addr_symbol, label_num, reg_di) > + : gen_pcrel_opt_ld_addr_same_reg (addr_reg, addr_symbol, label_num)); Please use if() instead of multi-line expressions. The outer parens are unnecessary, too. > + // Revalidate the insn, backing out of the optimization if the insn is not > + // supported. So the count will be incorrect then? > + INSN_CODE (addr_insn) = recog (PATTERN (addr_insn), addr_insn, 0); > + if (INSN_CODE (addr_insn) < 0) > + { > + PATTERN (addr_insn) = addr_set; > + INSN_CODE (addr_insn) = recog (PATTERN (addr_insn), addr_insn, 0); > + return false; > + } Can you use the normal undo mechanisms, instead? apply_change_group etc. > +static rtx_insn * > +next_active_insn_in_basic_block (rtx_insn *insn) > +{ > + insn = NEXT_INSN (insn); > + > + while (insn != NULL_RTX) > + { > + /* If the basic block ends or there is a jump of some kind, exit the > + loop. */ > + if (CALL_P (insn) > + || JUMP_P (insn) > + || JUMP_TABLE_DATA_P (insn) > + || LABEL_P (insn) > + || BARRIER_P (insn)) > + return NULL; > + > + /* If this is a real insn, return it. */ > + if (!insn->deleted () > + && NONJUMP_INSN_P (insn) > + && GET_CODE (PATTERN (insn)) != USE > + && GET_CODE (PATTERN (insn)) != CLOBBER) > + return insn; > + > + /* Loop for USE, CLOBBER, DEBUG_INSN, NOTEs. */ > + insn = NEXT_INSN (insn); > + } > + > + return NULL; > +} There are things to do this. Please don't write code manually parsing RTL unless you have to. > +// Validate that a load is actually a single instruction that can be > optimized > +// with the PCREL_OPT optimization. > + > +static bool > +is_single_instruction (rtx_insn *insn, rtx reg) Of course it is a single instruction! A single RTL instruction... Clarify as "single machine instruction"? > +{ > + if (!REG_P (reg) && !SUBREG_P (reg)) > + return false; You need to check if is a subreg of reg, then. There are subregs of other things. Not that you should see those here, but still. > + // _Decimal128 and IBM extended double are always multiple instructions. > + machine_mode mode = GET_MODE (reg); > + if (mode == TFmode && !TARGET_IEEEQUAD) > + return false; > + > + if (mode == TDmode || mode == IFmode) > + return false; Don't we have a helper for this? The ibm128 part at least. Maybe we should just have an attribute on the insns that *can* get this optimisation? > + return (base_reg_operand (XEXP (addr, 0), Pmode) > + && satisfies_constraint_I (XEXP (addr, 1))); short_cint_operand. But maybe just rs6000_legitimate_offset_address_p? > /* Flags that need to be turned off if -mno-power10. */ > #define OTHER_POWER10_MASKS (OPTION_MASK_MMA \ > | OPTION_MASK_PCREL \ > + | OPTION_MASK_PCREL_OPT \ > | OPTION_MASK_PREFIXED) This isn't a CPU flag. Instead, it is just an optimisation option. > @@ -8515,7 +8522,10 @@ rs6000_delegitimize_address (rtx orig_x) > { > rtx x, y, offset; > > - if (GET_CODE (orig_x) == UNSPEC && XINT (orig_x, 1) == UNSPEC_FUSION_GPR) > + if (GET_CODE (orig_x) == UNSPEC > + && (XINT (orig_x, 1) == UNSPEC_FUSION_GPR > + || XINT (orig_x, 1) == UNSPEC_PCREL_OPT_LD_ADDR > + || XINT (orig_x, 1) == UNSPEC_PCREL_OPT_LD_ADDR_SAME_REG)) > orig_x = XVECEXP (orig_x, 0, 0); > > orig_x = delegitimize_mem_from_attrs (orig_x); Why this? It needs an explanation (in the code). > @@ -13197,6 +13207,19 @@ print_operand (FILE *file, rtx x, int code) > fprintf (file, "%d", 128 >> (REGNO (x) - CR0_REGNO)); > return; > > + case 'r': > + /* X is a label number for the PCREL_OPT optimization. Emit the .reloc > + to enable this optimization, unless the value is 0. */ > + gcc_assert (CONST_INT_P (x)); > + if (UINTVAL (x) != 0) > + { > + unsigned int label_num = UINTVAL (x); > + fprintf (file, > + ".reloc .Lpcrel%u-8,R_PPC64_PCREL_OPT,.-(.Lpcrel%u-8)\n\t", > + label_num, label_num); > + } > + return; Don't eat output modifiers please. We have only so few left, and we cannot recycle any more without pain. I don't see why we cannot just do this in the normal output (C) code of the one or few insns that want this? > ;; The ISA we implement. > -(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10" > +(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10,pcrel_opt" > (const_string "any")) No. Please read the heading. Segher