On Thu, Aug 20, 2020 at 09:09:40PM -0500, Segher Boessenkool wrote: > Hi! > > On Tue, Aug 18, 2020 at 02:34:01AM -0400, Michael Meissner wrote: > > +// Maximum number of insns to scan between the load address and the load > > that > > Please don't mix /* and // comments. Just stick to /* comments, like > all the rest of our backend? > > > +const int MAX_PCREL_OPT_INSNS = 10; > > "static const", and not tab please (just a space).
With the switch to using flow control, this will go away. > > + // LWA is a DS format instruction, but LWZ is a D format instruction. > > We use > > + // DImode for the mode to force checking whether the bottom 2 bits are 0. > > How so? Can't you just check if the resulting insn is accepted? That > would be so much more robust (and can have a correct description more > easily, too!) Unfortunately no. The prefixed form will be accepted, and the whole PCREL_OPT optimization requires a non-prefixed instruction. > > + // However FPR and vector registers uses the LFIWAX instruction which is > > + // indexed only. > > (vectors use lxsiwax) Yes. > > + if (GET_CODE (mem) == SIGN_EXTEND && GET_MODE (XEXP (mem, 0)) == SImode) > > You're checking here whether the address of the MEM is SImode. Whoops. > > + { > > + if (!INT_REGNO_P (reg_regno)) > > That should use base_reg_operand instead? > > > + // The optimization will only work on non-prefixed offsettable loads. > > + rtx addr = XEXP (mem_inner, 0); > > + enum insn_form iform = address_to_insn_form (addr, mem_mode, > > non_prefixed); > > + if (iform != INSN_FORM_BASE_REG > > + && iform != INSN_FORM_D > > + && iform != INSN_FORM_DS > > + && iform != INSN_FORM_DQ) > > + return false; > > Sounds like something for a utility function? Do we use this elsewhere? I will make a utility function. But no, we currently do not use this anywhere else. It might be useful for fusion. > > + ++pcrel_opt_next_num; > > Normal style is postfix increment. Empty lines around this are nice, it > indicates this is the point where we decided to do the thing. Or add a > comment even, maybe? > > > + unsigned int addr_regno = reg_or_subregno (addr_reg); > > + rtx label_num = GEN_INT (pcrel_opt_next_num); > > + rtx reg_di = gen_rtx_REG (DImode, reg_regno); > > + > > + PATTERN (addr_insn) > > + = ((addr_regno != reg_regno) > > + ? gen_pcrel_opt_ld_addr (addr_reg, addr_symbol, label_num, reg_di) > > + : gen_pcrel_opt_ld_addr_same_reg (addr_reg, addr_symbol, > > label_num)); > > Please use if() instead of multi-line expressions. The outer parens are > unnecessary, too. Ok. However, the outer parens make it easier to format in emacs. > > + // Revalidate the insn, backing out of the optimization if the insn is > > not > > + // supported. > > So the count will be incorrect then? No, the count is correct, since we don't count the PCREL_OPTs until everything is validated. > > > + INSN_CODE (addr_insn) = recog (PATTERN (addr_insn), addr_insn, 0); > > + if (INSN_CODE (addr_insn) < 0) > > + { > > + PATTERN (addr_insn) = addr_set; > > + INSN_CODE (addr_insn) = recog (PATTERN (addr_insn), addr_insn, 0); > > + return false; > > + } > > Can you use the normal undo mechanisms, instead? apply_change_group > etc. I will think about it. > > +static rtx_insn * > > +next_active_insn_in_basic_block (rtx_insn *insn) > > +{ > > + insn = NEXT_INSN (insn); > > + > > + while (insn != NULL_RTX) > > + { > > + /* If the basic block ends or there is a jump of some kind, exit the > > + loop. */ > > + if (CALL_P (insn) > > + || JUMP_P (insn) > > + || JUMP_TABLE_DATA_P (insn) > > + || LABEL_P (insn) > > + || BARRIER_P (insn)) > > + return NULL; > > + > > + /* If this is a real insn, return it. */ > > + if (!insn->deleted () > > + && NONJUMP_INSN_P (insn) > > + && GET_CODE (PATTERN (insn)) != USE > > + && GET_CODE (PATTERN (insn)) != CLOBBER) > > + return insn; > > + > > + /* Loop for USE, CLOBBER, DEBUG_INSN, NOTEs. */ > > + insn = NEXT_INSN (insn); > > + } > > + > > + return NULL; > > +} > > There are things to do this. Please don't write code manually parsing > RTL unless you have to. I will switch to using the DF flow in the next set of patches. > > +// Validate that a load is actually a single instruction that can be > > optimized > > +// with the PCREL_OPT optimization. > > + > > +static bool > > +is_single_instruction (rtx_insn *insn, rtx reg) > > Of course it is a single instruction! A single RTL instruction... > Clarify as "single machine instruction"? My experience is some of the insns lie in terms of things like size. The size is set when it is split, but this pass may/may not run before those instructions are split. But I will change it to just testing if the length is 4, and hope for the best. > > +{ > > + if (!REG_P (reg) && !SUBREG_P (reg)) > > + return false; > > You need to check if is a subreg of reg, then. There are subregs of > other things. Not that you should see those here, but still. Good point, but in theory outside of register allocation, you should never see it. > > > + // _Decimal128 and IBM extended double are always multiple instructions. > > + machine_mode mode = GET_MODE (reg); > > + if (mode == TFmode && !TARGET_IEEEQUAD) > > + return false; > > + > > + if (mode == TDmode || mode == IFmode) > > + return false; > > Don't we have a helper for this? The ibm128 part at least. Yes, but it doesn't cover TD. > Maybe we should just have an attribute on the insns that *can* get this > optimisation? > > > + return (base_reg_operand (XEXP (addr, 0), Pmode) > > + && satisfies_constraint_I (XEXP (addr, 1))); > > short_cint_operand. But maybe just rs6000_legitimate_offset_address_p? > > > /* Flags that need to be turned off if -mno-power10. */ > > #define OTHER_POWER10_MASKS (OPTION_MASK_MMA > > \ > > | OPTION_MASK_PCREL \ > > + | OPTION_MASK_PCREL_OPT \ > > | OPTION_MASK_PREFIXED) > > This isn't a CPU flag. Instead, it is just an optimisation option. But you want it turned off if somebody says __attribute__((target("cpu=power8"))), which is what OTHER_POWER10_MASKS does. > > @@ -8515,7 +8522,10 @@ rs6000_delegitimize_address (rtx orig_x) > > { > > rtx x, y, offset; > > > > - if (GET_CODE (orig_x) == UNSPEC && XINT (orig_x, 1) == UNSPEC_FUSION_GPR) > > + if (GET_CODE (orig_x) == UNSPEC > > + && (XINT (orig_x, 1) == UNSPEC_FUSION_GPR > > + || XINT (orig_x, 1) == UNSPEC_PCREL_OPT_LD_ADDR > > + || XINT (orig_x, 1) == UNSPEC_PCREL_OPT_LD_ADDR_SAME_REG)) > > orig_x = XVECEXP (orig_x, 0, 0); > > > > orig_x = delegitimize_mem_from_attrs (orig_x); > > Why this? It needs an explanation (in the code). It is needed in some cases to get back the original SYMBOL_REF (particularly for debugging). > > @@ -13197,6 +13207,19 @@ print_operand (FILE *file, rtx x, int code) > > fprintf (file, "%d", 128 >> (REGNO (x) - CR0_REGNO)); > > return; > > > > + case 'r': > > + /* X is a label number for the PCREL_OPT optimization. Emit the > > .reloc > > + to enable this optimization, unless the value is 0. */ > > + gcc_assert (CONST_INT_P (x)); > > + if (UINTVAL (x) != 0) > > + { > > + unsigned int label_num = UINTVAL (x); > > + fprintf (file, > > + ".reloc .Lpcrel%u-8,R_PPC64_PCREL_OPT,.-(.Lpcrel%u-8)\n\t", > > + label_num, label_num); > > + } > > + return; > > Don't eat output modifiers please. We have only so few left, and we > cannot recycle any more without pain. > > I don't see why we cannot just do this in the normal output (C) code of > the one or few insns that want this? Well it is quite a few insns in pcrel-opt.md. I will look at changing it. Part of it is historical reasons in that it was needed in the old version of the patches that could be turned off. > > ;; The ISA we implement. > > -(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10" > > +(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10,pcrel_opt" > > (const_string "any")) > > No. Please read the heading. Ok, I will just use p10. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797