Hi, Just some comments on the fuseable_load_p part, since that's what we were discussing last time.
It looks like this now relies on: Ajit Agarwal <aagar...@linux.ibm.com> writes: > + /* We use DF data flow because we change location rtx > + which is easier to find and modify. > + We use mix of rtl-ssa def-use and DF data flow > + where it is easier. */ > + df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN); > + df_analyze (); > + df_set_flags (DF_DEFER_INSN_RESCAN); But please don't do this! For one thing, building DU/UD chains as well as rtl-ssa is really expensive in terms of compile time. But more importantly, modifications need to happen via rtl-ssa to ensure that the IL is kept up-to-date. If we don't do that, later fuse attempts will be based on stale data and so could generate incorrect code. > +// Check whether load can be fusable or not. > +// Return true if fuseable otherwise false. > +bool > +rs6000_pair_fusion::fuseable_load_p (insn_info *info) > +{ > + for (auto def : info->defs()) > + { > + auto set = dyn_cast<set_info *> (def); > + for (auto use1 : set->nondebug_insn_uses ()) > + use1->set_is_live_out_use (true); > + } What was the reason for adding this loop? > + > + rtx_insn *rtl_insn = info ->rtl (); > + rtx body = PATTERN (rtl_insn); > + rtx dest_exp = SET_DEST (body); > + > + if (REG_P (dest_exp) && > + (DF_REG_DEF_COUNT (REGNO (dest_exp)) > 1 The rtl-ssa way of checking this is: crtl->ssa->is_single_dominating_def (...) > + || DF_REG_EQ_USE_COUNT (REGNO (dest_exp)) > 0)) > + return false; Why are uses in notes a problem? In the worst case, we should just be able to remove the note instead. > + > + rtx addr = XEXP (SET_SRC (body), 0); > + > + if (GET_CODE (addr) == PLUS > + && XEXP (addr, 1) && CONST_INT_P (XEXP (addr, 1))) > + { > + if (INTVAL (XEXP (addr, 1)) == -16) > + return false; > + } What's special about -16? > + > + df_ref use; > + df_insn_info *insn_info = DF_INSN_INFO_GET (info->rtl ()); > + FOR_EACH_INSN_INFO_DEF (use, insn_info) > + { > + struct df_link *def_link = DF_REF_CHAIN (use); > + > + if (!def_link || !def_link->ref > + || DF_REF_IS_ARTIFICIAL (def_link->ref)) > + continue; > + > + while (def_link && def_link->ref) > + { > + rtx_insn *insn = DF_REF_INSN (def_link->ref); > + if (GET_CODE (PATTERN (insn)) == PARALLEL) > + return false; Why do you need to skip PARALLELs? > + > + rtx set = single_set (insn); > + if (set == NULL_RTX) > + return false; > + > + rtx op0 = SET_SRC (set); > + rtx_code code = GET_CODE (op0); > + > + // This check is added as register pairs are not generated > + // by RA for neg:V2DF (fma: V2DF (reg1) > + // (reg2) > + // (neg:V2DF (reg3))) > + if (GET_RTX_CLASS (code) == RTX_UNARY) > + return false; What's special about (neg (fma ...))? > + > + def_link = def_link->next; > + } > + } > + return true; > +} Thanks, Richard