On Fri, 23 Aug 2019, Uros Bizjak wrote:

> On Thu, Aug 22, 2019 at 3:35 PM Richard Biener <rguent...@suse.de> wrote:
> >
> > This fixes quadraticness in STV and makes
> >
> > machine dep reorg   :  89.07 ( 95%)   0.02 ( 18%)  89.10 ( 95%)   54 kB (  0%)
> >
> > drop to zero.  Does anybody remember why it is the way it is now?
> >
> > Bootstrap / regtest running on x86_64-unknown-linux-gnu.
> >
> > OK?
>
> Looking at the PR, comment #3 [1], I assume this patch is obsolete
> and will be updated?
>
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91522#c3
Yes.  I'm still learning how STV operates (and learning DF and RTL...).

The following is a rewrite of the non-TImode chain conversion according
to how I think it should operate, also allowing the hunk that fixes the
compile-time issue, and fixing PR91527 on the way (which I ran into
myself during more extensive testing of the patch).

So compared to the state before, which I still do not 100% understand,
we now do the following.  Chain detection works as before, including
the recording of all defs (both those defined by insns in the chain and
those defined by insns outside of it) that need copy-in or copy-out
operations.  But the patch then changes things so as to guarantee that
after the conversion all uses/defs of a pseudo are either of the
(subreg:Vmode ..) form or of the original scalar form.  In particular
it avoids the need to change any insns that are not part of the chain
(besides emitting the extra copy instructions).

For this, for all defs that were marked as needing copies (thus having
uses/defs both inside and outside of the chain), the chain will use a
new pseudo that we copy to from scalar sources and that we copy from
for scalar uses.  The new defs_map member records the mapping of old to
new reg; pseudos that are only used inside the chain are not remapped.

The conversion itself then happens in two stages: first, in
make_vector_copies, we emit the copy-in insns and allocate all the
pseudos we need.  The rest of the conversion then happens fully inside
of convert_insn, where we generate the copy-outs of the insn's defs,
replace defs and uses according to the mapping, and rewrite them into
the (subreg:Vmode ..) form.

For PR91527, IRA doesn't like the REG_EQUIV note in

(insn 4 24 5 2 (set (subreg:V4SI (reg/v:SI 90 [ c ]) 0)
        (subreg:V4SI (reg:SI 100) 0)) "/space/rguenther/src/svn/trunk2/gcc/testsuite/g++.dg/tree-ssa/pr21463.C":11:4 1248 {movv4si_internal}
     (expr_list:REG_DEAD (reg:SI 100)
        (expr_list:REG_EQUIV (mem/c:SI (plus:DI (reg/f:DI 16 argp)
                    (const_int 16 [0x10])) [1 c+0 S4 A64])
            (nil))))

because the SET_DEST is not a REG_P.  I'm not sure whether this is
invalid RTL; the docs say SET_DEST might be a strict_low_part or a
zero_extract but don't mention a subreg.  So I opted to simply remove
REG_EQUAL/EQUIV notes on insns we convert, and since the above insn has
a REG_DEAD note I took the liberty of updating that according to the
mapping (which would not have been needed before this patch) rather
than dropping it.

Bootstrapped with and without --with-arch=westmere (to get some STV
coverage; this included all languages) on x86_64-unknown-linux-gnu,
testing in progress.

OK if testing succeeds?

It still solves the compile-time issue, which is a latent one, btw: a
carefully crafted testcase can trigger it ever since STV exists, for
DImode chains with !TARGET_64BIT.
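For reference, a made-up example (not a testcase from either PR, just
an illustration) of the kind of DImode chain STV looks for there:

/* Hypothetical STV candidate: with -m32 -msse2 this whole DImode
   logic chain can be carried out in a single SSE register instead
   of two GPR pairs.  */
unsigned long long
f (unsigned long long a, unsigned long long b)
{
  return (a & b) | (a ^ b);
}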
Thanks,
Richard.

2019-08-22  Richard Biener  <rguent...@suse.de>

        PR target/91522
        PR target/91527
        * config/i386/i386-features.h (general_scalar_chain::defs_map):
        New member.
        (general_scalar_chain::replace_with_subreg): Remove.
        (general_scalar_chain::replace_with_subreg_in_insn): Likewise.
        (general_scalar_chain::convert_reg): Adjust signature.
        * config/i386/i386-features.c (scalar_chain::add_insn): Do not
        iterate over all defs of a reg.
        (general_scalar_chain::replace_with_subreg): Remove.
        (general_scalar_chain::replace_with_subreg_in_insn): Likewise.
        (general_scalar_chain::make_vector_copies): Populate defs_map,
        place copy only after defs that are used as vectors in the chain.
        (general_scalar_chain::convert_reg): Emit a copy for a specific
        def in a specific instruction.
        (general_scalar_chain::convert_op): All reg uses are converted here.
        (general_scalar_chain::convert_insn): Emit copies for scalar uses
        of defs here.  Replace uses with the copies we created.  Replace
        and convert the def.  Adjust REG_DEAD notes, remove
        REG_EQUIV/EQUAL notes.
        (general_scalar_chain::convert_registers): Only handle copies
        into the chain here.

Index: gcc/config/i386/i386-features.c
===================================================================
--- gcc/config/i386/i386-features.c	(revision 274843)
+++ gcc/config/i386/i386-features.c	(working copy)
@@ -416,13 +416,9 @@ scalar_chain::add_insn (bitmap candidate
      iterates over all refs to look for dual-mode regs.  Instead this
      should be done separately for all regs mentioned in the chain once.  */
   df_ref ref;
-  df_ref def;
   for (ref = DF_INSN_UID_DEFS (insn_uid); ref; ref = DF_REF_NEXT_LOC (ref))
     if (!HARD_REGISTER_P (DF_REF_REG (ref)))
-      for (def = DF_REG_DEF_CHAIN (DF_REF_REGNO (ref));
-           def;
-           def = DF_REF_NEXT_REG (def))
-        analyze_register_chain (candidates, def);
+      analyze_register_chain (candidates, ref);
   for (ref = DF_INSN_UID_USES (insn_uid); ref; ref = DF_REF_NEXT_LOC (ref))
     if (!DF_REF_REG_MEM_P (ref))
       analyze_register_chain (candidates, ref);
@@ -605,42 +601,6 @@ general_scalar_chain::compute_convert_ga
   return gain;
 }
 
-/* Replace REG in X with a V2DI subreg of NEW_REG.  */
-
-rtx
-general_scalar_chain::replace_with_subreg (rtx x, rtx reg, rtx new_reg)
-{
-  if (x == reg)
-    return gen_rtx_SUBREG (vmode, new_reg, 0);
-
-  /* But not in memory addresses.  */
-  if (MEM_P (x))
-    return x;
-
-  const char *fmt = GET_RTX_FORMAT (GET_CODE (x));
-  int i, j;
-  for (i = GET_RTX_LENGTH (GET_CODE (x)) - 1; i >= 0; i--)
-    {
-      if (fmt[i] == 'e')
-        XEXP (x, i) = replace_with_subreg (XEXP (x, i), reg, new_reg);
-      else if (fmt[i] == 'E')
-        for (j = XVECLEN (x, i) - 1; j >= 0; j--)
-          XVECEXP (x, i, j) = replace_with_subreg (XVECEXP (x, i, j),
-                                                   reg, new_reg);
-    }
-
-  return x;
-}
-
-/* Replace REG in INSN with a V2DI subreg of NEW_REG.  */
-
-void
-general_scalar_chain::replace_with_subreg_in_insn (rtx_insn *insn,
-                                                   rtx reg, rtx new_reg)
-{
-  replace_with_subreg (single_set (insn), reg, new_reg);
-}
-
 /* Insert generated conversion instruction sequence INSNS
    after instruction AFTER.  New BB may be required in case
    instruction has EH region attached.  */
@@ -691,204 +651,147 @@ general_scalar_chain::make_vector_copies
   rtx vreg = gen_reg_rtx (smode);
 
   df_ref ref;
-  for (ref = DF_REG_DEF_CHAIN (regno); ref; ref = DF_REF_NEXT_REG (ref))
-    if (!bitmap_bit_p (insns, DF_REF_INSN_UID (ref)))
-      {
-        start_sequence ();
-        if (!TARGET_INTER_UNIT_MOVES_TO_VEC)
-          {
-            rtx tmp = assign_386_stack_local (smode, SLOT_STV_TEMP);
-            if (smode == DImode && !TARGET_64BIT)
-              {
-                emit_move_insn (adjust_address (tmp, SImode, 0),
-                                gen_rtx_SUBREG (SImode, reg, 0));
-                emit_move_insn (adjust_address (tmp, SImode, 4),
-                                gen_rtx_SUBREG (SImode, reg, 4));
-              }
-            else
-              emit_move_insn (copy_rtx (tmp), reg);
-            emit_insn (gen_rtx_SET (gen_rtx_SUBREG (vmode, vreg, 0),
-                                    gen_gpr_to_xmm_move_src (vmode, tmp)));
-          }
-        else if (!TARGET_64BIT && smode == DImode)
-          {
-            if (TARGET_SSE4_1)
-              {
-                emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, vreg, 0),
-                                            CONST0_RTX (V4SImode),
-                                            gen_rtx_SUBREG (SImode, reg, 0)));
-                emit_insn (gen_sse4_1_pinsrd (gen_rtx_SUBREG (V4SImode, vreg, 0),
-                                              gen_rtx_SUBREG (V4SImode, vreg, 0),
-                                              gen_rtx_SUBREG (SImode, reg, 4),
-                                              GEN_INT (2)));
-              }
-            else
-              {
-                rtx tmp = gen_reg_rtx (DImode);
-                emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, vreg, 0),
-                                            CONST0_RTX (V4SImode),
-                                            gen_rtx_SUBREG (SImode, reg, 0)));
-                emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, tmp, 0),
-                                            CONST0_RTX (V4SImode),
-                                            gen_rtx_SUBREG (SImode, reg, 4)));
-                emit_insn (gen_vec_interleave_lowv4si
-                           (gen_rtx_SUBREG (V4SImode, vreg, 0),
-                            gen_rtx_SUBREG (V4SImode, vreg, 0),
-                            gen_rtx_SUBREG (V4SImode, tmp, 0)));
-              }
-          }
-        else
-          emit_insn (gen_rtx_SET (gen_rtx_SUBREG (vmode, vreg, 0),
-                                  gen_gpr_to_xmm_move_src (vmode, reg)));
-        rtx_insn *seq = get_insns ();
-        end_sequence ();
-        rtx_insn *insn = DF_REF_INSN (ref);
-        emit_conversion_insns (seq, insn);
-
-        if (dump_file)
-          fprintf (dump_file,
-                   "  Copied r%d to a vector register r%d for insn %d\n",
-                   regno, REGNO (vreg), INSN_UID (insn));
-      }
-
-  for (ref = DF_REG_USE_CHAIN (regno); ref; ref = DF_REF_NEXT_REG (ref))
-    if (bitmap_bit_p (insns, DF_REF_INSN_UID (ref)))
-      {
-        rtx_insn *insn = DF_REF_INSN (ref);
-        replace_with_subreg_in_insn (insn, reg, vreg);
-
-        if (dump_file)
-          fprintf (dump_file, "  Replaced r%d with r%d in insn %d\n",
-                   regno, REGNO (vreg), INSN_UID (insn));
-      }
-}
-
-/* Convert all definitions of register REGNO
-   and fix its uses.  Scalar copies may be created
-   in case register is used in not convertible insn.  */
-
-void
-general_scalar_chain::convert_reg (unsigned regno)
-{
-  bool scalar_copy = bitmap_bit_p (defs_conv, regno);
-  rtx reg = regno_reg_rtx[regno];
-  rtx scopy = NULL_RTX;
-  df_ref ref;
-  bitmap conv;
-
-  conv = BITMAP_ALLOC (NULL);
-  bitmap_copy (conv, insns);
-
-  if (scalar_copy)
-    scopy = gen_reg_rtx (smode);
+  defs_map.put (reg, vreg);
+  /* For each insn defining REGNO, see if it is defined by an insn
+     not part of the chain but with uses in insns part of the chain
+     and insert a copy in that case.  */
   for (ref = DF_REG_DEF_CHAIN (regno); ref; ref = DF_REF_NEXT_REG (ref))
     {
-      rtx_insn *insn = DF_REF_INSN (ref);
-      rtx def_set = single_set (insn);
-      rtx src = SET_SRC (def_set);
-      rtx reg = DF_REF_REG (ref);
+      if (bitmap_bit_p (insns, DF_REF_INSN_UID (ref)))
+        continue;
+      df_link *use;
+      for (use = DF_REF_CHAIN (ref); use; use = use->next)
+        if (!DF_REF_REG_MEM_P (use->ref)
+            && bitmap_bit_p (insns, DF_REF_INSN_UID (use->ref)))
+          break;
+      if (!use)
+        continue;
 
-      if (!MEM_P (src))
+      start_sequence ();
+      if (!TARGET_INTER_UNIT_MOVES_TO_VEC)
         {
-          replace_with_subreg_in_insn (insn, reg, reg);
-          bitmap_clear_bit (conv, INSN_UID (insn));
+          rtx tmp = assign_386_stack_local (smode, SLOT_STV_TEMP);
+          if (smode == DImode && !TARGET_64BIT)
+            {
+              emit_move_insn (adjust_address (tmp, SImode, 0),
+                              gen_rtx_SUBREG (SImode, reg, 0));
+              emit_move_insn (adjust_address (tmp, SImode, 4),
+                              gen_rtx_SUBREG (SImode, reg, 4));
+            }
+          else
+            emit_move_insn (copy_rtx (tmp), reg);
+          emit_insn (gen_rtx_SET (gen_rtx_SUBREG (vmode, vreg, 0),
+                                  gen_gpr_to_xmm_move_src (vmode, tmp)));
         }
-
-      if (scalar_copy)
+      else if (!TARGET_64BIT && smode == DImode)
         {
-          start_sequence ();
-          if (!TARGET_INTER_UNIT_MOVES_FROM_VEC)
+          if (TARGET_SSE4_1)
            {
-              rtx tmp = assign_386_stack_local (smode, SLOT_STV_TEMP);
-              emit_move_insn (tmp, reg);
-              if (!TARGET_64BIT && smode == DImode)
-                {
-                  emit_move_insn (gen_rtx_SUBREG (SImode, scopy, 0),
-                                  adjust_address (tmp, SImode, 0));
-                  emit_move_insn (gen_rtx_SUBREG (SImode, scopy, 4),
-                                  adjust_address (tmp, SImode, 4));
-                }
-              else
-                emit_move_insn (scopy, copy_rtx (tmp));
+              emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, vreg, 0),
+                                          CONST0_RTX (V4SImode),
+                                          gen_rtx_SUBREG (SImode, reg, 0)));
+              emit_insn (gen_sse4_1_pinsrd (gen_rtx_SUBREG (V4SImode, vreg, 0),
+                                            gen_rtx_SUBREG (V4SImode, vreg, 0),
+                                            gen_rtx_SUBREG (SImode, reg, 4),
+                                            GEN_INT (2)));
            }
-          else if (!TARGET_64BIT && smode == DImode)
+          else
            {
-              if (TARGET_SSE4_1)
-                {
-                  rtx tmp = gen_rtx_PARALLEL (VOIDmode,
-                                              gen_rtvec (1, const0_rtx));
-                  emit_insn
-                    (gen_rtx_SET
-                     (gen_rtx_SUBREG (SImode, scopy, 0),
-                      gen_rtx_VEC_SELECT (SImode,
-                                          gen_rtx_SUBREG (V4SImode, reg, 0),
-                                          tmp)));
-
-                  tmp = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (1, const1_rtx));
-                  emit_insn
-                    (gen_rtx_SET
-                     (gen_rtx_SUBREG (SImode, scopy, 4),
-                      gen_rtx_VEC_SELECT (SImode,
-                                          gen_rtx_SUBREG (V4SImode, reg, 0),
-                                          tmp)));
-                }
-              else
-                {
-                  rtx vcopy = gen_reg_rtx (V2DImode);
-                  emit_move_insn (vcopy, gen_rtx_SUBREG (V2DImode, reg, 0));
-                  emit_move_insn (gen_rtx_SUBREG (SImode, scopy, 0),
-                                  gen_rtx_SUBREG (SImode, vcopy, 0));
-                  emit_move_insn (vcopy,
-                                  gen_rtx_LSHIFTRT (V2DImode,
-                                                    vcopy, GEN_INT (32)));
-                  emit_move_insn (gen_rtx_SUBREG (SImode, scopy, 4),
-                                  gen_rtx_SUBREG (SImode, vcopy, 0));
-                }
+              rtx tmp = gen_reg_rtx (DImode);
+              emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, vreg, 0),
+                                          CONST0_RTX (V4SImode),
+                                          gen_rtx_SUBREG (SImode, reg, 0)));
+              emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, tmp, 0),
+                                          CONST0_RTX (V4SImode),
+                                          gen_rtx_SUBREG (SImode, reg, 4)));
+              emit_insn (gen_vec_interleave_lowv4si
+                         (gen_rtx_SUBREG (V4SImode, vreg, 0),
+                          gen_rtx_SUBREG (V4SImode, vreg, 0),
+                          gen_rtx_SUBREG (V4SImode, tmp, 0)));
            }
-          else
-            emit_move_insn (scopy, reg);
-
-          rtx_insn *seq = get_insns ();
-          end_sequence ();
-          emit_conversion_insns (seq, insn);
-
-          if (dump_file)
-            fprintf (dump_file,
-                     "  Copied r%d to a scalar register r%d for insn %d\n",
-                     regno, REGNO (scopy), INSN_UID (insn));
         }
-    }
-
-  for (ref = DF_REG_USE_CHAIN (regno); ref; ref = DF_REF_NEXT_REG (ref))
-    if (bitmap_bit_p (insns, DF_REF_INSN_UID (ref)))
-      {
-        if (bitmap_bit_p (conv, DF_REF_INSN_UID (ref)))
-          {
-            rtx_insn *insn = DF_REF_INSN (ref);
+      else
+        emit_insn (gen_rtx_SET (gen_rtx_SUBREG (vmode, vreg, 0),
+                                gen_gpr_to_xmm_move_src (vmode, reg)));
+      rtx_insn *seq = get_insns ();
+      end_sequence ();
+      rtx_insn *insn = DF_REF_INSN (ref);
+      emit_conversion_insns (seq, insn);
 
-            rtx def_set = single_set (insn);
-            gcc_assert (def_set);
+      if (dump_file)
+        fprintf (dump_file,
+                 "  Copied r%d to a vector register r%d for insn %d\n",
+                 regno, REGNO (vreg), INSN_UID (insn));
+    }
+}
 
-            rtx src = SET_SRC (def_set);
-            rtx dst = SET_DEST (def_set);
+/* Copy the definition SRC of INSN inside the chain to DST for
+   scalar uses outside of the chain.  */
 
-            if (!MEM_P (dst) || !REG_P (src))
-              replace_with_subreg_in_insn (insn, reg, reg);
+void
+general_scalar_chain::convert_reg (rtx_insn *insn, rtx dst, rtx src)
+{
+  start_sequence ();
+  if (!TARGET_INTER_UNIT_MOVES_FROM_VEC)
+    {
+      rtx tmp = assign_386_stack_local (smode, SLOT_STV_TEMP);
+      emit_move_insn (tmp, src);
+      if (!TARGET_64BIT && smode == DImode)
+        {
+          emit_move_insn (gen_rtx_SUBREG (SImode, dst, 0),
+                          adjust_address (tmp, SImode, 0));
+          emit_move_insn (gen_rtx_SUBREG (SImode, dst, 4),
+                          adjust_address (tmp, SImode, 4));
+        }
+      else
+        emit_move_insn (dst, copy_rtx (tmp));
+    }
+  else if (!TARGET_64BIT && smode == DImode)
+    {
+      if (TARGET_SSE4_1)
+        {
+          rtx tmp = gen_rtx_PARALLEL (VOIDmode,
+                                      gen_rtvec (1, const0_rtx));
+          emit_insn
+            (gen_rtx_SET
+             (gen_rtx_SUBREG (SImode, dst, 0),
+              gen_rtx_VEC_SELECT (SImode,
+                                  gen_rtx_SUBREG (V4SImode, src, 0),
+                                  tmp)));
+
+          tmp = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (1, const1_rtx));
+          emit_insn
+            (gen_rtx_SET
+             (gen_rtx_SUBREG (SImode, dst, 4),
+              gen_rtx_VEC_SELECT (SImode,
+                                  gen_rtx_SUBREG (V4SImode, src, 0),
+                                  tmp)));
+        }
+      else
+        {
+          rtx vcopy = gen_reg_rtx (V2DImode);
+          emit_move_insn (vcopy, gen_rtx_SUBREG (V2DImode, src, 0));
+          emit_move_insn (gen_rtx_SUBREG (SImode, dst, 0),
+                          gen_rtx_SUBREG (SImode, vcopy, 0));
+          emit_move_insn (vcopy,
+                          gen_rtx_LSHIFTRT (V2DImode,
+                                            vcopy, GEN_INT (32)));
+          emit_move_insn (gen_rtx_SUBREG (SImode, dst, 4),
+                          gen_rtx_SUBREG (SImode, vcopy, 0));
+        }
+    }
+  else
+    emit_move_insn (dst, src);
 
-            bitmap_clear_bit (conv, INSN_UID (insn));
-          }
-      }
-    /* Skip debug insns and uninitialized uses.  */
-    else if (DF_REF_CHAIN (ref)
-             && NONDEBUG_INSN_P (DF_REF_INSN (ref)))
-      {
-        gcc_assert (scopy);
-        replace_rtx (DF_REF_INSN (ref), reg, scopy);
-        df_insn_rescan (DF_REF_INSN (ref));
-      }
+  rtx_insn *seq = get_insns ();
+  end_sequence ();
+  emit_conversion_insns (seq, insn);
 
-  BITMAP_FREE (conv);
+  if (dump_file)
+    fprintf (dump_file,
+             "  Copied r%d to a scalar register r%d for insn %d\n",
+             REGNO (src), REGNO (dst), INSN_UID (insn));
 }
 
 /* Convert operand OP in INSN.  We should handle
@@ -921,16 +824,6 @@ general_scalar_chain::convert_op (rtx *o
     }
   else if (REG_P (*op))
     {
-      /* We may have not converted register usage in case
-         this register has no definition.  Otherwise it
-         should be converted in convert_reg.  */
-      df_ref ref;
-      FOR_EACH_INSN_USE (ref, insn)
-        if (DF_REF_REGNO (ref) == REGNO (*op))
-          {
-            gcc_assert (!DF_REF_CHAIN (ref));
-            break;
-          }
       *op = gen_rtx_SUBREG (vmode, *op, 0);
     }
   else if (CONST_INT_P (*op))
@@ -980,6 +873,32 @@ general_scalar_chain::convert_insn (rtx_
   rtx dst = SET_DEST (def_set);
   rtx subreg;
 
+  /* Generate copies for out-of-chain uses of defs.  */
+  for (df_ref ref = DF_INSN_DEFS (insn); ref; ref = DF_REF_NEXT_LOC (ref))
+    if (bitmap_bit_p (defs_conv, DF_REF_REGNO (ref)))
+      {
+        df_link *use;
+        for (use = DF_REF_CHAIN (ref); use; use = use->next)
+          if (DF_REF_REG_MEM_P (use->ref)
+              || !bitmap_bit_p (insns, DF_REF_INSN_UID (use->ref)))
+            break;
+        if (use)
+          convert_reg (insn, DF_REF_REG (ref),
+                       *defs_map.get (regno_reg_rtx [DF_REF_REGNO (ref)]));
+      }
+
+  /* Replace uses in this insn with the defs we use in the chain.  */
+  for (df_ref ref = DF_INSN_USES (insn); ref; ref = DF_REF_NEXT_LOC (ref))
+    if (!DF_REF_REG_MEM_P (ref))
+      if (rtx *vreg = defs_map.get (regno_reg_rtx[DF_REF_REGNO (ref)]))
+        {
+          /* Also update a corresponding REG_DEAD note.  */
+          rtx note = find_reg_note (insn, REG_DEAD, DF_REF_REG (ref));
+          if (note)
+            XEXP (note, 0) = *vreg;
+          *DF_REF_REAL_LOC (ref) = *vreg;
+        }
+
   if (MEM_P (dst) && !REG_P (src))
     {
       /* There are no scalar integer instructions and therefore
@@ -988,6 +907,20 @@ general_scalar_chain::convert_insn (rtx_
       emit_conversion_insns (gen_move_insn (dst, tmp), insn);
       dst = gen_rtx_SUBREG (vmode, tmp, 0);
     }
+  else if (REG_P (dst))
+    {
+      /* Replace the definition with a SUBREG to the definition we
+         use inside the chain.  */
+      rtx *vdef = defs_map.get (dst);
+      if (vdef)
+        dst = *vdef;
+      dst = gen_rtx_SUBREG (vmode, dst, 0);
+      /* IRA doesn't like to have REG_EQUAL/EQUIV notes when the SET_DEST
+         is a non-REG_P.  So kill those off.  */
+      rtx note = find_reg_equal_equiv_note (insn);
+      if (note)
+        remove_note (insn, note);
+    }
 
   switch (GET_CODE (src))
     {
@@ -1045,20 +978,15 @@ general_scalar_chain::convert_insn (rtx_
 
     case COMPARE:
       src = SUBREG_REG (XEXP (XEXP (src, 0), 0));
-      gcc_assert ((REG_P (src) && GET_MODE (src) == DImode)
-                  || (SUBREG_P (src) && GET_MODE (src) == V2DImode));
-
-      if (REG_P (src))
-        subreg = gen_rtx_SUBREG (V2DImode, src, 0);
-      else
-        subreg = copy_rtx_if_shared (src);
+      gcc_assert (REG_P (src) && GET_MODE (src) == DImode);
+      subreg = gen_rtx_SUBREG (V2DImode, src, 0);
       emit_insn_before (gen_vec_interleave_lowv2di (copy_rtx_if_shared (subreg),
                                                     copy_rtx_if_shared (subreg),
                                                     copy_rtx_if_shared (subreg)),
                         insn);
       dst = gen_rtx_REG (CCmode, FLAGS_REG);
-      src = gen_rtx_UNSPEC (CCmode, gen_rtvec (2, copy_rtx_if_shared (src),
-                                               copy_rtx_if_shared (src)),
+      src = gen_rtx_UNSPEC (CCmode, gen_rtvec (2, copy_rtx_if_shared (subreg),
                                               copy_rtx_if_shared (subreg)),
                            UNSPEC_PTEST);
       break;
 
@@ -1217,16 +1145,15 @@ timode_scalar_chain::convert_insn (rtx_i
   df_insn_rescan (insn);
 }
 
+/* Generate copies from defs used by the chain but not defined therein.
+   Also populates defs_map which is used later by convert_insn.  */
+
 void
 general_scalar_chain::convert_registers ()
 {
   bitmap_iterator bi;
   unsigned id;
-
-  EXECUTE_IF_SET_IN_BITMAP (defs, 0, id, bi)
-    convert_reg (id);
-
-  EXECUTE_IF_AND_COMPL_IN_BITMAP (defs_conv, defs, 0, id, bi)
+  EXECUTE_IF_SET_IN_BITMAP (defs_conv, 0, id, bi)
     make_vector_copies (id);
 }
 
Index: gcc/config/i386/i386-features.h
===================================================================
--- gcc/config/i386/i386-features.h	(revision 274843)
+++ gcc/config/i386/i386-features.h	(working copy)
@@ -171,12 +171,11 @@ class general_scalar_chain : public scal
     : scalar_chain (smode_, vmode_) {}
   int compute_convert_gain ();
 private:
+  hash_map<rtx, rtx> defs_map;
   void mark_dual_mode_def (df_ref def);
-  rtx replace_with_subreg (rtx x, rtx reg, rtx subreg);
-  void replace_with_subreg_in_insn (rtx_insn *insn, rtx reg, rtx subreg);
   void convert_insn (rtx_insn *insn);
   void convert_op (rtx *op, rtx_insn *insn);
-  void convert_reg (unsigned regno);
+  void convert_reg (rtx_insn *insn, rtx dst, rtx src);
   void make_vector_copies (unsigned regno);
   void convert_registers ();
   int vector_const_cost (rtx exp);
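P.S.: To illustrate the defs_map discipline on a toy model, here is a
standalone C++ mock (made-up pseudo numbers; std::map and plain ints
stand in for GCC's hash_map<rtx, rtx> and RTL pseudos; this is just a
sketch of the scheme, not the real implementation):

#include <cstdio>
#include <map>
#include <set>

typedef int pseudo;     /* stands in for an RTL pseudo register */

int
main ()
{
  std::set<pseudo> chain_regs = { 90, 91 }; /* regs the chain references */
  std::set<pseudo> dual_mode = { 90 };      /* defs_conv: also live outside */
  std::map<pseudo, pseudo> defs_map;        /* old reg -> new chain reg */
  pseudo next_pseudo = 100;

  /* Stage 1 (make_vector_copies): every dual-mode def gets a fresh
     pseudo and a copy-in placed after the out-of-chain def that feeds
     the chain.  */
  for (pseudo r : dual_mode)
    {
      defs_map[r] = next_pseudo++;
      std::printf ("copy-in:   (set (subreg:V2DI (reg %d) 0) ... (reg %d))\n",
                   defs_map[r], r);
    }

  /* Stage 2 (convert_insn): uses inside the chain go through the map;
     pseudos private to the chain keep their number and are merely
     re-wrapped in a (subreg:Vmode ..).  convert_reg would emit the
     copy-outs in the other direction for scalar uses outside.  */
  for (pseudo r : chain_regs)
    {
      auto it = defs_map.find (r);
      pseudo used = it != defs_map.end () ? it->second : r;
      std::printf ("chain use: (subreg:V2DI (reg %d) 0)\n", used);
    }
  return 0;
}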