On Tue, Feb 16, 2021 at 09:42:22AM +0100, Richard Biener wrote:
> Just to get an idea whether it's worth doing the extra df_analyze.
> Since we have possibly 5 split passes it's a lot of churn for things
> like that WRF ltrans unit that already spends 40% of its time in DF ...

Yeah, df_analyze can be fairly expensive and most of the targets don't
really need it at all.  If I grep for df_get.*_out, I find:

config/aarch64/aarch64.c:  bitmap live1 = df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun));
config/arc/arc.c:  && REGNO_REG_SET_P (df_get_live_out (loop->incoming_src),
config/arm/arm.c:  bitmap prologue_live_out = df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun));
config/arm/arm.c:  return REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun)), 3);
config/arm/arm.c:  = REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun)),
config/arm/arm.c:  = REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun)),
config/bfin/bfin.c:  && !REGNO_REG_SET_P (df_get_live_out (bb_in), i))
config/i386/i386.c:  return REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun)), 0);
config/i386/i386.c:  live = df_get_live_out(bb);
config/i386/i386-features.c:  bitmap_copy (live_regs, df_get_live_out (bb));

where aarch64, arc and bfin are ok, i386 has this known issue, and arm
makes most of the calls during pro/epilogue expansion (fine), but seems
to also use it (indirectly) in USE_RETURN_INSN, which is used not in
splitters but in insn conditions (so it can be matched at any time).
But it seems to use a cache, so that once computed it remembers the
result, so it is probably ok too.

So, adding an unconditional df_analyze / TODO_df_finish would slow down
all the targets but only help a single one, and even on that one it is
better to do it only if it will really be needed: it is never needed
during split1 (!reload_completed doesn't call those at all), and even in
split2+ one really needs to trigger the right patterns in the IL (it can
trigger quite frequently with the atom/bonnell tunings, but otherwise
only very rarely).

	Jakub