Hi, I'd like to gentle ping this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614818.html BR, Kewen >> on 2023/3/29 15:18, Kewen.Lin via Gcc-patches wrote: >>> Hi, >>> >>> By addressing Alexander's comments, against v1 this >>> patch v2 mainly: >>> >>> - Rename no_real_insns_p to no_real_nondebug_insns_p; >>> - Introduce enum rgn_bb_deps_free_action for three >>> kinds of actions to free deps; >>> - Change function free_deps_for_bb_no_real_insns_p to >>> resolve_forw_deps which only focuses on forward deps; >>> - Extend the handlings to cover dbg-cnt sched_block, >>> add one test case for it; >>> - Move free_trg_info call in schedule_region to an >>> appropriate place. >>> >>> One thing I'm not sure about is the change in function >>> sched_rgn_local_finish, currently the invocation to >>> sched_rgn_local_free is guarded with !sel_sched_p (), >>> so I just follow it, but the initialization of those >>> structures (in sched_rgn_local_init) isn't guarded >>> with !sel_sched_p (), it looks odd. >>> >>> ---- >>> >>> As PR108273 shows, when there is one block which only has >>> NOTE_P and LABEL_P insns at non-debug mode while has some >>> extra DEBUG_INSN_P insns at debug mode, after scheduling >>> it, the DFA states would be different between debug mode >>> and non-debug mode. Since at non-debug mode, the block >>> meets no_real_insns_p, it gets skipped; while at debug >>> mode, it gets scheduled, even it only has NOTE_P, LABEL_P >>> and DEBUG_INSN_P, the call of function advance_one_cycle >>> will change the DFA state. PR108519 also shows this issue >>> issue can be exposed by some scheduler changes. >>> >>> This patch is to change function no_real_insns_p to >>> function no_real_nondebug_insns_p by taking debug insn into >>> account, which make us not try to schedule for the block >>> having only NOTE_P, LABEL_P and DEBUG_INSN_P insns, >>> resulting in consistent DFA states between non-debug and >>> debug mode. >>> >>> Changing no_real_insns_p to no_real_nondebug_insns_p caused >>> ICE when doing free_block_dependencies, the root cause is >>> that we create dependencies for debug insns, those >>> dependencies are expected to be resolved during scheduling >>> insns, but which gets skipped after this change. >>> By checking the code, it looks it's reasonable to skip to >>> compute block dependences for no_real_nondebug_insns_p >>> blocks. There is also another issue, which gets exposed >>> in SPEC2017 bmks build at option -O2 -g, is that we could >>> skip to schedule some block, which already gets dependency >>> graph built so has dependencies computed and rgn_n_insns >>> accumulated, then the later verification on if the graph >>> becomes exhausted by scheduling would fail as follow: >>> >>> /* Sanity check: verify that all region insns were >>> scheduled. */ >>> gcc_assert (sched_rgn_n_insns == rgn_n_insns); >>> >>> , and also some forward deps aren't resovled. >>> >>> As Alexander pointed out, the current debug count handling >>> also suffers the similar issue, so this patch handles these >>> two cases together: one is for some block gets skipped by >>> !dbg_cnt (sched_block), the other is for some block which >>> is not no_real_nondebug_insns_p initially but becomes >>> no_real_nondebug_insns_p due to speculative scheduling. >>> >>> This patch can be bootstrapped and regress-tested on >>> x86_64-redhat-linux, aarch64-linux-gnu and >>> powerpc64{,le}-linux-gnu. >>> >>> I also verified this patch can pass SPEC2017 both intrate >>> and fprate bmks building at -g -O2/-O3. >>> >>> Any thoughts? >>> >>> BR, >>> Kewen >>> ---- >>> PR rtl-optimization/108273 >>> >>> gcc/ChangeLog: >>> >>> * haifa-sched.cc (no_real_insns_p): Rename to ... >>> (no_real_nondebug_insns_p): ... this, and consider DEBUG_INSN_P insn. >>> * sched-ebb.cc (schedule_ebb): Replace no_real_insns_p with >>> no_real_nondebug_insns_p. >>> * sched-int.h (no_real_insns_p): Rename to ... >>> (no_real_nondebug_insns_p): ... this. >>> * sched-rgn.cc (enum rgn_bb_deps_free_action): New enum. >>> (bb_deps_free_actions): New static variable. >>> (compute_block_dependences): Skip for no_real_nondebug_insns_p. >>> (resolve_forw_deps): New function. >>> (free_block_dependencies): Check bb_deps_free_actions and call >>> function resolve_forw_deps for RGN_BB_DEPS_FREE_ARTICIAL. >>> (compute_priorities): Replace no_real_insns_p with >>> no_real_nondebug_insns_p. >>> (schedule_region): Replace no_real_insns_p with >>> no_real_nondebug_insns_p, set RGN_BB_DEPS_FREE_ARTICIAL if the block >>> get dependencies computed before but skipped now, fix up count >>> sched_rgn_n_insns for it too. Call free_trg_info when the block >>> gets scheduled, and move sched_rgn_local_finish after the loop >>> of free_block_dependencies loop. >>> (sched_rgn_local_init): Allocate and compute bb_deps_free_actions. >>> (sched_rgn_local_finish): Free bb_deps_free_actions. >>> * sel-sched.cc (sel_region_target_finish): Replace no_real_insns_p with >>> no_real_nondebug_insns_p. >>> >>> gcc/testsuite/ChangeLog: >>> >>> * gcc.target/powerpc/pr108273.c: New test. >>> --- >>> gcc/haifa-sched.cc | 9 +- >>> gcc/sched-ebb.cc | 2 +- >>> gcc/sched-int.h | 2 +- >>> gcc/sched-rgn.cc | 148 +++++++++++++++----- >>> gcc/sel-sched.cc | 3 +- >>> gcc/testsuite/gcc.target/powerpc/pr108273.c | 26 ++++ >>> 6 files changed, 150 insertions(+), 40 deletions(-) >>> create mode 100644 gcc/testsuite/gcc.target/powerpc/pr108273.c >>> >>> diff --git a/gcc/haifa-sched.cc b/gcc/haifa-sched.cc >>> index 48b53776fa9..371de486eb0 100644 >>> --- a/gcc/haifa-sched.cc >>> +++ b/gcc/haifa-sched.cc >>> @@ -5033,14 +5033,17 @@ get_ebb_head_tail (basic_block beg, basic_block end, >>> *tailp = end_tail; >>> } >>> >>> -/* Return nonzero if there are no real insns in the range [ HEAD, TAIL ]. >>> */ >>> +/* Return nonzero if there are no real nondebug insns in the range >>> + [ HEAD, TAIL ]. */ >>> >>> int >>> -no_real_insns_p (const rtx_insn *head, const rtx_insn *tail) >>> +no_real_nondebug_insns_p (const rtx_insn *head, const rtx_insn *tail) >>> { >>> while (head != NEXT_INSN (tail)) >>> { >>> - if (!NOTE_P (head) && !LABEL_P (head)) >>> + if (!NOTE_P (head) >>> + && !LABEL_P (head) >>> + && !DEBUG_INSN_P (head)) >>> return 0; >>> head = NEXT_INSN (head); >>> } >>> diff --git a/gcc/sched-ebb.cc b/gcc/sched-ebb.cc >>> index 3eb6a24f187..e3bb0d57159 100644 >>> --- a/gcc/sched-ebb.cc >>> +++ b/gcc/sched-ebb.cc >>> @@ -491,7 +491,7 @@ schedule_ebb (rtx_insn *head, rtx_insn *tail, bool >>> modulo_scheduling) >>> first_bb = BLOCK_FOR_INSN (head); >>> last_bb = BLOCK_FOR_INSN (tail); >>> >>> - if (no_real_insns_p (head, tail)) >>> + if (no_real_nondebug_insns_p (head, tail)) >>> return BLOCK_FOR_INSN (tail); >>> >>> gcc_assert (INSN_P (head) && INSN_P (tail)); >>> diff --git a/gcc/sched-int.h b/gcc/sched-int.h >>> index 97b7d2d319b..eb4e8acec9f 100644 >>> --- a/gcc/sched-int.h >>> +++ b/gcc/sched-int.h >>> @@ -1397,7 +1397,7 @@ extern void free_global_sched_pressure_data (void); >>> extern int haifa_classify_insn (const_rtx); >>> extern void get_ebb_head_tail (basic_block, basic_block, >>> rtx_insn **, rtx_insn **); >>> -extern int no_real_insns_p (const rtx_insn *, const rtx_insn *); >>> +extern int no_real_nondebug_insns_p (const rtx_insn *, const rtx_insn *); >>> >>> extern int insn_sched_cost (rtx_insn *); >>> extern int dep_cost_1 (dep_t, dw_t); >>> diff --git a/gcc/sched-rgn.cc b/gcc/sched-rgn.cc >>> index f2751f62450..3508a26aeb6 100644 >>> --- a/gcc/sched-rgn.cc >>> +++ b/gcc/sched-rgn.cc >>> @@ -213,6 +213,22 @@ static int rgn_nr_edges; >>> /* Array of size rgn_nr_edges. */ >>> static edge *rgn_edges; >>> >>> +/* Possible actions for dependencies freeing. */ >>> +enum rgn_bb_deps_free_action >>> +{ >>> + /* This block doesn't get dependencies computed so don't need to free. >>> */ >>> + RGN_BB_DEPS_FREE_NO, >>> + /* This block gets scheduled normally so free dependencies as usual. */ >>> + RGN_BB_DEPS_FREE_NORMAL, >>> + /* This block gets skipped in scheduling but has dependencies computed >>> early, >>> + need to free the forward list articially. */ >>> + RGN_BB_DEPS_FREE_ARTICIAL >>> +}; >>> + >>> +/* For basic block i, bb_deps_free_actions[i] indicates which action needs >>> + to be taken for freeing its dependencies. */ >>> +static enum rgn_bb_deps_free_action *bb_deps_free_actions; >>> + >>> /* Mapping from each edge in the graph to its number in the rgn. */ >>> #define EDGE_TO_BIT(edge) ((int)(size_t)(edge)->aux) >>> #define SET_EDGE_TO_BIT(edge,nr) ((edge)->aux = (void *)(size_t)(nr)) >>> @@ -2730,6 +2746,15 @@ compute_block_dependences (int bb) >>> gcc_assert (EBB_FIRST_BB (bb) == EBB_LAST_BB (bb)); >>> get_ebb_head_tail (EBB_FIRST_BB (bb), EBB_LAST_BB (bb), &head, &tail); >>> >>> + /* Don't compute block dependencies if there are no real nondebug insns. >>> */ >>> + if (no_real_nondebug_insns_p (head, tail)) >>> + { >>> + if (current_nr_blocks > 1) >>> + propagate_deps (bb, &tmp_deps); >>> + free_deps (&tmp_deps); >>> + return; >>> + } >>> + >>> sched_analyze (&tmp_deps, head, tail); >>> >>> add_branch_dependences (head, tail); >>> @@ -2744,6 +2769,24 @@ compute_block_dependences (int bb) >>> targetm.sched.dependencies_evaluation_hook (head, tail); >>> } >>> >>> +/* Artificially resolve forward dependencies for instructions HEAD to >>> TAIL. */ >>> + >>> +static void >>> +resolve_forw_deps (rtx_insn *head, rtx_insn *tail) >>> +{ >>> + rtx_insn *insn; >>> + rtx_insn *next_tail = NEXT_INSN (tail); >>> + sd_iterator_def sd_it; >>> + dep_t dep; >>> + >>> + /* There could be some insns which get skipped in scheduling but we >>> compute >>> + dependencies for them previously, so make them resolved. */ >>> + for (insn = head; insn != next_tail; insn = NEXT_INSN (insn)) >>> + for (sd_it = sd_iterator_start (insn, SD_LIST_FORW); >>> + sd_iterator_cond (&sd_it, &dep);) >>> + sd_resolve_dep (sd_it); >>> +} >>> + >>> /* Free dependencies of instructions inside BB. */ >>> static void >>> free_block_dependencies (int bb) >>> @@ -2753,9 +2796,12 @@ free_block_dependencies (int bb) >>> >>> get_ebb_head_tail (EBB_FIRST_BB (bb), EBB_LAST_BB (bb), &head, &tail); >>> >>> - if (no_real_insns_p (head, tail)) >>> + if (bb_deps_free_actions[bb] == RGN_BB_DEPS_FREE_NO) >>> return; >>> >>> + if (bb_deps_free_actions[bb] == RGN_BB_DEPS_FREE_ARTICIAL) >>> + resolve_forw_deps (head, tail); >>> + >>> sched_free_deps (head, tail, true); >>> } >>> >>> @@ -3019,7 +3065,7 @@ compute_priorities (void) >>> gcc_assert (EBB_FIRST_BB (bb) == EBB_LAST_BB (bb)); >>> get_ebb_head_tail (EBB_FIRST_BB (bb), EBB_LAST_BB (bb), &head, >>> &tail); >>> >>> - if (no_real_insns_p (head, tail)) >>> + if (no_real_nondebug_insns_p (head, tail)) >>> continue; >>> >>> rgn_n_insns += set_priorities (head, tail); >>> @@ -3153,7 +3199,7 @@ schedule_region (int rgn) >>> >>> get_ebb_head_tail (first_bb, last_bb, &head, &tail); >>> >>> - if (no_real_insns_p (head, tail)) >>> + if (no_real_nondebug_insns_p (head, tail)) >>> { >>> gcc_assert (first_bb == last_bb); >>> continue; >>> @@ -3173,44 +3219,62 @@ schedule_region (int rgn) >>> >>> get_ebb_head_tail (first_bb, last_bb, &head, &tail); >>> >>> - if (no_real_insns_p (head, tail)) >>> + if (no_real_nondebug_insns_p (head, tail)) >>> { >>> gcc_assert (first_bb == last_bb); >>> save_state_for_fallthru_edge (last_bb, bb_state[first_bb->index]); >>> - continue; >>> + >>> + if (bb_deps_free_actions[bb] == RGN_BB_DEPS_FREE_NO) >>> + continue; >>> + >>> + /* As it's not no_real_nondebug_insns_p initially, then it has some >>> + dependencies computed so free it articially. */ >>> + bb_deps_free_actions[bb] = RGN_BB_DEPS_FREE_ARTICIAL; >>> } >>> + else >>> + { >>> + current_sched_info->prev_head = PREV_INSN (head); >>> + current_sched_info->next_tail = NEXT_INSN (tail); >>> >>> - current_sched_info->prev_head = PREV_INSN (head); >>> - current_sched_info->next_tail = NEXT_INSN (tail); >>> + remove_notes (head, tail); >>> >>> - remove_notes (head, tail); >>> + unlink_bb_notes (first_bb, last_bb); >>> >>> - unlink_bb_notes (first_bb, last_bb); >>> + target_bb = bb; >>> >>> - target_bb = bb; >>> + gcc_assert (flag_schedule_interblock || current_nr_blocks == 1); >>> + current_sched_info->queue_must_finish_empty = current_nr_blocks == 1; >>> >>> - gcc_assert (flag_schedule_interblock || current_nr_blocks == 1); >>> - current_sched_info->queue_must_finish_empty = current_nr_blocks == 1; >>> + curr_bb = first_bb; >>> + if (dbg_cnt (sched_block)) >>> + { >>> + int saved_last_basic_block = last_basic_block_for_fn (cfun); >>> >>> - curr_bb = first_bb; >>> - if (dbg_cnt (sched_block)) >>> - { >>> - int saved_last_basic_block = last_basic_block_for_fn (cfun); >>> + schedule_block (&curr_bb, bb_state[first_bb->index]); >>> + gcc_assert (EBB_FIRST_BB (bb) == first_bb); >>> + sched_rgn_n_insns += sched_n_insns; >>> + realloc_bb_state_array (saved_last_basic_block); >>> + save_state_for_fallthru_edge (last_bb, curr_state); >>> >>> - schedule_block (&curr_bb, bb_state[first_bb->index]); >>> - gcc_assert (EBB_FIRST_BB (bb) == first_bb); >>> - sched_rgn_n_insns += sched_n_insns; >>> - realloc_bb_state_array (saved_last_basic_block); >>> - save_state_for_fallthru_edge (last_bb, curr_state); >>> - } >>> - else >>> - { >>> - sched_rgn_n_insns += rgn_n_insns; >>> - } >>> + /* Clean up. */ >>> + if (current_nr_blocks > 1) >>> + free_trg_info (); >>> + } >>> + else >>> + bb_deps_free_actions[bb] = RGN_BB_DEPS_FREE_ARTICIAL; >>> + } >>> >>> - /* Clean up. */ >>> - if (current_nr_blocks > 1) >>> - free_trg_info (); >>> + /* We have counted this block when computing rgn_n_insns >>> + previously, so need to fix up sched_rgn_n_insns now. */ >>> + if (bb_deps_free_actions[bb] == RGN_BB_DEPS_FREE_ARTICIAL) >>> + { >>> + while (head != NEXT_INSN (tail)) >>> + { >>> + if (INSN_P (head)) >>> + sched_rgn_n_insns++; >>> + head = NEXT_INSN (head); >>> + } >>> + } >>> } >>> >>> /* Sanity check: verify that all region insns were scheduled. */ >>> @@ -3218,13 +3282,13 @@ schedule_region (int rgn) >>> >>> sched_finish_ready_list (); >>> >>> - /* Done with this region. */ >>> - sched_rgn_local_finish (); >>> - >>> /* Free dependencies. */ >>> for (bb = 0; bb < current_nr_blocks; ++bb) >>> free_block_dependencies (bb); >>> >>> + /* Done with this region. */ >>> + sched_rgn_local_finish (); >>> + >>> gcc_assert (haifa_recovery_bb_ever_added_p >>> || deps_pools_are_empty_p ()); >>> } >>> @@ -3445,6 +3509,19 @@ sched_rgn_local_init (int rgn) >>> e->aux = NULL; >>> } >>> } >>> + >>> + /* Initialize bb_deps_free_actions. */ >>> + bb_deps_free_actions >>> + = XNEWVEC (enum rgn_bb_deps_free_action, current_nr_blocks); >>> + for (bb = 0; bb < current_nr_blocks; bb++) >>> + { >>> + rtx_insn *head, *tail; >>> + get_ebb_head_tail (EBB_FIRST_BB (bb), EBB_LAST_BB (bb), &head, >>> &tail); >>> + if (no_real_nondebug_insns_p (head, tail)) >>> + bb_deps_free_actions[bb] = RGN_BB_DEPS_FREE_NO; >>> + else >>> + bb_deps_free_actions[bb] = RGN_BB_DEPS_FREE_NORMAL; >>> + } >>> } >>> >>> /* Free data computed for the finished region. */ >>> @@ -3462,9 +3539,12 @@ sched_rgn_local_free (void) >>> void >>> sched_rgn_local_finish (void) >>> { >>> - if (current_nr_blocks > 1 && !sel_sched_p ()) >>> + if (!sel_sched_p ()) >>> { >>> - sched_rgn_local_free (); >>> + if (current_nr_blocks > 1) >>> + sched_rgn_local_free (); >>> + >>> + free (bb_deps_free_actions); >>> } >>> } >>> >>> diff --git a/gcc/sel-sched.cc b/gcc/sel-sched.cc >>> index cb1cf75bfe4..04415590455 100644 >>> --- a/gcc/sel-sched.cc >>> +++ b/gcc/sel-sched.cc >>> @@ -7213,7 +7213,8 @@ sel_region_target_finish (bool reset_sched_cycles_p) >>> >>> find_ebb_boundaries (EBB_FIRST_BB (i), scheduled_blocks); >>> >>> - if (no_real_insns_p (current_sched_info->head, >>> current_sched_info->tail)) >>> + if (no_real_nondebug_insns_p (current_sched_info->head, >>> + current_sched_info->tail)) >>> continue; >>> >>> if (reset_sched_cycles_p) >>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr108273.c >>> b/gcc/testsuite/gcc.target/powerpc/pr108273.c >>> new file mode 100644 >>> index 00000000000..937224eaa69 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/powerpc/pr108273.c >>> @@ -0,0 +1,26 @@ >>> +/* { dg-options "-O2 -fdbg-cnt=sched_block:1" } */ >>> +/* { dg-prune-output {\*\*\*dbgcnt:.*limit.*reached} } */ >>> + >>> +/* Verify there is no ICE. */ >>> + >>> +int a, b, c, e, f; >>> +float d; >>> + >>> +void >>> +g () >>> +{ >>> + float h, i[1]; >>> + for (; f;) >>> + if (c) >>> + { >>> + d *e; >>> + if (b) >>> + { >>> + float *j = i; >>> + j[0] += 0; >>> + } >>> + h += d; >>> + } >>> + if (h) >>> + a = i[0]; >>> +} >>> -- >>> 2.25.1 >> BR, Kewen