https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85103
--- Comment #21 from Pat Haugen <pthaugen at gcc dot gnu.org> ---
> Knowing what inline decision matters for VPR, I can try to fix it too.

Gathering some perf data, the hot functions for various revisions are as
follows. All other functions report < 0.5% of execution time.

r257581
-------
samples  %        image name            symbol name
577871   57.8700  vpr_base.temp_32      try_route
402207   40.2784  vpr_base.temp_32      get_heap_head

r257582
-------
samples  %        image name            symbol name
428249   40.9911  vpr_base.pat_test_32  try_route
402768   38.5521  vpr_base.pat_test_32  get_heap_head
189358   18.1249  vpr_base.pat_test_32  node_to_heap.part.0

r267727 (after patches that fixed bzip2 went in)
-------
samples  %        image name            symbol name
493998   45.9797  vpr_base.pat_base_32  try_route
416389   38.7561  vpr_base.pat_base_32  get_heap_head
140727   13.0984  vpr_base.pat_base_32  add_to_heap

So from the above we can see that r257582 stopped inlining node_to_heap()
into try_route(). In r267727, node_to_heap() is again being inlined into
try_route(), but add_to_heap() is no longer inlined into node_to_heap(),
which is the only caller of add_to_heap().

So it appears that what is needed is for the whole chain
node_to_heap()->add_to_heap() to be inlined into try_route() again.
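
To make the call chain described above concrete, here is a minimal C sketch of its shape: a hot routine calls node_to_heap(), whose only job after a cheap check is to call add_to_heap(). Only the three function names come from the profile. The struct, the signatures, the early-return check, and the heap logic are all hypothetical stand-ins, not the real VPR code. The always_inline attributes are an assumption about how one might force the chain by hand to reproduce the r257581 layout for an experiment. They are not a proposed fix. For context, a `.part.0` suffix such as the one in the r257582 profile is the name GCC gives to the outlined remainder of a function it has split for partial inlining.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_CAP 64  /* hypothetical fixed capacity, for the sketch only */

/* Hypothetical min-heap of costs; the real VPR heap stores more per node. */
struct heap {
    float  cost[HEAP_CAP];
    size_t len;
};

/* Single caller (node_to_heap), as in the comment. always_inline forces
 * GCC to inline it regardless of the inliner's heuristics. */
static inline __attribute__((always_inline)) void
add_to_heap(struct heap *h, float cost)
{
    assert(h->len < HEAP_CAP);
    size_t i = h->len++;
    h->cost[i] = cost;

    /* Sift the new entry up until its parent is no larger. */
    while (i > 0) {
        size_t parent = (i - 1) / 2;
        if (h->cost[parent] <= h->cost[i])
            break;
        float tmp = h->cost[parent];
        h->cost[parent] = h->cost[i];
        h->cost[i] = tmp;
        i = parent;
    }
}

/* Cheap early exit followed by the real work: a hypothetical shape of the
 * kind GCC may split into an inlined check plus an outlined ".part" body. */
static inline __attribute__((always_inline)) void
node_to_heap(struct heap *h, float cost)
{
    if (cost < 0.0f)  /* hypothetical rejection test */
        return;
    add_to_heap(h, cost);
}

/* Stand-in for the hot loop: pushes every cost, returns the minimum,
 * or -1.0f if nothing was pushed. */
float try_route(const float *costs, size_t n)
{
    struct heap h = { .len = 0 };
    for (size_t i = 0; i < n; i++)
        node_to_heap(&h, costs[i]);
    return h.len ? h.cost[0] : -1.0f;
}
```

Building a file like this with and without the attributes, at -O2 with -fopt-info-inline, is one way to compare the inliner's own decisions against the forced chain.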